Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth Paper • 2605.25052 • Published 7 days ago • 13
Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context Paper • 2510.06182 • Published Oct 7, 2025 • 9
Enhancing Automated Interpretability with Output-Centric Feature Descriptions Paper • 2501.08319 • Published May 29, 2025 • 11