CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 11 days ago • 262
Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling Paper • 2605.12411 • Published 12 days ago • 48
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 17 days ago • 215
Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips Paper • 2502.07408 • Published Apr 16 • 59
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
Lighting-grounded Video Generation with Renderer-based Agent Reasoning Paper • 2604.07966 • Published Apr 9 • 10
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published Apr 7 • 121
BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation Paper • 2603.25732 • Published Mar 26 • 11
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351