Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 11 days ago • 90
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs Paper • 2605.20258 • Published 9 days ago • 30
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation Paper • 2605.19833 • Published 8 days ago • 129
Article OlmoEarth v1.1: A more efficient family of Earth observation models allenai • 8 days ago • 19
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization Paper • 2605.15824 • Published 12 days ago • 62
Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models Paper • 2605.07721 • Published 19 days ago • 29
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives Paper • 2605.12496 • Published 15 days ago • 28
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 13 days ago • 60
Nexus: An Agentic Framework for Time Series Forecasting Paper • 2605.14389 • Published 13 days ago • 6
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 14 days ago • 217
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics Paper • 2605.12178 • Published 15 days ago • 60
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 14 days ago • 97
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 16 days ago • 75