TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning Paper • 2603.12529 • Published Mar 13 • 19
EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models Paper • 2602.05000 • Published Feb 4 • 2
Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts Paper • 2601.17111 • Published Jan 23 • 5
Mitigating Catastrophic Forgetting in Mathematical Reasoning Finetuning through Mixed Training Paper • 2512.13706 • Published Dec 5, 2025 • 1
Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms Paper • 2510.13913 • Published Oct 15, 2025 • 4
EgoVLM: Policy Optimization for Egocentric Video Understanding Paper • 2506.03097 • Published Jun 3, 2025
Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math Paper • 2510.13744 • Published Oct 15, 2025 • 6
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Paper • 2411.06469 • Published Nov 10, 2024 • 17
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents Paper • 2509.06283 • Published Sep 8, 2025 • 17
Concentration of Measure for Distributions Generated via Diffusion Models Paper • 2501.07741 • Published Jan 13, 2025
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16, 2025 • 27
Noise Contrastive Alignment of Language Models with Explicit Rewards Paper • 2402.05369 • Published Feb 8, 2024 • 2
Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models Paper • 2405.04233 • Published May 7, 2024 • 3
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion Paper • 2506.08009 • Published Jun 9, 2025 • 31
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator Paper • 2503.01103 • Published Mar 3, 2025 • 5
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers Paper • 2502.15894 • Published Feb 21, 2025 • 20