bilabila/b-b7_olr_ts10_gru_hib_turn001_sym7_202601_lossq_ms400k_h12 68k • Updated 1 day ago • 391 • 1
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 5 days ago • 192
Useful Memories Become Faulty When Continuously Updated by LLMs Paper • 2605.12978 • Published 12 days ago • 19
MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills Paper • 2604.20441 • Published Apr 22 • 3
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629
Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling Paper • 2603.26610 • Published Mar 27 • 9
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 342