UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling • Paper • 2604.19734 • Published 7 days ago • 29
VINO: A Unified Visual Generator with Interleaved OmniModal Context • Paper • 2601.02358 • Published Jan 5 • 30
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models • Paper • 2512.20557 • Published Dec 23, 2025 • 51