Rethinking Cross-Layer Information Routing in Diffusion Transformers Paper • 2605.20708 • Published 12 days ago • 109
LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws Paper • 2605.23901 • Published 10 days ago • 13
Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR Paper • 2605.19282 • Published 13 days ago • 8
LatentUMM: Dual Latent Alignment for Unified Multimodal Models Paper • 2605.17766 • Published 14 days ago • 8