DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published Jan 29 • 75
VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text? Paper • 2602.04802 • Published Feb 4 • 2
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 4 days ago • 168
SenseNova-U1 Collection SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 8 items • Updated about 15 hours ago • 62
Reply: Model available on HF? 👀 — We just open-sourced it here: https://github.com/OpenSenseNova/SenseNova-U1. Give it a try!
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 22 days ago • 226