Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
Joya Chen
chenjoya
AI & ML interests
Video LLM
Recent Activity
upvoted a paper about 5 hours ago
Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity
upvoted a paper 22 days ago
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories
upvoted a paper 28 days ago
Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?