Yuran Wang

Ryann829

AI & ML interests

Multimodal Large Language Model

Recent Activity

authored a paper 3 days ago

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

Paper • 2605.22012 • Published 5 days ago • 43

upvoted a paper 4 days ago

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

Paper • 2605.22012 • Published 5 days ago • 43

authored a paper 5 days ago

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

Paper • 2605.18984 • Published 8 days ago • 22

upvoted a paper 6 days ago

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

Paper • 2605.18984 • Published 8 days ago • 22

authored a paper 11 days ago

Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling

Paper • 2605.13062 • Published 13 days ago • 33

upvoted a paper 12 days ago

Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling

Paper • 2605.13062 • Published 13 days ago • 33

upvoted a paper 13 days ago

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

Paper • 2605.10780 • Published 14 days ago • 33

updated 2 datasets about 2 months ago

Ryann829/SconeEval

Viewer • Updated Apr 9 • 761 • 147 • 3

Ryann829/Scone-S2I-57K

Viewer • Updated Apr 9 • 56.8k • 201 • 3

updated a model about 2 months ago

Ryann829/Scone

Text-to-Image • 15B • Updated Apr 9 • 5 • 4

authored a paper about 2 months ago

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

upvoted 3 papers about 2 months ago

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

Generative World Renderer

Paper • 2604.02329 • Published Apr 2 • 102

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published Mar 27 • 364

upvoted a paper 3 months ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published Feb 9 • 28

upvoted 2 papers 4 months ago

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 50

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Paper • 2602.03510 • Published Feb 3 • 27

authored a paper 4 months ago

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Paper • 2602.01630 • Published Feb 2 • 50

upvoted 2 papers 4 months ago

3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation

Paper • 2602.03796 • Published Feb 3 • 64

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Paper • 2602.01630 • Published Feb 2 • 50
