MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models (Paper 2602.12871, published Feb 13)
MolmoAct2: Action Reasoning Models for Real-world Deployment (Paper 2605.02881, published about 1 month ago)