Datasets used in the paper
Jiayu (Mila) Wang
MilaWang
AI & ML interests
Large Language Model, Multimodal Large Language Model, Agentic System, Reasoning, Efficiency
Recent Activity
Updated a model about 2 hours ago: MilaWang/lirpg-fullparam-qwen2-5-math-7b-handrolled-rin0.1
Updated a model about 2 hours ago: MilaWang/lirpg-fullparam-qwen2-5-math-7b-answeronly-handrolled
Updated a model about 3 hours ago: MilaWang/grpo-fullparam-qwen2-5-math-7b-answeronly