Bingxiang He

hbx

https://hbx-hbx.github.io/

AI & ML interests

NLP

Recent Activity

upvoted a paper 11 days ago

Weak-to-Strong Generalization via Direct On-Policy Distillation

upvoted a paper 25 days ago

Trimming the Long-Tail of Visual World Modeling Evaluation

upvoted a paper about 1 month ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

commented 2 papers 3 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 114

commented a paper 5 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

New activity in hbx/JustRL-Nemotron-1.5B 7 months ago

Add Hugging Face paper link badge to model card

#1 opened 7 months ago by

New activity in hbx/JustRL-DeepSeek-1.5B 7 months ago

Improve model card: Update title, add paper link, correct license and citation

#1 opened 7 months ago by

commented a paper 7 months ago

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 32