W.Jimmy
WJimmy
AI & ML interests
None yet
Recent Activity
upvoted a paper 8 days ago
Self-Distilled Agentic Reinforcement Learning
upvoted a paper 28 days ago
Pause or Fabricate? Training Language Models for Grounded Reasoning
upvoted a paper about 1 month ago
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
Organizations
None yet