On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 190
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 24
Preserving Diversity in Supervised Fine-Tuning of Large Language Models Paper • 2408.16673 • Published Apr 5, 2025
SED-SFT: Selectively Encouraging Diversity in Supervised Fine-Tuning Paper • 2602.07464 • Published Feb 7, 2026 • 1
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity Paper • 2510.01171 • Published Oct 1, 2025 • 19
SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs Paper • 2509.20758 • Published Sep 25, 2025 • 2