AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-184_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 92
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-26_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 14
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-78_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 8
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-52_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 5
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-104_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 7
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl_checkpoint-255_eval-dataset Viewer • Updated May 1, 2025 • 6.45k • 22
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft_prefix_kl0.005 0.4B • Updated Apr 30, 2025 • 64
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft0.3_prefix_nokl 0.4B • Updated Apr 30, 2025 • 3
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_mergedsft_prefix_nokl_checkpoint-255_eval-dataset Viewer • Updated Apr 30, 2025 • 6.45k • 5
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_checkpoint-255_eval-dataset Viewer • Updated Apr 30, 2025 • 6.45k • 31
AdversarialRLHF/rloo_pythia410m_tldr6.9b_rm410mdata_checkpoint-104_eval-dataset Viewer • Updated Apr 30, 2025 • 6.45k • 47