SafeSci Data and safety-enhanced LLMs via finetuning.
Zhu Xiangyang
yyy127
AI & ML interests
None yet
Recent Activity
upvoted a paper 39 minutes ago
Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating
upvoted a paper about 2 months ago
Emergent Social Intelligence Risks in Generative Multi-Agent Systems
updated a dataset 2 months ago
yyy127/SafeSci
Organizations
None yet