arxiv:2606.10747
Lukas Galke Poech
lgalke
AI & ML interests
LLM interpretability, agentic/multi-agent safety
Recent Activity
authored a paper 1 day ago
The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment
authored a paper 7 days ago
BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling