Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training

Publication
arXiv:2401.05566