<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Et Al. | TSG Lab – Technical Safety &amp; Governance Lab</title>
    <link>https://tsglab.github.io/author/et-al./</link>
    <atom:link href="https://tsglab.github.io/author/et-al./index.xml" rel="self" type="application/rss+xml"/>
    <description>Et Al.</description>
    <generator>Hugo Blox Builder (https://hugoblox.com)</generator>
    <language>en-us</language>
    <lastBuildDate>Sun, 01 Feb 2026 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://tsglab.github.io/media/logo.svg</url>
      <title>Et Al.</title>
      <link>https://tsglab.github.io/author/et-al./</link>
    </image>
    <item>
      <title>Same Answer, Different Representations: Hidden Instability in VLMs</title>
      <link>https://tsglab.github.io/publication/same-answer-different-representations/</link>
      <pubDate>Sun, 01 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/same-answer-different-representations/</guid>
      <description/>
    </item>
    <item>
      <title>The Hitchhiker's Guide to Actionable Interpretability</title>
      <link>https://tsglab.github.io/publication/hitchhikers-guide-actionable-interpretability/</link>
      <pubDate>Thu, 15 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/hitchhikers-guide-actionable-interpretability/</guid>
      <description/>
    </item>
    <item>
      <title>Agentic Product Maturity Ladder V0.1</title>
      <link>https://tsglab.github.io/publication/agentic-product-maturity-ladder/</link>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/agentic-product-maturity-ladder/</guid>
      <description/>
    </item>
    <item>
      <title>Interpretability Can Be Actionable</title>
      <link>https://tsglab.github.io/publication/interpretability-can-be-actionable/</link>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/interpretability-can-be-actionable/</guid>
      <description/>
    </item>
    <item>
      <title>Quantifying the Effect of Test Set Contamination on Generative Evaluations</title>
      <link>https://tsglab.github.io/publication/quantifying-test-set-contamination/</link>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/quantifying-test-set-contamination/</guid>
      <description/>
    </item>
    <item>
      <title>The Capability Frontier: Benchmarks Miss 82% of Model Performance</title>
      <link>https://tsglab.github.io/publication/capability-frontier-benchmarks/</link>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/capability-frontier-benchmarks/</guid>
      <description/>
    </item>
    <item>
      <title>Establishing Best Practices for Building Rigorous Agentic Benchmarks</title>
      <link>https://tsglab.github.io/publication/agentic-benchmarks-best-practices/</link>
      <pubDate>Mon, 01 Dec 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/agentic-benchmarks-best-practices/</guid>
      <description/>
    </item>
    <item>
      <title>Full-Stack Alignment: Co-Aligning AI and Institutions with Thicker Models of Value</title>
      <link>https://tsglab.github.io/publication/full-stack-alignment-institutions/</link>
      <pubDate>Mon, 01 Dec 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/full-stack-alignment-institutions/</guid>
      <description/>
    </item>
    <item>
      <title>HACK: Hallucinations Along Certainty and Knowledge Axes</title>
      <link>https://tsglab.github.io/publication/hack-hallucinations-certainty-knowledge/</link>
      <pubDate>Wed, 01 Oct 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/hack-hallucinations-certainty-knowledge/</guid>
      <description/>
    </item>
    <item>
      <title>Beyond Monoliths: Expert Orchestration for More Capable, Democratic, and Safe Language Models</title>
      <link>https://tsglab.github.io/publication/beyond-monoliths-expert-orchestration/</link>
      <pubDate>Sun, 01 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/beyond-monoliths-expert-orchestration/</guid>
      <description/>
    </item>
    <item>
      <title>In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?</title>
      <link>https://tsglab.github.io/publication/geopolitical-rivals-ai-safety-cooperation/</link>
      <pubDate>Sun, 01 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/geopolitical-rivals-ai-safety-cooperation/</guid>
      <description/>
    </item>
    <item>
      <title>The Singapore Consensus on Global AI Safety Research Priorities</title>
      <link>https://tsglab.github.io/publication/singapore-consensus-ai-safety/</link>
      <pubDate>Sun, 01 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/singapore-consensus-ai-safety/</guid>
      <description/>
    </item>
    <item>
      <title>AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons</title>
      <link>https://tsglab.github.io/publication/ailuminate-mlcommons/</link>
      <pubDate>Sat, 01 Mar 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/ailuminate-mlcommons/</guid>
      <description/>
    </item>
    <item>
      <title>Safety Frameworks and Standards: A Comparative Analysis to Advance Risk Management of Frontier AI</title>
      <link>https://tsglab.github.io/publication/safety-frameworks-standards/</link>
      <pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/safety-frameworks-standards/</guid>
      <description/>
    </item>
    <item>
      <title>Verification for International AI Governance</title>
      <link>https://tsglab.github.io/publication/verification-international-ai-governance/</link>
      <pubDate>Wed, 01 Jan 2025 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/verification-international-ai-governance/</guid>
      <description/>
    </item>
    <item>
      <title>Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach</title>
      <link>https://tsglab.github.io/publication/jailbreak-defense-narrow-domain/</link>
      <pubDate>Sun, 01 Dec 2024 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/jailbreak-defense-narrow-domain/</guid>
      <description/>
    </item>
    <item>
      <title>Mechanistic Interpretability Workshop at ICML 2024</title>
      <link>https://tsglab.github.io/publication/mechanistic-interpretability-workshop-icml-2024/</link>
      <pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/mechanistic-interpretability-workshop-icml-2024/</guid>
      <description/>
    </item>
    <item>
      <title>Position: Near to Mid-Term Risks and Opportunities of Open-Source Generative AI</title>
      <link>https://tsglab.github.io/publication/open-source-generative-ai-risks/</link>
      <pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/open-source-generative-ai-risks/</guid>
      <description/>
    </item>
    <item>
      <title>Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models</title>
      <link>https://tsglab.github.io/publication/sycophancy-to-subterfuge/</link>
      <pubDate>Sat, 01 Jun 2024 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/sycophancy-to-subterfuge/</guid>
      <description/>
    </item>
    <item>
      <title>Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training</title>
      <link>https://tsglab.github.io/publication/sleeper-agents/</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/sleeper-agents/</guid>
      <description/>
    </item>
    <item>
      <title>The Alan Turing Institute's Response to the House of Lords Large Language Models Call for Evidence</title>
      <link>https://tsglab.github.io/publication/turing-institute-lords-llm-evidence/</link>
      <pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate>
      <guid>https://tsglab.github.io/publication/turing-institute-lords-llm-evidence/</guid>
      <description/>
    </item>
  </channel>
</rss>