<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ACL/EMNLP | TSG Lab – Technical Safety &amp; Governance Lab</title><link>https://tsglab.github.io/tag/acl/emnlp/</link><atom:link href="https://tsglab.github.io/tag/acl/emnlp/index.xml" rel="self" type="application/rss+xml"/><description>ACL/EMNLP</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sat, 01 Nov 2025 00:00:00 +0000</lastBuildDate><image><url>https://tsglab.github.io/media/logo.svg</url><title>ACL/EMNLP</title><link>https://tsglab.github.io/tag/acl/emnlp/</link></image><item><title>Beyond Linear Steering: Unified Multi-Attribute Control for Language Models</title><link>https://tsglab.github.io/publication/beyond-linear-steering-multi-attribute-control/</link><pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/beyond-linear-steering-multi-attribute-control/</guid><description/></item><item><title>Precise In-Parameter Concept Erasure in Large Language Models</title><link>https://tsglab.github.io/publication/precise-concept-erasure-llms/</link><pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/precise-concept-erasure-llms/</guid><description/></item><item><title>Same Question, Different Words: A Latent Adversarial Framework for Prompt Robustness</title><link>https://tsglab.github.io/publication/latent-adversarial-prompt-robustness/</link><pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/latent-adversarial-prompt-robustness/</guid><description/></item><item><title>Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs</title><link>https://tsglab.github.io/publication/trust-me-im-wrong-hallucinations/</link><pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/trust-me-im-wrong-hallucinations/</guid><description/></item><item><title>Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions</title><link>https://tsglab.github.io/publication/attention-mlp-interactions/</link><pubDate>Fri, 01 Nov 2024 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/attention-mlp-interactions/</guid><description/></item><item><title>Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models</title><link>https://tsglab.github.io/publication/interpretable-sequence-continuation/</link><pubDate>Fri, 01 Nov 2024 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/interpretable-sequence-continuation/</guid><description/></item><item><title>Large Language Models Relearn Removed Concepts</title><link>https://tsglab.github.io/publication/llms-relearn-removed-concepts/</link><pubDate>Thu, 01 Aug 2024 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/llms-relearn-removed-concepts/</guid><description/></item><item><title>Detecting Edit Failures in Large Language Models: An Improved Specificity Benchmark</title><link>https://tsglab.github.io/publication/detecting-edit-failures-llms/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/detecting-edit-failures-llms/</guid><description/></item><item><title>The Larger They Are, the Harder They Fail: Language Models Do Not Recognize Identifier Swaps in Python</title><link>https://tsglab.github.io/publication/identifier-swaps-python/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://tsglab.github.io/publication/identifier-swaps-python/</guid><description/></item></channel></rss>