ICML
Do Sparse Autoencoders Generalize? A Case Study of Answerability
L. Heindrich, P. Torr, F. Barez, V. Thost
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning
T. Fu*, M. Sharma, P. Torr, S. B. Cohen, D. Krueger, F. Barez*
Scaling Sparse Feature Circuit Finding for In-Context Learning
D. Kharlapenko, S. Shabalin, F. Barez, A. Conmy, N. Nanda
Mechanistic Interpretability Workshop at ICML 2024
F. Barez, M. Geva, L. Chan, A. Geiger, K. Yin, N. Nanda, et al.
Position: Near to Mid-Term Risks and Opportunities of Open-Source Generative AI
F. Eiras, A. Petrov, B. Vidgen, C. S. De Witt, F. Pizzati, K. Elkins, F. Barez, et al.
Visualizing Neural Network Imagination
N. Wichers, V. Tao, R. Volpato, F. Barez