Technical Governance
AutoControl Arena: Synthesizing Executable Test Environments for Frontier AI Risk Evaluation
C. Li, P. Lu, X. Pan, F. Barez, M. Yang
Agentic Product Maturity Ladder V0.1
S. McGregor, D. Nathani, L. Saouma, F. Barez, A. Foundjem, et al.
The Capability Frontier: Benchmarks Miss 82% of Model Performance
B. Fowler, R. Smith, D. T. Graviet, W. Myers, J. Greaves, N. F. Oozeer, A. García, et al.
Establishing Best Practices for Building Rigorous Agentic Benchmarks
Y. Zhu, T. Jin, Y. Pruksachatkun, A. Zhang, S. Liu, S. Cui, S. Kapoor, F. Barez, et al.
Beyond Monoliths: Expert Orchestration for More Capable, Democratic, and Safe Language Models
P. Quirke, N. Oozeer, C. Bandi, A. Abdullah, J. Hoelscher-Obermaier, F. Barez, et al.
In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?
B. Bucknall, S. Siddiqui, L. Thurnherr, C. McGurk, B. Harack, A. Reuel, F. Barez, et al.
The Singapore Consensus on Global AI Safety Research Priorities
Y. Bengio, T. Maharaj, L. Ong, S. Russell, D. Song, M. Tegmark, L. Xue, F. Barez, et al.
Safety Frameworks and Standards: A Comparative Analysis to Advance Risk Management of Frontier AI
M. Ziosi, J. Gealy, M. Plueckebaum, D. Kossack, S. Campos, L. Saouma, F. Barez, et al.
Verification for International AI Governance
B. Harack, R. Trager, A. Reuel, D. Manheim, M. Brundage, O. Aarne, et al.
Position: Near to Mid-Term Risks and Opportunities of Open-Source Generative AI
F. Eiras, A. Petrov, B. Vidgen, C. S. De Witt, F. Pizzati, K. Elkins, F. Barez, et al.