Establishing Best Practices for Building Rigorous Agentic Benchmarks
Y. Zhu, T. Jin, Y. Pruksachatkun, A. Zhang, S. Liu, S. Cui, S. Kapoor, F. Barez, et al.
December 2025
Type: Conference paper
Publication: NeurIPS 2025
Technical Governance
NeurIPS