Distinguished Lecture Series
2025-2026 Speakers


 
David Duvenaud

Associate Professor, Department of Computer Science
University of Toronto

The big picture of LLM dangerous capability evals

Wednesday, October 15, 2025
12:30 p.m.

Schwartz Reisman Innovation Campus, Room W240
108 College Street, Toronto, ON M5G 0C6

We gratefully acknowledge the support of the Webster Family Charitable Giving Foundation for this event.

Abstract:

How can we avoid AI disasters? The plan so far is mostly to check the extent to which AIs could cause catastrophic harms based on tests in controlled conditions. However, this approach has obvious problems, both technical and due to the limited scope of such tests. I'll give an overview of the work my team at Anthropic did to evaluate risks due to models feigning incompetence, colluding, or sabotaging human decision-making. I'll also discuss the idea of “control” techniques, which use AIs to monitor and set traps to look for bad behavior in other AIs. Finally, I'll outline the main problems beyond the scope of these approaches, in particular that of robustly aligning our institutions to human interests.

Bio:

David Duvenaud is an associate professor in the Departments of Computer Science and Statistical Sciences at the University of Toronto, where he holds a Schwartz Reisman Chair in Technology and Society. A leading voice in AI safety and artificial general intelligence (AGI) governance, Duvenaud currently focuses on evaluating dangerous capabilities in advanced AI systems, mitigating catastrophic risks from future models, and developing institutional designs for post-AGI futures. Duvenaud is a Canada CIFAR AI Chair and a founding faculty member at the Vector Institute, a member of Innovation, Science and Economic Development Canada’s Safe and Secure AI Advisory Group, and recently completed an extended sabbatical with the Alignment Science team at Anthropic.

Duvenaud’s early work helped shape the field of probabilistic deep learning, with contributions including neural ordinary differential equations, gradient-based hyperparameter optimization, and generative models for molecular design. He has received numerous honors, including the Sloan Research Fellowship, the Ontario Early Researcher Award, and best paper awards at NeurIPS, ICML, and ICFP. Before joining the University of Toronto, Duvenaud was a postdoctoral fellow in the Harvard Intelligent Probabilistic Systems group and completed his PhD at the University of Cambridge under Carl Rasmussen and Zoubin Ghahramani.