SRI Seminar Series: Roger Grosse, “On the origin of rogue AI”

  • Rotman School of Management, Room 1065, 95 Saint George Street, Toronto, ON, M5S 1A5, Canada

This event is organized by the Schwartz Reisman Institute for Technology and Society.

Our weekly SRI Seminar Series welcomes Roger Grosse for a special in-person talk that will also be broadcast online. Grosse is an associate professor of computer science at the University of Toronto, a Schwartz Reisman Chair in Technology and Society, and a founding member of the Vector Institute. His research focuses on better understanding neural net training dynamics, and his current work explores how insights from deep learning can be applied to build safe and aligned AI systems.

In this special in-person lecture, Grosse will articulate the underlying model of how LLMs or agents built on top of them could spontaneously “go rogue.” This session will be moderated by Sheila McIlraith.

Talk title:

“On the origin of rogue AI”

Abstract:

One of the most concerning scenarios for future AI systems is that the AI autonomously carries out a malign plan not intended by any human. But how could this happen? Classical arguments for catastrophic AI risk were made in terms of idealized long-horizon planning agents which seemingly bear little relationship to current-day large language models (LLMs). In this talk, I’ll try to articulate the underlying model of how LLMs or agents built on top of them could spontaneously “go rogue.” I’ll argue that LLM pre-training, by making complex behaviours more compressible, creates smoother fitness landscapes for evolutionary searches. Such evolutionary searches could lead to tendencies such as reward hacking, consequentialism, and punishment. If this hypothesis is correct, then continued scaling of LLMs will enable a variety of catastrophic risk pathways which, up to now, have been limited to philosophical thought experiments.

About Roger Grosse

Roger Grosse is an associate professor of computer science at the University of Toronto, a Schwartz Reisman Chair in Technology and Society, and a founding member of the Vector Institute. Grosse is also a member of technical staff on the Alignment Science Team at Anthropic. His research focuses on better understanding neural net training dynamics and on using that understanding to improve training speed, generalization, uncertainty estimation, and automatic hyperparameter tuning. His current research seeks to apply insights from deep learning to AI alignment.

Grosse holds a Sloan Fellowship, Canada Research Chair, and Canada CIFAR AI Chair. He received a BS in symbolic systems from Stanford in 2008, an MS in computer science from Stanford in 2009, and a PhD in computer science from MIT in 2014, studying under Bill Freeman and Josh Tenenbaum. From 2014 to 2016, Grosse was a postdoctoral researcher at the University of Toronto, working with Ruslan Salakhutdinov. Along with Colorado Reed, he created Metacademy, a website that uses a dependency graph of concepts to create personalized learning plans for machine learning and related fields.

About the SRI Seminar Series

The SRI Seminar Series brings together the Schwartz Reisman community and beyond for a robust exchange of ideas that advance scholarship at the intersection of technology and society. Each seminar is led by a leading or emerging scholar and features extensive discussion.

Each week, a featured speaker will present for 45 minutes, followed by an open discussion. Registered attendees will be emailed a Zoom link before the event begins. The event will be recorded and posted online.

Note: Event details can change. Please visit the unit’s website for the latest information about this event.