
Colloquium Series: Ruiqi Zhong, "Building Expert-Level Language Models from Decomposed Weak Validations"

  • Bahen Centre for Information Technology, Room 3200, 40 Saint George Street, Toronto, ON M5S 2E4, Canada

This talk has been postponed and will be rescheduled.



Speaker:

Ruiqi Zhong

Talk Title:

Building Expert-Level Language Models from Decomposed Weak Validations

Date and Location:

Tuesday, March 25, 2025

Bahen Centre for Information Technology, BA 3200

This lecture is open to the public. No registration is required, but space is limited.

Abstract:

Language models (LMs) can process large volumes of information and perform complex reasoning. They hold the promise of executing expert-level tasks, such as brainstorming scientific hypotheses or developing complex software. However, building these LMs requires humans to validate their outputs, which is challenging; e.g., developers cannot easily validate whether complex software is bug-free. If our validation is fallible, LMs may learn to "hack" the validators, convincing us that they are right even when they are wrong.

To address this, I show how to decompose complex validation tasks into "weaker" ones that are easier for humans or LMs: e.g., validating return values rather than entire programs, or validating discoveries on individual samples rather than on entire datasets. Through several examples, I show how these techniques allow us to use LMs for expert-level tasks more reliably. Looking forward, I discuss how to use LMs to automate these task decompositions, and how we can use these frameworks to monitor both individual AI systems and their broader impact within society.
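The idea of validating return values rather than entire programs can be illustrated with a small property-based check. The sketch below is a hypothetical illustration, not the speaker's actual method: instead of reading and verifying a whole LM-generated sorting routine, a "weak" validator only checks easy-to-verify properties of each output (sortedness and preservation of the input's elements) on sampled inputs. The names `candidate_sort` and `weak_validate` are invented for this example.

```python
import random

def candidate_sort(xs):
    # Stand-in for an LM-generated implementation whose internals
    # we do not want to (or cannot easily) verify directly.
    return sorted(xs)

def weak_validate(fn, trials=100):
    """Validate return values on sampled inputs rather than the entire program.

    For each trial, checks two properties that are far easier to verify
    than the program itself: the output is sorted, and it contains
    exactly the same elements as the input.
    """
    for _ in range(trials):
        xs = [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
        out = fn(list(xs))
        is_sorted = all(a <= b for a, b in zip(out, out[1:]))
        same_items = sorted(out) == sorted(xs)
        if not (is_sorted and same_items):
            return False
    return True
```

A correct implementation passes all trials, while a program that drops or alters elements fails as soon as a nonempty input is sampled; checking each output is a much weaker, and therefore more tractable, task than auditing the program's source.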

About Ruiqi Zhong:

Ruiqi Zhong is a final-year Ph.D. student at UC Berkeley, co-advised by Jacob Steinhardt and Dan Klein. He was previously a part-time member of technical staff at Anthropic, where he worked on the automated red-teaming team. His research sits at the intersection of machine learning and NLP, and he develops language model systems to advance the frontier of human capabilities. He developed the earliest prototype of instruction tuning, and his research contributions have been scaled up by leading language model companies such as Google, OpenAI, and Anthropic.