FOLDS seminar: Coherence Mechanisms for Provable Self-Improvement
March 19 at 12:00 PM - 1:00 PM
Zoom link: https://upenn.zoom.us/j/98220304722
Large language models are increasingly trained to improve themselves, yet the mechanisms driving this progress, such as self-reflection or RLAIF (reinforcement learning from AI feedback), rely almost entirely on empirical heuristics. Is it possible to mathematically guarantee self-improvement without human supervision?
In this talk, I will introduce a geometric framework that proves self-improvement is not only possible but provably monotonic, grounded in the principle of coherence. By formalizing self-improvement as a Bregman projection onto a space of logically consistent models, we can guarantee that each projection step improves performance. Furthermore, I will present a surprising characterization theorem: any self-improvement mechanism that offers similar theoretical guarantees must, fundamentally, be a coherence projection in disguise.
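To give a flavor of the coherence-projection idea, here is a minimal sketch (my own illustration, not the construction from the talk): take KL divergence as the Bregman divergence, and project an incoherent model, whose probabilities for a statement and its negation fail to sum to one, onto the affine set of coherent distributions. For this linear constraint the projection simply renormalizes, and the generalized Pythagorean relation guarantees the projection is never farther from any coherent target than the original model, which is the intuition behind monotonic improvement.

```python
import math

def kl(r, p):
    """Generalized KL divergence; p need not be normalized."""
    return sum(ri * math.log(ri / pi) for ri, pi in zip(r, p) if ri > 0)

# Incoherent model: its probabilities for "A" and "not A" sum to 1.2.
p = [0.7, 0.5]

# I-projection onto the coherent set {q : q[0] + q[1] = 1},
# i.e. the minimizer of KL(q || p) over that set, is renormalization.
Z = sum(p)
q = [pi / Z for pi in p]  # [7/12, 5/12]

# Pythagorean relation: for any coherent r,
#   KL(r || p) >= KL(r || q) + KL(q || p),
# with equality here because the constraint set is affine.
r = [0.9, 0.1]  # a hypothetical coherent "target" distribution
lhs = kl(r, p)
rhs = kl(r, q) + kl(q, p)
assert lhs >= rhs - 1e-9  # projecting onto coherence loses nothing
```

In words: no matter which coherent distribution turns out to be the right one, replacing the incoherent model by its projection can only bring the model closer to it. The talk's framework presumably develops this guarantee in far greater generality.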
(Joint work with Jon Schneider and Yifan Wu.)
Mehryar Mohri
Head of the Machine Learning Theory team at Google Research and a Professor of Computer Science at the Courant Institute of Mathematical Sciences
Mehryar Mohri is the Head of the Machine Learning Theory team at Google Research and a Professor of Computer Science at the Courant Institute of Mathematical Sciences. His research spans machine learning, computational learning theory, automata theory, and algorithms. Prior to joining Google and Courant, he spent a decade as a research department head and technology leader at AT&T Bell Labs. He is the lead author of the standard reference and textbook Foundations of Machine Learning and serves as President of the Association for Algorithmic Learning Theory (AALT). Beyond theory, he has developed core algorithms that underpin numerous deployed systems in machine learning, speech recognition, and natural language processing.