
FOLDS seminar: Weak to Strong Generalization in Random Feature Models

October 30, 2025, 12:00 PM – 1:00 PM

Zoom link: https://upenn.zoom.us/j/98220304722

Weak-to-Strong Generalization (Burns et al., 2023) is the phenomenon whereby a strong student, say GPT-4, learns a task from a weak teacher, say GPT-2, and ends up significantly outperforming the teacher. We show that this phenomenon requires neither a strong and complex learner like GPT-4 nor pre-training. We consider students and teachers that are random feature models, described by two-layer networks with a random, fixed bottom layer and a trained top layer. A ‘weak’ teacher, with a small number of units (i.e. random features), is trained on the population, and a ‘strong’ student, with a much larger number of units (i.e. random features), is trained only on labels generated by the weak teacher. We demonstrate, prove, and explain how the student can outperform the teacher, even though it is trained only on data labeled by the teacher, with no pre-training or other knowledge or data advantage over the teacher. We explain how such weak-to-strong generalization is enabled by early stopping. Importantly, we also show the quantitative limits of weak-to-strong generalization in this model.
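The setup described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the target function, feature counts, sample sizes, and step size below are arbitrary choices, and whether early stopping actually opens a weak-to-strong gap depends on such choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5  # input dimension (illustrative choice)

def f_star(X):
    # Ground-truth target both models try to learn (illustrative choice).
    return np.sin(X @ np.ones(d))

def random_features(X, W):
    # Fixed random bottom layer with ReLU units; only the top layer is trained.
    return np.maximum(X @ W, 0.0)

# 'Weak' teacher: few random features, top layer fit on many truly-labeled
# points (a stand-in for training on the population).
m_teacher, m_student = 10, 500
W_T = rng.normal(size=(d, m_teacher))
X_pop = rng.normal(size=(20000, d))
a_T, *_ = np.linalg.lstsq(random_features(X_pop, W_T), f_star(X_pop), rcond=None)

def teacher(X):
    return random_features(X, W_T) @ a_T

# 'Strong' student: many random features, top layer trained by gradient
# descent ONLY on labels generated by the weak teacher.
W_S = rng.normal(size=(d, m_student))
X_train = rng.normal(size=(2000, d))
Phi = random_features(X_train, W_S)
y_weak = teacher(X_train)                  # labels come from the weak teacher

X_test = rng.normal(size=(2000, d))
Phi_test = random_features(X_test, W_S)
y_true = f_star(X_test)

a_S = np.zeros(m_student)
step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # stable step size for least squares
risks = []
for _ in range(500):
    a_S -= step * Phi.T @ (Phi @ a_S - y_weak)   # GD on the teacher-label loss
    risks.append(np.mean((Phi_test @ a_S - y_true) ** 2))  # TRUE risk over time

teacher_risk = np.mean((teacher(X_test) - y_true) ** 2)
print(f"teacher true risk:      {teacher_risk:.4f}")
print(f"student early-stopped:  {min(risks):.4f}")
print(f"student final iterate:  {risks[-1]:.4f}")
```

The early-stopping mechanism from the talk corresponds to reading off the best point on the `risks` curve rather than the final iterate: run long enough, the student simply reproduces the teacher's labels, errors included.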

Joint work with Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora and Zhiyuan Li.

Nati Srebro

Professor at the Toyota Technological Institute at Chicago

Nati (Nathan) Srebro is a professor at the Toyota Technological Institute at Chicago, with cross-appointments at the University of Chicago’s Department of Computer Science and the Committee on Computational and Applied Mathematics. He obtained his PhD from the Massachusetts Institute of Technology in 2004, and was previously a postdoctoral fellow at the University of Toronto, a visiting scientist at IBM, and an associate professor at the Technion.

Dr. Srebro’s research encompasses methodological, statistical, and computational aspects of machine learning, as well as related problems in optimization. Some of Srebro’s significant contributions include work on learning “wider” Markov networks, introducing the use of the nuclear norm for machine learning and matrix reconstruction, fast optimization techniques for machine learning, and the relationship between learning and optimization. His current interests include understanding deep learning through a detailed analysis of optimization, distributed and federated learning, algorithmic fairness, and practical adaptive data analysis.

Details

Organizers

  • IDEAS Center
  • PennAI
  • Wharton Statistics and Data Science Department

Venue

  • Amy Gutmann Hall, Room 414
  • 3333 Chestnut Street
    Philadelphia, PA 19104, United States