CIS Special Industry Colloquium: “Scaling Paradigms for Large Language Models”
November 20, 1:45 PM – 3:15 PM
In this talk I will describe how scaling has been the engine of progress in AI for the past five years. In the first scaling paradigm, our field scaled large language models by training with more compute on more data. Such scaling led to the success of ChatGPT and other AI chatbots, which were surprisingly capable and general purpose. With the release of OpenAI o1, we are at the beginning of a new paradigm in which we scale not just training-time compute but also test-time compute. These new models are trained via reinforcement learning on chain-of-thought reasoning, and by thinking harder on more challenging tasks they can solve even competition-level math and programming problems. I will conclude with a few remarks on how AI research culture has changed and where the field might go next.
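As one concrete illustration of spending more compute at test time (a minimal sketch, not a description of how o1 itself works), the snippet below uses self-consistency: sample several independent chains of thought and take a majority vote over their final answers, so that more samples buy more reliability. The generate() function here is a hypothetical stub standing in for a real language-model call, with a fake answer distribution so the example runs on its own.

```python
import random
from collections import Counter

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for a language-model call that samples one
    chain of thought and returns its final answer. A real implementation
    would call an LLM API; this stub simulates a model that answers
    correctly 60% of the time so the sketch is self-contained."""
    return "42" if random.random() < 0.6 else str(random.randint(0, 99))

def self_consistency(prompt: str, num_samples: int) -> str:
    """Scale test-time compute: draw num_samples independent reasoning
    chains and majority-vote over their final answers."""
    answers = [generate(prompt) for _ in range(num_samples)]
    return Counter(answers).most_common(1)[0][0]

if __name__ == "__main__":
    prompt = "What is 6 * 7? Think step by step."
    # More samples means more test-time compute and a more reliable answer.
    for n in (1, 5, 25):
        print(f"{n:>2} samples -> {self_consistency(prompt, n)}")
```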
For more information, please contact Mayur Naik (mhnaik@seas.upenn.edu).
Jason Wei
AI researcher, San Francisco, CA
Jason Wei is an AI researcher based in San Francisco. He currently works at OpenAI, where he contributed to OpenAI o1, a frontier model trained to do chain-of-thought reasoning via reinforcement learning. From 2020 to 2023, Jason was a research scientist at Google Brain, where his work popularized chain-of-thought prompting, instruction tuning, and the concept of emergent abilities in large language models.