ASSET Seminar: “Improving Generative AI at Inference Time: Alignment, Reasoning, & Efficiency”
March 4 at 12:00 PM - 1:15 PM
Modern generative AI is often improved through costly post-training (e.g., RLHF or preference tuning). This talk highlights a complementary alternative: inference-time methods. With the right inference-time objectives, search procedures, and compute allocation, we can meaningfully improve the behavior of generative AI models without updating model weights. We will begin with a principled formulation of inference-time methods for AI alignment and introduce transfer decoding, which estimates token-level optimal values for reward-guided decoding by leveraging an already available baseline-aligned model. Building on this decoding-time optimization viewpoint, we then move to multi-criteria alignment at inference time under preferences and safety constraints. In particular, we will discuss a satisficing view of alignment and an inference-time framework that operationalizes this idea. Next, we will show how inference-time techniques can also improve reasoning performance: why simply thinking longer can create a mirage of improvement and hurt accuracy via overthinking, and how parallel thinking provides a more reliable inference-time scaling strategy under the same compute budget. Finally, we extend these ideas beyond autoregressive LLMs to diffusion LLMs, where flexible generation orders implicitly expose hidden semi-autoregressive experts that enable selective computation at inference time. By ensembling across diverse block schedules, we can allocate compute more effectively and reliably boost performance, again without additional training.
Amrit Singh Bedi
Assistant Professor of Computer Science
Amrit Singh Bedi is an Assistant Professor in the Computer Science department (with a joint appointment in the ECE department) at the University of Central Florida (UCF), USA. Before joining UCF, he was an Assistant Research Professor/Scientist at the University of Maryland (UMD), College Park, MD, USA, working closely with Prof. Dinesh Manocha, Prof. Pratap Tokekar, and Prof. Furong Huang. Prior to his time at UMD, he was fortunate enough to work with Dr. Alec Koppel and Dr. Brian Sadler at the US Army Research Laboratory, where he gained valuable experience and insights into the real-world applications of his research. He earned his Ph.D. in Electrical Engineering from the Indian Institute of Technology (IIT) Kanpur under the supervision of Prof. Ketan Rajawat; his thesis focused on distributed and online learning with stochastic gradient methods.