ESE PhD Thesis Defense: “A Dynamical Systems Perspective on Optimization Algorithms”
November 22 at 12:00 PM - 1:30 PM
The intersection of machine learning (ML) and dynamical systems and control (S&C) has become a prominent area of research in recent years. While most work applies ML to S&C problems, this dissertation explores the reverse direction: using S&C tools to address challenges in ML and optimization. The thesis comprises three parts:
In part I, we examine connections between continuous-time dynamics and iterative optimization algorithms. Motivated by the matching convergence rates observed between various optimization algorithms and their continuous-time counterparts, we seek a better understanding of the fundamental limits behind these matching rates. First, we show that a simple rescaling of the gradient flow achieves finite-time stability, a property that is not preserved by its simplest discretization, the forward Euler method, which simply yields gradient descent. In fact, this property is not exactly preserved by any explicit discretization, since it relies on instantaneous corrections to the speed of the system's trajectories, which is not possible in discrete time due to overshooting. Next, we show that the rate of any continuous-time optimization system can be approximately preserved, under sufficient regularity conditions, by any sufficiently high-order, off-the-shelf ODE solver.
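As a concrete illustration of this connection (not necessarily the exact rescaling analyzed in the thesis), consider the gradient flow and its forward Euler discretization, which recovers gradient descent:
\[
\dot{x}(t) = -\nabla f(x(t)), \qquad x_{k+1} = x_k - h\,\nabla f(x_k) \quad \text{(forward Euler with step } h\text{)}.
\]
One well-known rescaling with finite-time convergence is the normalized gradient flow \(\dot{x}(t) = -\nabla f(x(t))/\|\nabla f(x(t))\|\): under a Polyak-Lojasiewicz condition \(\|\nabla f(x)\|^2 \ge 2\mu\,(f(x) - f^\star)\), one has
\[
\frac{d}{dt}\sqrt{f(x(t)) - f^\star} \;\le\; -\sqrt{\mu/2},
\]
so \(f(x(t))\) reaches \(f^\star\) by time \(T \le \sqrt{2\,(f(x(0)) - f^\star)/\mu}\). No explicit discretization reproduces this exactly, since near the minimizer any fixed step overshoots.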
In part II, we re-examine the relationship between convexity and smoothness and the role they play in the convergence rate of gradient-based optimization algorithms. Using and extending tools from part I, we show that a rescaled form of gradient descent achieves a rate that explicitly depends on given lower and upper bounds on the Bregman divergence of the cost function, which collectively quantify convexity and smoothness. In particular, we establish linear convergence with a generalized condition number, recovering as a special case the rate of gradient descent on L-smooth and mu-strongly convex functions.
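For reference, in the classical special case mentioned above the Bregman divergence of the cost is sandwiched between quadratics, and gradient descent with step size \(1/L\) converges linearly with condition number \(L/\mu\) (the generalized rate from the thesis is not reproduced here):
\[
D_f(y, x) = f(y) - f(x) - \langle \nabla f(x),\, y - x \rangle, \qquad
\frac{\mu}{2}\,\|y - x\|^2 \;\le\; D_f(y, x) \;\le\; \frac{L}{2}\,\|y - x\|^2,
\]
\[
f(x_k) - f^\star \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k} \bigl(f(x_0) - f^\star\bigr)
\quad\text{for } x_{k+1} = x_k - \tfrac{1}{L}\,\nabla f(x_k).
\]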
In part III, we discuss applications of S&C in two ML problems. The first is to re-examine the convergence of the expectation-maximization (EM) algorithm with a prior using Lyapunov stability theory. Next, we consider the conformal training (ConfTr) method of Stutz et al. (2022) and show that, in its current form, it suffers from severe sample inefficiency. We propose a simple fix and provide a preliminary theoretical analysis using tools from linear systems theory. We also use the principles established in part II to guide the training procedure.
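As a sketch of the Lyapunov viewpoint on EM with a prior (the precise argument in the thesis may differ), the negative log-posterior is a natural candidate Lyapunov function, since the classical monotonicity property of MAP-EM guarantees it is non-increasing along the iterates:
\[
V(\theta) = -\log p(\theta \mid y), \qquad
V(\theta_{k+1}) \;\le\; V(\theta_k)
\quad\text{for } \theta_{k+1} \in \arg\max_{\theta}\; \mathbb{E}_{z \sim p(z \mid y,\, \theta_k)}\bigl[\log p(y, z \mid \theta)\bigr] + \log p(\theta).
\]
Invariance-type (LaSalle) arguments can then be used to characterize convergence to stationary points of the posterior.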
Orlando Romero
ESE Ph.D. Candidate
Orlando Romero is a Ph.D. candidate in Electrical and Systems Engineering (ESE) at the University of Pennsylvania, advised by Prof. George J. Pappas. He received his B.S. in Applied Mathematics and Computation from Instituto Superior Técnico, Lisbon, Portugal, in 2017, and his M.S. in Industrial and Management Engineering from Rensselaer Polytechnic Institute in 2020. From January to May 2020, he was an In-Residence AIHN Scholar at the MIT-IBM Watson AI Lab, and in the summers of 2019 and 2020 he was a research intern at Mitsubishi Electric Research Laboratories (MERL), hosted by Dr. Benosman. He joined George J. Pappas' group in Fall 2020. His research interests lie at the intersection of optimization, dynamical systems, and machine learning.