FOLDS seminar: Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixtures
April 2 at 12:00 PM - 1:00 PM
Details
Zoom link: https://upenn.zoom.us/j/98220304722
Learning Gaussian Mixture Models (GMMs) is a fundamental problem in machine learning, and the Expectation-Maximization (EM) algorithm and its variant gradient-EM are the most widely used algorithms in practice. When the ground-truth GMM and the learning model have the same number of components, m, a line of prior work has attempted to establish rigorous recovery guarantees; however, EM methods are known to fail to recover the ground truth when m>2.
This talk considers the “over-parameterized” case, where the learning model uses n>m components to fit an m-component GMM. I will show that gradient-EM converges globally and recovers the GMM: for a well-separated GMM with only mild over-parameterization n = \Omega(m log m), randomly initialized gradient-EM converges to the ground truth at a polynomial rate with polynomial samples. The analysis relies on novel techniques to characterize the geometric landscape of the likelihood loss. This is the first global convergence result for EM methods beyond the special case of m=2.

