ASSET Seminar: “When do spectral gradient updates help in deep learning?”
Spectral gradient methods, such as the recently popularized Muon algorithm, are a promising alternative to standard Euclidean gradient descent for training deep neural networks and transformers, but it is still […]