Past Event: Oden Institute Seminar
Stephan Wojtowytsch, Assistant Professor, Department of Mathematics, Texas A&M University.
3:30 – 5:00 PM
Tuesday Feb 21, 2023
POB 6.304 & Zoom
Gradient descent methods, which choose an optimal direction based on purely local information, are robust but slow. Accelerated methods in convex optimization improve upon gradient descent by exploiting information gained along their trajectory and converge much more quickly, albeit on a smaller class of problems. Although non-convex, many optimization problems in deep learning share properties with the convex setting. In this context, however, true gradients are prohibitively expensive to compute, and only stochastic gradient estimates are available. In this talk, I present a momentum-based accelerated method which achieves acceleration even if the stochastic noise is many orders of magnitude larger than the gradient (i.e., multiplicative noise scaling with a potentially very large constant). Numerical evidence suggests that this method outperforms the current momentum-SGD optimizer in PyTorch and TensorFlow without increasing the computational cost.
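For readers unfamiliar with the baseline the talk compares against, the following is a minimal sketch of classical (heavy-ball) momentum SGD of the kind implemented in PyTorch and TensorFlow, applied to a toy quadratic with multiplicative gradient noise in the spirit of the talk's assumption. This is an illustration of the baseline only, not the speaker's accelerated method; all constants (step size, momentum, noise scale, objective) are made up for the example.

```python
# Sketch of classical momentum SGD (the baseline mentioned in the abstract),
# NOT the speaker's accelerated method. Objective and constants are illustrative.
import random

def noisy_grad(x, noise_scale):
    """Gradient of f(x) = x^2 with multiplicative noise: the noise magnitude
    scales with the gradient itself, as assumed in the talk."""
    g = 2.0 * x
    return g * (1.0 + noise_scale * random.gauss(0.0, 1.0))

def momentum_sgd(x0, lr=0.01, mu=0.9, noise_scale=5.0, steps=2000, seed=0):
    random.seed(seed)
    x, v = x0, 0.0
    for _ in range(steps):
        v = mu * v + noisy_grad(x, noise_scale)  # momentum buffer update
        x -= lr * v                              # parameter update
    return x

# With no noise the iterates contract toward the minimizer at 0;
# with large multiplicative noise the iterates are much rougher.
print(abs(momentum_sgd(5.0, noise_scale=0.0)))
print(abs(momentum_sgd(5.0, noise_scale=5.0)))
```

Note that because the noise is multiplicative, it vanishes at the minimizer along with the gradient, which is what makes acceleration plausible at all in this regime.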
Stephan Wojtowytsch is an assistant professor in the Department of Mathematics at Texas A&M University, working primarily on the mathematical foundations of deep learning. Previously, he held positions as a postdoctoral scholar at Carnegie Mellon University and at Princeton University. He obtained his PhD at Durham University, where he studied geometric problems in the calculus of variations and their numerical approximation. Stephan is also an organizer for the One World Seminar Series on the Mathematics of Machine Learning.