How to Make the Gradients Small in Convex and Min-Max Optimization

11/18/21 | 4:15pm | E25-111

Jelena Diakonikolas

Assistant Professor
University of Wisconsin-Madison


Abstract: One of the most fundamental facts in unconstrained convex optimization is that every point with zero gradient is also a global minimum. However, efficiently computing points with small gradients is a significantly different problem from efficiently computing points with small objective function values, and a method with an optimal convergence rate under one of these two criteria is not in general optimal under the other. In particular, Nesterov's accelerated gradient method is iteration-complexity-optimal for minimizing smooth (gradient-Lipschitz) convex functions, but suboptimal for finding their near-stationary points. While the complexity of minimizing convex functions is well understood for various broad classes of problems, much less is known about the complexity of minimizing the functions' gradients in the appropriate norms. Our understanding is even more limited in the more general setting of min-max optimization. In this talk, I will discuss why finding points with small gradients is an intriguing problem, what we know so far, and what kinds of questions we can hope to answer looking forward.
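To make the gap concrete, the following is a standard back-of-the-envelope sketch, not drawn from the talk itself, for an L-smooth convex function f with minimizer x* and initial distance D = ||x_0 - x*|| (these symbols are introduced here for illustration; rates are stated up to constants and logarithmic factors).

```latex
% Sketch under standard assumptions: f convex and L-smooth, x^* a minimizer,
% D = \|x_0 - x^*\|. L-smoothness relates the gradient norm to the optimality gap:
\[
  \|\nabla f(x)\|^2 \;\le\; 2L\,\bigl(f(x) - f(x^*)\bigr).
\]
% Nesterov's accelerated gradient method attains the optimal function-value rate
\[
  f(x_k) - f(x^*) \;=\; O\!\left(\frac{L D^2}{k^2}\right),
\]
% so plugging into the inequality above only certifies
\[
  \|\nabla f(x_k)\| \;=\; O\!\left(\frac{L D}{k}\right),
\]
% while the best achievable gradient-norm rate in this setting is on the order of
% L D / k^2 (up to constants and logarithmic factors), which is the sense in which
% acceleration alone is suboptimal for finding near-stationary points.
```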

Bio: Jelena Diakonikolas is an assistant professor in the Department of Computer Sciences at UW-Madison. Prior to joining UW-Madison, she held postdoctoral positions at UC Berkeley and Boston University. She received her PhD from Columbia University. Her research interests are in efficient optimization methods for large-scale problems and connections between optimization and other areas such as dynamical systems and Markov chain Monte Carlo methods. Jelena is a recipient of a Simons-Berkeley Research Fellowship, the Morton B. Friedman Prize for Excellence at Columbia Engineering, and a Qualcomm Innovation Fellowship.

Event Time: November 18, 2021, 16:15