Moritz Hardt, The Zen of Gradient Descent, here.
Trefethen and Bau, “Numerical Linear Algebra”. My favorite book on classical numerical methods by far.
Ben Recht’s lecture notes here and here, and his Simons talk.
Sébastien Bubeck’s course notes are great!
Ben Recht, Simons Institute, Optimization 1, here.
Machine learning and computational statistics problems involving large datasets have proved to be a rich source of interesting and challenging optimization problems in recent years. The challenges arising from the complexity of these problems and the special requirements for their solutions have brought a wide range of optimization algorithms into play. We start this talk by surveying the application space, outlining several important analysis and learning tasks, and describing the contexts in which such problems are posed. We then describe optimization approaches that are proving to be relevant, including stochastic gradient methods, sparse optimization methods, first-order methods, coordinate descent, higher-order methods, and augmented Lagrangian methods. We also discuss parallel variants of some of these approaches.
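The abstract mentions stochastic gradient methods as a workhorse for large-dataset problems. As a minimal sketch (not taken from the talk), here is SGD on a least-squares objective, where each step uses the gradient of a single randomly sampled data term; all names and constants (`A`, `b`, the step size, the iteration count) are illustrative choices:

```python
import numpy as np

# Sketch of stochastic gradient descent on the least-squares problem
#   min_x (1/2n) * ||A x - b||^2,
# where each iteration samples one row i and takes a step along the
# negative gradient of the single term (1/2)(a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true                         # noiseless targets, for illustration

x = np.zeros(d)
step = 0.01                            # constant step size (hand-picked)
for t in range(5000):
    i = rng.integers(n)                # sample one data point uniformly
    grad = (A[i] @ x - b[i]) * A[i]    # gradient of the i-th term
    x -= step * grad

print(np.linalg.norm(x - x_true))
```

Because the system here is consistent (no noise), the iterates converge to the true solution; with noisy data, a constant step size would instead stall at a noise floor, which is one reason diminishing step sizes appear in the classical analysis.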