On the Foundations of Deep Learning: SGD, Overparametrization, and Generalization
In this talk, I will provide an analysis of several elements of deep learning, including non-convex optimization, overparametrization, and generalization error. First, we show that gradient descent and many other algorithms are guaranteed to converge to a local minimizer of the loss. For several interesting problems, including matrix completion, this guarantees convergence to a global minimum. Then we show that gradient descent converges to a global minimizer for deep overparametrized networks. Finally, we analyze the generalization error by showing that a subtle interplay of SGD, the logistic loss, and the architecture promotes large-margin classifiers, which are guaranteed to have low generalization error. Together, these results show that on overparametrized deep networks, SGD finds solutions with both low training and test error.
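As a rough, self-contained illustration of the overparametrized regime described above (a toy sketch, not taken from the talk), the snippet below runs plain gradient descent on a wide two-layer ReLU network with a fixed second layer; when the width far exceeds the number of samples, the training loss typically decreases toward zero. All names, scalings, and hyperparameters here are illustrative assumptions.

```python
# Toy illustration (assumed setup, not the speaker's construction):
# gradient descent on a heavily overparametrized two-layer ReLU network.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 5, 2000                       # n samples, input dim d, width m >> n
X = rng.normal(size=(n, d))
y = rng.normal(size=n)                      # arbitrary labels to fit

W = rng.normal(size=(m, d)) / np.sqrt(d)    # trained first-layer weights
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed second layer

lr = 0.5
for step in range(5001):
    H = np.maximum(X @ W.T, 0.0)            # ReLU features, shape (n, m)
    pred = H @ a                            # network outputs, shape (n,)
    resid = pred - y
    loss = 0.5 * np.mean(resid ** 2)
    # Gradient of the mean squared error through the ReLU layer:
    grad_W = (((resid[:, None] / n) * (H > 0)) * a).T @ X
    W -= lr * grad_W
    if step % 1000 == 0:
        print(f"step {step:5d}  training loss {loss:.6f}")
```

In this lazy-training-style setup the second layer is frozen and only the first layer moves, so the printout is meant only to show the qualitative phenomenon of the training loss shrinking despite the non-convex objective.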
Bio:
Jason Lee is an assistant professor in Data Sciences and Operations at the University of Southern California. Prior to that, he was a postdoctoral researcher at UC Berkeley working with Michael Jordan. Jason received his PhD from Stanford University, advised by Trevor Hastie and Jonathan Taylor. His research interests are in statistics, machine learning, and optimization. Lately, he has worked on the foundations of deep learning, non-convex optimization algorithms, and adaptive statistical inference. He received a Sloan Research Fellowship in 2019 and a NIPS Best Student Paper Award for his work.
Lunch for talk attendees will be available at 12:00pm.
To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, 609-258-4624 at least one week prior to the event.