
Activities
Implicit biases of optimization algorithms for neural networks and their effects on generalization
Reporter:
Dr. Chao Ma, Stanford University
Inviter:
Prof. Pingbing Ming
Subject:
Implicit biases of optimization algorithms for neural networks and their effects on generalization
Time and place:
9:00-10:00, December 23 (Friday), Tencent Meeting ID: 247-520-003
Abstract:
Modern neural networks are usually over-parameterized: the number of parameters exceeds the number of training examples. In this regime the loss function tends to have many (or even infinitely many) global minima, which poses an additional challenge for optimization algorithms beyond convergence, namely minima selection. Specifically, when training a neural network, the algorithm must not only find a global minimum, but also select one that generalizes well from among many that do not. In this talk, we connect the implicit bias of optimization algorithms with generalization performance in two steps. First, through a linear stability analysis around global minima, we show that stochastic gradient descent (SGD) favors flat and uniform global minima. Second, we establish a theoretical connection between flatness and generalization performance based on a special multiplicative structure of neural networks. Together, these results show that SGD tends to find global minima with good generalization. We also derive bounds on the generalization error and adversarial robustness in terms of the SGD hyperparameters.
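The flavor of the linear stability argument can be illustrated with a toy one-dimensional sketch (function names are illustrative, and plain gradient descent stands in for SGD): near a quadratic minimum with sharpness a, the iteration is stable under learning rate lr only if a <= 2/lr, so sharper minima repel the iterates while flatter ones retain them.

```python
def gd_near_minimum(sharpness, lr, steps=50, x0=1e-3):
    """Run gradient descent on the quadratic loss f(x) = sharpness/2 * x^2,
    starting from a small perturbation x0 of the global minimum at x = 0.
    The update x <- (1 - lr*sharpness) * x contracts iff sharpness <= 2/lr."""
    x = x0
    for _ in range(steps):
        x -= lr * sharpness * x  # gradient of f is sharpness * x
    return abs(x)

lr = 0.1  # stability threshold on sharpness is 2/lr = 20
flat_residual = gd_near_minimum(sharpness=5.0, lr=lr)    # 5  < 20: stable
sharp_residual = gd_near_minimum(sharpness=25.0, lr=lr)  # 25 > 20: unstable
print(flat_residual < 1e-3, sharp_residual > 1.0)  # True True
```

With minibatch noise the stability condition becomes stricter and depends on batch size and noise structure, which is how the hyperparameter-dependent bounds in the talk arise; this deterministic sketch only captures the curvature threshold.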