School of Mathematical Sciences, Peking University

Probability and Statistics Seminar

Why interpolating neural nets generalize well: recent insights from neural tangent model

Speaker: Yiqiao Zhong (University of Wisconsin-Madison)

Time: 2022-12-15, 9:00-10:00

Venue: Tencent Meeting (436-439-849)

Abstract: A mystery of modern neural networks is their surprising generalization power in the overparametrized regime: they comprise so many parameters that they can interpolate the training set, even if the actual labels are replaced by purely random ones; despite this, they achieve good prediction error on unseen data.
In this talk, we focus on the neural tangent (NT) model for two-layer neural networks, a simplified model. Under isotropic input data, we first show that the interpolation phase transition occurs around Nd ~ n, where Nd is the number of parameters and n is the sample size.
To demystify the generalization puzzle, we consider the min-norm interpolator and show that its test error (generalization error) is largely determined by a smooth, low-degree component. Moreover, we find that the nonlinearity of the activation function has an implicit regularization effect. These results offer new insights into recent discoveries in overparametrized models, such as the double descent phenomenon.
The paper is to appear in the Annals of Statistics. Link to the arXiv paper: https://arxiv.org/abs/2007.12826
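
For readers who want a concrete picture of the objects named in the abstract, the following is a minimal illustrative sketch, not the speaker's code. It assumes a ReLU activation, inputs on the sphere of radius sqrt(d), first-layer weights on the unit sphere, and the NT feature map phi(x) = [sigma'(<w_i, x>) x] for i = 1..N, which has N*d coordinates. The function names (`nt_features`, `min_norm_interpolator`, `max_train_residual`) and the problem sizes are hypothetical choices made for illustration.

```python
import numpy as np


def nt_features(X, W):
    """Assumed NT feature map for a two-layer ReLU network.

    Row j is the concatenation over neurons i of sigma'(<w_i, x_j>) * x_j,
    so the output has shape (n, N * d).
    """
    n, d = X.shape
    N = W.shape[0]
    gate = (X @ W.T > 0).astype(float)  # ReLU derivative, shape (n, N)
    return (gate[:, :, None] * X[:, None, :]).reshape(n, N * d)


def min_norm_interpolator(Phi, y):
    """Minimum l2-norm least-squares solution of Phi @ beta = y."""
    return np.linalg.pinv(Phi) @ y


def max_train_residual(n, d, N, seed=0):
    """Fit purely random +/-1 labels and return the largest training residual."""
    rng = np.random.default_rng(seed)
    # Isotropic inputs, normalized to the sphere of radius sqrt(d).
    X = rng.standard_normal((n, d))
    X *= np.sqrt(d) / np.linalg.norm(X, axis=1, keepdims=True)
    # Random first-layer weights, normalized to the unit sphere.
    W = rng.standard_normal((N, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    y = rng.choice([-1.0, 1.0], size=n)  # random labels
    Phi = nt_features(X, W)
    beta = min_norm_interpolator(Phi, y)
    return float(np.max(np.abs(Phi @ beta - y)))


if __name__ == "__main__":
    # Nd = 400 parameters vs. n = 50 samples: overparametrized.
    print("Nd >> n:", max_train_residual(n=50, d=10, N=40))
    # Nd = 20 parameters vs. n = 60 samples: underparametrized.
    print("Nd <  n:", max_train_residual(n=60, d=10, N=2))
```

In this toy setting, the first call is well past Nd ~ n and the second is below it, so the two residuals contrast the interpolating and non-interpolating sides of the threshold. The sketch only checks training fit; it says nothing about the test-error results discussed in the talk.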

About the Speaker:

Yiqiao Zhong is currently an assistant professor in the Department of Statistics at the University of Wisconsin-Madison. Prior to joining UW-Madison, Yiqiao was a postdoc at Stanford University, advised by Prof. Andrea Montanari and Prof. David Donoho. His research interests include deep learning theory, high-dimensional statistics, and optimization. Yiqiao Zhong obtained his Ph.D. in 2019 from Princeton University, where he was advised by Prof. Jianqing Fan. In 2014, Yiqiao graduated from Peking University with a major in mathematics.

Online: Tencent Meeting (ID: 436-439-849)

Meeting Link: https://meeting.tencent.com/dm/WwJOpWgGciie
