School of Mathematical Sciences, Peking University

Seminar

Machine Learning Lab PhD Student Forum Series (Session 13): Implicit bias and implicit acceleration in deep learning

Speaker: Chuhan Xie (PKU)

Time: 2021-09-01 15:10-16:10

Venue: Main Conference Room, 1st Floor, Jingyuan Courtyard 6, Peking University & Tencent Meeting ID 914 9325 5784

Abstract: One of the major research directions in deep learning theory is to explain why a learned neural network model can generalize well even when it is highly overparametrized. Recent work sheds light on this question: the optimization algorithm (e.g., gradient descent) used in training is biased towards simple solutions with good generalization performance. This phenomenon is called implicit bias. More precisely, implicit bias states that the iterates converge to a solution that minimizes a regularization function Q(·) under certain constraints. In this talk, we will first introduce the precise definition of implicit bias, and then review recent research characterizing it for different learning tasks (e.g., classification, matrix factorization) and different optimization methods (e.g., adaptive algorithms). In addition, we will introduce "implicit acceleration", a phenomenon in which overparametrization sometimes accelerates training. We take deep linear networks as an example to illustrate that vanilla gradient descent on such an overparametrized model is equivalent to gradient descent with momentum and an adaptive learning rate on the original model.
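
As a rough illustration of the implicit-bias statement in the abstract, the sketch below runs plain gradient descent on an underdetermined least-squares problem. The problem data, the helper names (`gradient_descent`, `min_norm_solution`), the step size, and the choice Q(w) = ||w||_2^2 are assumptions made for this example and are not taken from the talk. Starting from zero, gradient descent on this loss is known to converge to the interpolating solution of minimum Euclidean norm, which the code checks against the closed form A^T (A A^T)^{-1} b.

```python
# Illustrative sketch (not from the talk): gradient descent on
# L(w) = 0.5 * ||A w - b||^2 with more unknowns than equations.
# Assumed regularizer being implicitly minimized: Q(w) = ||w||_2^2.

A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]  # 2 equations, 3 unknowns: infinitely many exact solutions
b = [1.0, 2.0]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def gradient_descent(A, b, lr=0.1, steps=5000):
    """Plain gradient descent from zero initialization."""
    w = [0.0] * len(A[0])  # starting at zero keeps every iterate in the row space of A
    At = transpose(A)
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(A, w), b)]
        grad = matvec(At, residual)  # gradient of 0.5 * ||A w - b||^2
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def min_norm_solution(A, b):
    """Closed form A^T (A A^T)^{-1} b for a matrix with two independent rows."""
    G = [[dot(ri, rj) for rj in A] for ri in A]  # 2x2 Gram matrix A A^T
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    y = [(G[1][1] * b[0] - G[0][1] * b[1]) / det,
         (G[0][0] * b[1] - G[1][0] * b[0]) / det]
    return matvec(transpose(A), y)


w_gd = gradient_descent(A, b)
w_star = min_norm_solution(A, b)
print("gradient descent limit:", w_gd)
print("minimum-norm solution: ", w_star)
```

Among all w with A w = b, the two printed vectors should agree up to numerical error. With a nonzero initialization the limit would generally differ, which is one reason the precise statement of implicit bias depends on the initialization and the algorithm.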
