Wei Lin @ PKU

00102892: Statistical Learning

Course Description

This introductory statistical machine learning course is designed for graduate and advanced undergraduate students in statistics and probability, applied mathematics, and other fields involving learning from data. The course covers fundamental principles and methodology of machine learning, including model selection and regularization, classification and regression, support vector machines, kernel methods, boosting, clustering, and dimension reduction. If time permits, selected topics from the following will also be discussed: random forests, graphical models, ranking, online learning, and reinforcement learning. The PAC/nonasymptotic framework for machine learning theory will be introduced and developed throughout the semester.

Syllabus

Lectures and Assignments

Week | Date | Topics | References | Assignments | Notes
1 | 9/15 | Introduction, no-free-lunch theorem | FML Chap. 1, ESL Chap. 1, UML Sec. 5.1 | |
2 | 9/22 | PAC/nonasymptotic framework | FML Sec. 2.1 | |
 | 9/24 | Finite hypothesis sets, Rademacher complexity | FML Secs. 2.2–2.4, 3.1 | FML 2.1, 2.3, 2.7, 2.9, 2.10, 2.12 |
3 | 9/29 | VC dimension | FML Secs. 3.2–3.4 | FML 3.3, 3.8, 3.12, 3.16, 3.23, 3.30 | Homework 1 complete, due 10/13
4 | 10/6 | No class | | |
 | 10/8 | Model selection and regularization | FML Secs. 4.1–4.6, ESL Secs. 7.1–7.3 | |
5 | 10/13 | Information criteria, bootstrap methods | ESL Secs. 7.4–7.7, 7.11 | FML 4.1, ESL 7.2 |
6 | 10/20 | Linear regression and discriminant analysis | ESL Secs. 3.2, 4.1–4.3 | |
 | 10/22 | Logistic regression, best subset selection, ridge regression | ESL Secs. 4.4, 3.3, 3.4.1 | ESL 3.4, 3.6, 3.8, 3.9, 3.11, 3.12, 3.29, 4.2, 4.3, 4.5 | Homework 2 complete, due 11/5
7 | 10/27 | Lasso and its variants | ESL Secs. 3.4.2–3.4.4, 3.8 | |
8 | 11/3 | Theory for Lasso | Wainwright Secs. 7.3, 7.4 | |
 | 11/5 | Extensions of Lasso, support vector machines | ESL Secs. 3.7, 4.5, 12.2, Wainwright Chap. 10 | ESL 3.16, 3.30 |
9 | 11/10 | Kernel methods | Wainwright Chap. 12, ESL Sec. 12.3 | |
10 | 11/17 | Margin theory, boosting | FML Sec. 5.4, ESL Secs. 10.1–10.6 | FML 5.1, 5.6, 5.7, 6.2(d)(g)(h), 6.11, 6.13, 11.7 |
 | 11/19 | Theory and regularization for boosting | ESL Secs. 10.10, 10.12, FML Secs. 7.3, 7.4 | FML 7.1, 7.2, 7.8 | Homework 3 complete, due 12/3
11 | 11/24 | Clustering, dimension reduction | ESL Secs. 14.3, 14.5, 14.6 | |
12 | 12/1 | Midterm exam | | | Mean = 34, median = 37, Q1 = 24.5, Q3 = 44.5, high score = 68
 | 12/3 | Gaussian graphical models | ESL Secs. 17.1–17.3, Wainwright Secs. 11.1, 11.2 | Wainwright 11.1, 11.2 |
13 | 12/8 | Directed acyclic graphs | Kalisch and Bühlmann (2007) | Supplementary problem |
14 | 12/15 | Random forests | ESL Sec. 9.2, Chap. 15 | ESL 15.5 | Homework 4 complete, due 12/29
 | 12/17 | Reinforcement learning | FML Chap. 17 | |
15 | 12/22 | Oral presentations | | | Written report due 12/26