00102892: Statistical Learning
Course Description
This introductory statistical machine learning course is designed for graduate and advanced undergraduate students in statistics and probability,
applied mathematics, and other fields that involve learning from data. The course covers the fundamental principles and methodology of machine
learning, including model selection and regularization, classification and regression, support vector machines, kernel methods, boosting,
clustering, and dimension reduction. If time permits, selected topics from the following will also be discussed: random forests, graphical models,
ranking, online learning, and reinforcement learning. The PAC/nonasymptotic framework for machine learning theory will be introduced and developed
throughout the semester.
Syllabus
Lectures and Assignments
Textbook abbreviations: FML = Mohri, Rostamizadeh, and Talwalkar, Foundations of Machine Learning; ESL = Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning; UML = Shalev-Shwartz and Ben-David, Understanding Machine Learning; Wainwright = Wainwright, High-Dimensional Statistics.
Week | Date | Topics | References | Assignments | Notes |
1 | 9/15 | Introduction, no-free-lunch theorem | FML Chap. 1, ESL Chap. 1, UML Sec. 5.1 | | |
2 | 9/22 | PAC/nonasymptotic framework | FML Sec. 2.1 | | |
| 9/24 | Finite hypothesis sets, Rademacher complexity | FML Secs. 2.2–2.4, 3.1 | FML 2.1, 2.3, 2.7, 2.9, 2.10, 2.12 | |
3 | 9/29 | VC dimension | FML Secs. 3.2–3.4 | FML 3.3, 3.8, 3.12, 3.16, 3.23, 3.30 | Homework 1 assigned in full, due 10/13 |
4 | 10/6 | No class | | | |
| 10/8 | Model selection and regularization | FML Secs. 4.1–4.6, ESL Secs. 7.1–7.3 | | |
5 | 10/13 | Information criteria, bootstrap methods | ESL Secs. 7.4–7.7, 7.11 | FML 4.1, ESL 7.2 | |
6 | 10/20 | Linear regression and discriminant analysis | ESL Secs. 3.2, 4.1–4.3 | | |
| 10/22 | Logistic regression, best subset selection, ridge regression | ESL Secs. 4.4, 3.3, 3.4.1 | ESL 3.4, 3.6, 3.8, 3.9, 3.11, 3.12, 3.29, 4.2, 4.3, 4.5 | Homework 2 assigned in full, due 11/5 |
7 | 10/27 | Lasso and its variants | ESL Secs. 3.4.2–3.4.4, 3.8 | | |
8 | 11/3 | Theory for Lasso | Wainwright Secs. 7.3, 7.4 | | |
| 11/5 | Extensions of Lasso, support vector machines | ESL Secs. 3.7, 4.5, 12.2, Wainwright Chap. 10 | ESL 3.16, 3.30 | |
9 | 11/10 | Kernel methods | Wainwright Chap. 12, ESL Sec. 12.3 | | |
10 | 11/17 | Margin theory, boosting | FML Sec. 5.4, ESL Secs. 10.1–10.6 | FML 5.1, 5.6, 5.7, 6.2(d)(g)(h), 6.11, 6.13, 11.7 | |
| 11/19 | Theory and regularization for boosting | ESL Secs. 10.10, 10.12, FML Secs. 7.3, 7.4 | FML 7.1, 7.2, 7.8 | Homework 3 assigned in full, due 12/3 |
11 | 11/24 | Clustering, dimension reduction | ESL Secs. 14.3, 14.5, 14.6 | | |
12 | 12/1 | Midterm exam | | | Mean = 34, median = 37, Q1 = 24.5, Q3 = 44.5, high score = 68 |
| 12/3 | Gaussian graphical models | ESL Secs. 17.1–17.3, Wainwright Secs. 11.1, 11.2 | Wainwright 11.1, 11.2 | |
13 | 12/8 | Directed acyclic graphs | Kalisch and Bühlmann (2007) | Supplementary problem | |
14 | 12/15 | Random forests | ESL Sec. 9.2, Chap. 15 | ESL 15.5 | Homework 4 assigned in full, due 12/29 |
| 12/17 | Reinforcement learning | FML Chap. 17 | | |
15 | 12/22 | Oral presentations | | | Written report due 12/26 |