Wei Lin @ PKU

Math 867: High-Dimensional Data Analysis and Statistical Inference

Course Description

This is a graduate and advanced undergraduate course in high-dimensional statistics. It introduces the fundamental principles of statistical modeling, analysis, and inference for high-dimensional and big data. Topics include high-dimensional regression, large covariance estimation, and large-scale hypothesis testing, along with the necessary mathematical tools, such as concentration inequalities and random matrix theory, and optimization algorithms, such as coordinate descent and the alternating direction method of multipliers. Large data sets of various types will be presented and analyzed.

Syllabus

Announcements

  • The class on May 6 has been moved to Saturday, May 9, 3:10–5:50 pm, in 1114 Science Building 1. If possible, please attend the short course by Professor Alan Gelfand.

Lecture Schedule

Week | Date | Topic | References
1 | March 4 | Introduction | Fan & Lv (2010), Boehm Vock et al. (2015)
2 | March 11 | Concentration inequalities; Sparse linear regression | Boucheron, Lugosi & Massart Chapters 1 & 2; Zhao & Yu (2006), Wainwright (2009)
3 | March 18 | Sparse linear regression; Sparse GLMs | Bickel, Ritov & Tsybakov (2009), Raskutti, Wainwright & Yu (2010); Fan & Lv (2011)
4 | March 25 | Group Lasso; Structured sparsity | Yuan & Lin (2006), Ravikumar et al. (2009); Bien, Taylor & Tibshirani (2013)
5 | April 1 | Large-scale optimization; Discussion: Microbiome data analysis | Friedman et al. (2007), Boyd et al. (2011); Chen & Li (2013)
6 | April 8 | Variable screening; Bayesian Lasso; Discussion: Chromatographic fingerprints | Fan & Lv (2008); Park & Casella (2008); Wierzbicki et al. (2014)
7 | April 15 | Sparse covariance estimation | Bickel & Levina (2008), Rothman, Levina & Zhu (2009), Cai & Liu (2011)
8 | April 22 | Sparse inverse covariance estimation | Yuan & Lin (2007), Friedman, Hastie & Tibshirani (2008), Cai, Liu & Luo (2011)
9 | April 29 | Consistency of PCA; Sparse PCA; Discussion: Text analysis | Johnstone & Lu (2009); Shen & Huang (2008), Amini & Wainwright (2009), Ma (2013); Taddy (2013)
10 | May 9 | Matrix perturbation theory; Random matrix theory | Stewart & Sun (1990), Yu, Wang & Samworth (2015); Vershynin (2012)
11 | May 13 | Low-rank matrix recovery; Discussion: Star formation in galaxies | Negahban & Wainwright (2011), Negahban & Wainwright (2012); Richards et al. (2012)
12 | May 20 | False discovery rate control | Efron Chapter 4, Benjamini (2010)
13 | May 27 | Two-sample mean tests; Two-sample covariance tests; Discussion: Topological inference in neuroimaging | Chen & Qin (2010), Cai, Liu & Xia (2014); Li & Chen (2012), Cai, Liu & Xia (2013); Kilner & Friston (2010)
14 | June 3 | Scaled Lasso; Confidence intervals and tests; Discussion: Change detection in remote sensing | Sun & Zhang (2012), Belloni, Chernozhukov & Wang (2011); van de Geer et al. (2014), Zhang & Zhang (2014), Javanmard & Montanari (2014); Clements et al. (2014)
15 | June 10 | Office hours (schedule) |
16 | June 17 | Presentations (schedule) |

Homework and Projects