Math 867: High-Dimensional Data Analysis and Statistical Inference
Course Description
This is a course in high-dimensional statistics for graduate and advanced undergraduate students, introducing the fundamental principles of statistical
modeling, analysis, and inference for high-dimensional and big data. High-dimensional regression, large covariance estimation, and large-scale
hypothesis testing will be covered, along with necessary mathematical tools such as concentration inequalities and random matrix theory, as well as
optimization algorithms such as coordinate descent and the alternating direction method of multipliers. Large data sets of various types will be
presented and analyzed.
Syllabus
Announcements
The class on May 6 has been moved to Saturday, May 9, 3:10–5:50 pm, 1114 Science Building 1. If possible, please attend the
short course by Professor Alan Gelfand.
Lecture Schedule
Week | Date | Topic | References |
1 | March 4 | Introduction | Fan & Lv (2010), Boehm Vock et al. (2015) |
2 | March 11 | Concentration inequalities; Sparse linear regression | Boucheron, Lugosi & Massart Chapters 1 & 2; Zhao & Yu (2006), Wainwright (2009) |
3 | March 18 | Sparse linear regression; Sparse GLMs | Bickel, Ritov & Tsybakov (2009), Raskutti, Wainwright & Yu (2010); Fan & Lv (2011) |
4 | March 25 | Group Lasso; Structured sparsity | Yuan & Lin (2006), Ravikumar et al. (2009); Bien, Taylor & Tibshirani (2013) |
5 | April 1 | Large-scale optimization; Discussion: Microbiome data analysis | Friedman et al. (2007), Boyd et al. (2011); Chen & Li (2013) |
6 | April 8 | Variable screening; Bayesian Lasso; Discussion: Chromatographic fingerprints | Fan & Lv (2008); Park & Casella (2008); Wierzbicki et al. (2014) |
7 | April 15 | Sparse covariance estimation | Bickel & Levina (2008), Rothman, Levina & Zhu (2009), Cai & Liu (2011) |
8 | April 22 | Sparse inverse covariance estimation | Yuan & Lin (2007), Friedman, Hastie & Tibshirani (2008), Cai, Liu & Luo (2011) |
9 | April 29 | Consistency of PCA; Sparse PCA; Discussion: Text analysis | Johnstone & Lu (2009); Shen & Huang (2008), Amini & Wainwright (2009), Ma (2013); Taddy (2013) |
10 | May 9 | Matrix perturbation theory; Random matrix theory | Stewart & Sun (1990), Yu, Wang & Samworth (2015); Vershynin (2012) |
11 | May 13 | Low-rank matrix recovery; Discussion: Star formation in galaxies | Negahban & Wainwright (2011), Negahban & Wainwright (2012); Richards et al. (2012) |
12 | May 20 | False discovery rate control | Efron Chapter 4, Benjamini (2010) |
13 | May 27 | Two-sample mean tests; Two-sample covariance tests; Discussion: Topological inference in neuroimaging | Chen & Qin (2010), Cai, Liu & Xia (2014); Li & Chen (2012), Cai, Liu & Xia (2013); Kilner & Friston (2010) |
14 | June 3 | Scaled Lasso; Confidence intervals and tests; Discussion: Change detection in remote sensing | Sun & Zhang (2012), Belloni, Chernozhukov & Wang (2011); van de Geer et al. (2014), Zhang & Zhang (2014), Javanmard & Montanari (2014); Clements et al. (2014) |
15 | June 10 | Office hours (schedule) | |
16 | June 17 | Presentations (schedule) | |
Homework and Projects