Wei Lin @ PKU

00133110: Applied Regression Analysis

Course Description

This is an undergraduate-level course for students majoring in statistics, probability, or any other field where applied statistics plays an essential role. Methodology and theory for linear regression will be introduced, illustrated by examples and applications. Extensions and advanced topics, such as categorical predictors, polynomial regression, analysis of variance, weighted least squares, mixed models, transformations, regression diagnostics, variable selection, nonlinear regression, and generalized linear models, will be covered if time permits.

Syllabus

Lectures and Assignments

| Week | Date | Topics | References | Assignments | Notes |
|---|---|---|---|---|---|
| 1 | 2/21 | Scatterplots and regression | Weisberg Chap. 1 | | |
| 2 | 2/28 | Random vectors, quadratic forms, moment generating functions | Seber & Lee Secs. 1.4–1.6 | | |
| | 3/2 | Independence, multivariate normal distribution | Seber & Lee Secs. 1.6–2.2 | | |
| 3 | 3/7 | Independence and quadratic forms under the multivariate normal, linear regression | Seber & Lee Secs. 2.3–3.1 | Seber & Lee 1a.1, 4; 1b.2, 5; 1c.2; 1m.3, 5; 2a.1, 3; 2b.8; 2c.2; 2d.2, 4; 2m.5, 8, 12 (errata) | Homework 1 due 3/16 |
| 4 | 3/14 | Least squares | Seber & Lee Sec. 3.1 | | |
| | 3/16 | Properties of least squares | Seber & Lee Secs. 3.2–3.5 | Seber & Lee 3a.1, 6, 8; 3b.1, 4; 3c.1, 2; 3d.2 | |
| 5 | 3/21 | Generalized least squares, adding predictors | Seber & Lee Secs. 3.10, 3.7 | | |
| 6 | 3/28 | Adding cases, F-tests in linear regression | Seber & Lee Secs. 11.6, 3.8, 4.1, 4.3 | Seber & Lee 3f.1, 4; 3g.1, 3; 3k.4, 5; 3misc.4, 9 | |
| | 3/30 | Likelihood ratio tests, multiple correlation coefficient | Seber & Lee Secs. 4.2, 4.4 | | |
| 7 | 4/2 | Goodness-of-fit tests, simultaneous confidence intervals | Seber & Lee Secs. 4.6, 5.1 | Seber & Lee 4a.5; 4b.3; 4c.1; 4m.1, 2, 4 | Homework 2 due 4/11 |
| 8 | 4/11 | Scheffé's method, prediction intervals and bands | Seber & Lee Secs. 5.1–5.3 | | |
| | 4/13 | More examples on straight lines | Seber & Lee Sec. 6.1 | Seber & Lee 5m.4, 5, 6; 6a.3, 4 | |
| 9 | 4/18 | Midterm exam | | | Mean = 63, median = 64.5, Q1 = 52, Q3 = 80, high score = 94 |
| 10 | 4/25 | Comparing straight lines, two-phase regression | Seber & Lee Secs. 6.4, 6.5 | Seber & Lee 6c.3; 6m.3 | |
| | 4/27 | One-way ANOVA | Seber & Lee Secs. 8.1, 8.2 | | |
| 11 | 5/2 | No class | | | |
| 12 | 5/9 | Two-way ANOVA | Seber & Lee Secs. 8.3–8.5 | Seber & Lee 8a.4, 5; 8b.2; 8c.1; 8e.3; 8m.3, 5 | Homework 3 due 5/16 |
| | 5/11 | Bias due to underfitting/overfitting, misspecified covariance matrix, outliers | Seber & Lee Secs. 9.1–9.4 | | |
| 13 | 5/16 | Robustness to nonnormality, random predictors | Seber & Lee Secs. 9.5, 9.6 | | |
| 14 | 5/23 | Measurement error, collinearity | Seber & Lee Secs. 9.6, 9.7 | Seber & Lee 9a.3; 9b.2; 9m.2, 3 | |
| | 5/25 | Diagnostic quantities, diagnostics for regression surfaces | Seber & Lee Secs. 10.1–10.3 | | |
| 15 | 5/30 | Variable importance, diagnostics for variance functions, outliers, and collinearity | Seber & Lee Secs. 10.4, 10.6, 10.7 | Seber & Lee 10a.3; 10b.1; 10f.2; 10m.1, 2, 3; Lab | Homework 4 & Lab due 6/13 |
| 16 | 6/6 | Generalized linear models | Casella & Berger Sec. 12.3 | | |
| | 6/8 | Robust regression | Seber & Lee Sec. 3.13 | | |
| 18 | 6/20 | Final exam | | | Mean = 45, median = 48, Q1 = 30.5, Q3 = 58.5, high score = 83 |