Wei Lin @ PKU

00103335: Deep Learning and Reinforcement Learning

Course Description

As highly successful and widely applied machine learning methods, deep learning and reinforcement learning are the core techniques underlying the latest major breakthroughs in the field of AI. Building on the general principles and methodology of machine learning and motivated by important practical problems, this course will introduce the basic concepts and methods, mathematical foundations and theory, optimization algorithms, and applications and case studies of deep learning and reinforcement learning. The part on deep learning will cover feedforward neural networks, regularization and optimization for deep learning, convolutional neural networks, recurrent neural networks, and autoencoders and generative models; the part on reinforcement learning will cover multi-armed bandits, Markov decision processes, dynamic programming, Monte Carlo methods, temporal difference learning, and deep reinforcement learning.

Syllabus

Lectures and Assignments

| Week | Date | Topics | References | Assignments | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | 9/9 | Introduction, machine learning basics | DL Sec. 5.1 | | |
| 2 | 9/14 | Generalization, bias–variance trade-off | DL Secs. 5.2–5.4 | | See Belkin et al. (2019) for the double descent phenomenon and Hastie et al. (2022) for a theory in linear models. |
| | 9/16 | Inference principles, feedforward networks | DL Secs. 5.4–5.11, 6.1–6.3 | | |
| 3 | 9/23 | Universal approximation, back-propagation | DL Secs. 6.4–6.6 | | |
| 4 | 9/28 | Regularization for DL, weight decay | DL Secs. 7.1–7.5 | | |
| | 9/30 | Early stopping, dropout | DL Secs. 7.8–7.14 | Homework 1 | |
| 5 | 10/7 | Optimization for DL | DL Secs. 8.1–8.3 | | |
| 6 | 10/12 | Initialization, adaptive algorithms | DL Secs. 8.4–8.7 | | |
| | 10/14 | Convolutional networks | DL Chap. 9 | | See He et al. (2016) for residual networks. |
| 7 | 10/21 | Recurrent networks | DL Chap. 10 | Homework 2 | |
| 8 | 10/26 | Reinforcement learning basics, multi-armed bandits | RL Chaps. 1 & 2 | | |
| | 10/28 | Markov decision processes | RL Chap. 3 | | |
| 9 | 11/4 | Dynamic programming | RL Chap. 4 | | |
| 10 | 11/9 | Monte Carlo methods | RL Chap. 5 | | Recent progress on the open question of the convergence of MCES was made by Wang et al. (2022). |
| | 11/11 | Temporal difference learning | RL Chap. 6, Secs. 7.1–7.3, 12.1 | Homework 3 | |
| 11 | 11/18 | Midterm exam | | | Mean = 43, median = 43.5, Q1 = 34, Q3 = 54.5, high score = 75 |
| 12 | 11/23 | Autoencoders, approximate inference | DL Chaps. 14 & 19 | | |
| | 11/25 | Deep generative models | DL Chap. 20 | | For diffusion models, see Sohl-Dickstein et al. (2015) and Ho, Jain & Abbeel (2020). |
| 13 | 12/2 | On-policy prediction with approximation | RL Chap. 9 | | |
| 14 | 12/7 | On-policy control with approximation, policy gradient methods | RL Chaps. 10 & 13 | Homework 4 | |
| | 12/9 | Oral presentations | | | |
| 15 | 12/16 | Oral presentations | | Written report due 12/21 | |