|
Intro |
|
|
1/3 Tu |
Introduction, neural network basics (on Zoom)
|
Chapters 1-4 of Dive into Deep Learning, Zhang et al.
https://playground.tensorflow.org/
|
Lecture 1,
Lecture 1 (annotated)
|
|
Approximation Theory |
|
|
1/5 Th |
1D and multivariate approximation (on Zoom)
|
Chapters 1, 2 of Matus Telgarsky's notes |
Lecture 2, Lecture 2 (annotated),
scribed notes on 1D, multivariate approximation, and Barron's theory
|
1/10 Tu |
Barron's theory, depth separation
|
Chapters 3, 5 of Matus Telgarsky's notes
|
Lecture 3, Lecture 3 (annotated)
|
|
Optimization |
|
|
1/12 Th |
Backpropagation, auto-differentiation, Clarke differential
|
Chapter 4 of Dive into Deep Learning, Zhang et al., Chapter 9 of Matus Telgarsky's notes |
Lecture 4, Lecture 4 (annotated)
|
1/17 Tu |
Auto-balancing, advanced optimizers
|
Chapter 9 of Matus Telgarsky's notes,
Chapter 12 of Dive into Deep Learning, Zhang et al.,
Du et al. on auto-balancing, Optimizer visualization
|
Lecture 5,
Lecture 5 (annotated),
scribed notes on Clarke differential, positive homogeneity and auto-balancing
|
1/19 Th |
Techniques for improving optimization,
optimization landscape
|
Chapter 12 of Dive into Deep Learning, Zhang et al., He et al. on Kaiming initialization,
blog on escaping saddle points, blog on how to escape saddle points efficiently
|
Lecture 6, Lecture 6 (annotated), scribed notes on Kaiming initialization
|
1/24 Tu |
Optimization landscape, global convergence of gradient descent
|
blog on escaping saddle points, blog on how to escape saddle points efficiently, Du et al. on global convergence of gradient descent
|
Lecture 7, Lecture 7 (annotated), scribed notes on global convergence of gradient descent
|
1/26 Th |
Finish the proof of global convergence of gradient descent, neural tangent kernel
|
Du et al. on global convergence of gradient descent, Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel
|
Lecture 8 Part 2, Lecture 8 Part 1 (annotated), Lecture 8 Part 2 (annotated)
|
|
Generalization |
|
|
1/31 Tu |
Measures of generalization, techniques for improving generalization, generalization bounds for deep learning, double descent, implicit bias
|
Zhang et al. on rethinking generalization in deep learning, Chapter 5 of Dive into Deep Learning, Zhang et al., Chapters 10-14 of Matus Telgarsky's notes,
Jiang et al. on different generalization measures, Belkin et al. on double descent
|
Lecture 9, Lecture 9 (annotated)
|
2/2 Th |
Separation between neural network and kernel, introduction to convolutional neural networks
|
Allen-Zhu and Li on separation between neural networks and kernels, Chapters 7, 8 of Dive into Deep Learning, Zhang et al.
|
Lecture 10, Lecture 10 (annotated), scribed notes on separation between NN and kernel
|
|
Neural Network Architecture |
|
|
2/7 Tu |
Advanced convolutional neural networks, recurrent neural networks
|
Chapters 7, 8, 9 of Dive into Deep Learning, Zhang et al.
|
Lecture 11, Lecture 11 (annotated)
|
2/9 Th |
LSTM, attention mechanism
|
Chapters 10, 11 of Dive into Deep Learning, Zhang et al.
|
Lecture 12, Lecture 12 (annotated)
|
|
Generative Models |
|
|
2/14 Tu |
Transformer, desiderata for generative models, GAN
|
Chapter 20 of Dive into Deep Learning, Zhang et al.
|
Lecture 13 part 1, Lecture 13 part 1 (annotated), Lecture 13 part 2, Lecture 13 part 2 (annotated)
|
2/16 Th |
Math behind GAN, variational autoencoder, energy models
|
Chapter 20 of Dive into Deep Learning, Zhang et al.
|
Lecture 14, Lecture 14 (annotated)
|
2/21 Tu |
Normalizing flows, score-based models, diffusion models
|
Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models.
|
Lecture 15, Lecture 15 (annotated)
|
|
Representation Learning |
|
|
2/23 Th |
Desiderata for representation learning, word embedding, auto-encoder, self-supervised learning, contrastive learning
|
Bengio et al. on representation learning, Chapter 11 of Dive into Deep Learning, Zhang et al.
|
Lecture 16 part 1 (annotated), Lecture 16 part 2, Lecture 16 part 2 (annotated)
|
2/28 Tu |
Multi-task representation learning, active representation learning (guest lecture by Yifang Chen, on Zoom)
|
|
Lecture 17
|
3/2 Th |
Understanding neural network training from the perspective of feature learning (guest lecture by Ruoqi Shen, on Zoom)
|
|
Lecture 18
|
|
Course Presentations |
|
|
3/7 Tu |
Project Presentation (on Zoom)
|
|
|
3/9 Th |
Project Presentation (on Zoom)
|
|
|