Intro

3/29 Tu
Introduction, neural network basics
Readings: Chapters 1-4 of Dive into Deep Learning, Zhang et al.; https://playground.tensorflow.org/
Lecture materials: Lecture 1, Lecture 1 (annotated)

Approximation Theory

3/31 Th
1D and multivariate approximation
Readings: Chapters 1-2 of Matus Telgarsky's notes
Lecture materials: Lecture 2, Lecture 2 (annotated), scribed notes on 1D and multivariate approximation

4/5 Tu
Barron's theory, depth separation
Readings: Chapters 3 and 5 of Matus Telgarsky's notes
Lecture materials: Lecture 3, Lecture 3 (annotated)

Optimization

4/7 Th
Backpropagation, auto-differentiation
Readings: Chapter 4 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 4, Lecture 4 (annotated)

4/12 Tu
Clarke differential, auto-balancing
Readings: Chapter 9 of Matus Telgarsky's notes; Chapter 11 of Dive into Deep Learning, Zhang et al.; Du et al. on auto-balancing
Lecture materials: Lecture 5, Lecture 5 (annotated), scribed notes on the Clarke differential, positive homogeneity, and auto-balancing

4/14 Th
Advanced optimizers, techniques for improving optimization
Readings: Optimizer visualization; Chapter 11 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 6, Lecture 6 (annotated)

4/19 Tu
Techniques for improving optimization, optimization landscape
Readings: He et al. on Kaiming initialization; blog on escaping saddle points; blog on how to escape saddle points efficiently
Lecture materials: Lecture 7, Lecture 7 (annotated)

4/21 Th
Optimization landscape (cont'd), global convergence of gradient descent
Readings: Du et al. on global convergence of gradient descent
Lecture materials: Lecture 8, Lecture 8 (annotated)

Generalization

4/26 Tu
Finish the proof of global convergence of gradient descent; neural tangent kernel; measures of generalization
Readings: Jacot et al. on the Neural Tangent Kernel; Arora et al. on the Neural Tangent Kernel; Zhang et al. on rethinking generalization in deep learning
Lecture materials: Proof of global convergence of gradient descent (annotated), Lecture 9, Lecture 9 (annotated), scribed notes on global convergence of gradient descent

4/28 Th
Techniques for improving generalization, generalization bounds for deep learning
Readings: Chapter 4 of Dive into Deep Learning, Zhang et al.; Chapters 11-13 of Matus Telgarsky's notes
Lecture materials: Lecture 10, Lecture 10 (annotated)

5/3 Tu
Norm-based generalization bounds, double descent, implicit bias
Readings: Chapters 10 and 14 of Matus Telgarsky's notes; Jiang et al. on different generalization measures; Belkin et al. on double descent
Lecture materials: Lecture 11, Lecture 11 (annotated)

Neural Network Architecture

5/5 Th
Separation between neural networks and kernels; intro to convolutional neural networks
Readings: Chapters 6-7 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 12, Lecture 12 (annotated), scribed notes on the separation between NN and kernel

5/10 Tu
Convolutional neural networks, recurrent neural networks
Readings: Chapters 6-8 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 13, Lecture 13 (annotated)

5/12 Th
LSTM, attention mechanism
Readings: Chapters 9-10 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 14, Lecture 14 (annotated)

Representation Learning

5/17 Tu
Transformer; desiderata for representation learning; theory for multi-task representation learning
Readings: Chapter 10 of Dive into Deep Learning, Zhang et al.; Du et al. on representation learning; Tripuraneni et al. on representation learning
Lecture materials: Lecture 15 on transformer, Lecture 15 on representation learning, Lecture 15 on transformer (annotated), Lecture 15 on representation learning (annotated)

5/19 Th
Word embedding, auto-encoder, self-supervised learning, contrastive learning
Readings: Bengio et al. on representation learning
Lecture materials: Lecture 16, Lecture 16 (annotated)

Generative Models

5/24 Tu
Desiderata for generative models, GAN (on Zoom)
Readings: Chapter 17 of Dive into Deep Learning, Zhang et al.
Lecture materials: Lecture 17, Lecture 17 (annotated)

5/26 Th
Variational autoencoder, energy models, normalizing flows (on Zoom)
Lecture materials: Lecture 18, Lecture 18 (annotated)

Course Presentations

5/31 Tu
Project Presentation (on Zoom)

6/2 Th
Project Presentation (on Zoom)