| Date | Topics | Readings | Slides & Notes |
| --- | --- | --- | --- |
| 1/8 Th | Introduction, machine learning review | Chapters 1–4 of Dive into Deep Learning, Zhang et al.; https://playground.tensorflow.org/ | Lecture 1, Lecture 1 (annotated) |
| 1/15 Th | Fully-connected neural networks, optimization algorithms, optimization techniques | Chapters 5, 6, 12 of Dive into Deep Learning, Zhang et al. | Lecture 2, Lecture 2 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing |
| 1/22 Th | Advanced optimizers, optimization techniques | Chapter 12 of Dive into Deep Learning, Zhang et al.; He et al. on Kaiming initialization; blog on escaping saddle points; blog on how to escape saddle points efficiently | Lecture 3, Lecture 3 (annotated), scribed notes on Kaiming initialization |
| 1/29 Th | Introduction to convolutional neural networks, advanced convolutional neural networks, recurrent neural networks, LSTM | Chapters 7, 8, 9, 10 of Dive into Deep Learning, Zhang et al. | Lecture 4, Lecture 4 (annotated) |
| 2/5 Th | Attention mechanism, deep learning theory (approximation) | Chapter 11 of Dive into Deep Learning, Zhang et al.; Chapters 1, 2 of Matus Telgarsky's notes | Lecture 5, Lecture 5 (annotated), scribed notes on 1D and multivariate approximation, and Barron's theory |
| 2/12 Th | Deep learning theory (approximation, optimization), measures of generalization, techniques for improving generalization, generalization theory for deep learning | Du et al. on global convergence of gradient descent; Jacot et al. on Neural Tangent Kernel; Arora et al. on Neural Tangent Kernel; Zhang et al. on rethinking generalization in deep learning; Allen-Zhu and Li on separation between neural networks and kernels; Chapters 10–14 of Matus Telgarsky's notes; Jiang et al. on different generalization measures; Belkin et al. on double descent | Lecture 6, Lecture 6 (annotated), scribed notes on global convergence of gradient descent, scribed notes on separation between NN and kernel |
| 2/19 Th | Pre-training, representation learning | Bengio et al. on representation learning; Chapter 20 of Dive into Deep Learning, Zhang et al.; CLIP paper; Olmo 3 | Lecture 7, Lecture 7 (annotated) |
| 2/26 Th | Desiderata for generative models, variational autoencoder, GAN, energy-based models, normalizing flows, score-based models, diffusion models | Chapter 20 of Dive into Deep Learning, Zhang et al.; Yang Song's blog on score-based models; Lilian Weng's blog on diffusion models | Lecture 8 |
| 3/5 Th | | | |
| 3/12 Th | Project Presentation (Zoom) | | |