Tentative Schedule

Date Content Reading Slides and Notes
Intro
1/8 Th Introduction, machine learning review Chapter 1-4 of Dive into Deep Learning, Zhang et al., https://playground.tensorflow.org/ Lecture 1, Lecture 1 (annotated)
1/15 Th Fully-connected neural networks, optimization algorithms, optimization techniques Chapter 5, 6, 12 of Dive into Deep Learning, Zhang et al. Lecture 2, Lecture 2 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing
1/22 Th Advanced optimizers, optimization techniques Chapter 12 of Dive into Deep Learning, Zhang et al., He et al. on Kaiming initialization, blog on escaping saddle points, blog on how to escape saddle points efficiently Lecture 3, Lecture 3 (annotated), scribed notes on Kaiming initialization
1/29 Th Introduction to convolutional neural networks, advanced convolutional neural networks, recurrent neural networks, LSTM Chapter 7, 8, 9, 10 of Dive into Deep Learning, Zhang et al. Lecture 4, Lecture 4 (annotated)
2/5 Th Attention mechanism, deep learning theory (approximation) Chapter 11 of Dive into Deep Learning, Zhang et al., Chapter 1, 2 of Matus Telgarsky's notes Lecture 5, Lecture 5 (annotated), scribed notes on 1D, multivariate approximation, and Barron's Theory
2/12 Th Deep learning theory (approximation, optimization), measures of generalization, techniques for improving generalization, generalization theory for deep learning Du et al. on global convergence of gradient descent, Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel, Zhang et al. on rethinking generalization in deep learning, Allen-Zhu and Li on separation between neural networks and kernels, Chapter 10 - 14 of Matus Telgarsky's notes, Jiang et al. on different generalization measures, Belkin et al. on double descent Lecture 6, Lecture 6 (annotated), scribed notes on global convergence of gradient descent, scribed notes on separation between NN and kernel
2/19 Th Pre-training, representation learning Bengio et al. on representation learning, Chapter 20 of Dive into Deep Learning, Zhang et al., CLIP paper, Olmo 3 Lecture 7, Lecture 7 (annotated)
2/26 Th Desiderata for generative models, variational autoencoder, GAN, energy-based models, normalizing flows, score-based models, diffusion models Chapter 20 of Dive into Deep Learning, Zhang et al., Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models. Lecture 8
3/5 Th
3/12 Th Project Presentation (Zoom)