Tentative Schedule

Date	Content	Reading	Slides and Notes
	Intro
1/8 Th	Introduction, machine learning review	Chapters 1-4 of Dive into Deep Learning, Zhang et al., https://playground.tensorflow.org/	Lecture 1, Lecture 1 (annotated)
1/15 Th	Fully-connected neural networks, optimization algorithms, optimization techniques	Chapters 5, 6, 12 of Dive into Deep Learning, Zhang et al.	Lecture 2, Lecture 2 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing
1/22 Th	Advanced optimizers, optimization techniques	Chapter 12 of Dive into Deep Learning, Zhang et al., He et al. on Kaiming initialization, blog on escaping saddle points, blog on how to escape saddle points efficiently	Lecture 3, Lecture 3 (annotated), scribed notes on Kaiming initialization
1/29 Th	Introduction to convolutional neural networks, advanced convolutional neural networks, recurrent neural networks, LSTM	Chapters 7, 8, 9, 10 of Dive into Deep Learning, Zhang et al.	Lecture 4, Lecture 4 (annotated)
2/5 Th	Attention mechanism, deep learning theory (approximation)	Chapter 11 of Dive into Deep Learning, Zhang et al., Chapters 1, 2 of Matus Telgarsky's notes	Lecture 5, Lecture 5 (annotated), scribed notes on 1D, multivariate approximation, and Barron's Theory
2/12 Th	Deep learning theory (approximation, optimization), measures of generalization, techniques for improving generalization, generalization theory for deep learning	Du et al. on global convergence of gradient descent, Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel, Zhang et al. on rethinking generalization in deep learning, Allen-Zhu and Li on separation between neural networks and kernels, Chapters 10-14 of Matus Telgarsky's notes, Jiang et al. on different generalization measures, Belkin et al. on double descent	Lecture 6, Lecture 6 (annotated), scribed notes on global convergence of gradient descent, scribed notes on separation between NN and kernel
2/19 Th	Pre-training, representation learning	Bengio et al. on representation learning, Chapter 20 of Dive into Deep Learning, Zhang et al., CLIP paper, Olmo 3	Lecture 7, Lecture 7 (annotated)
2/26 Th	Desiderata for generative models, variational autoencoder, GAN, energy-based models, normalizing flows, score-based models, diffusion models	Chapter 20 of Dive into Deep Learning, Zhang et al., Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models	Lecture 8, Lecture 8 (annotated)
3/5 Th	Deep reinforcement learning, reinforcement learning for large language models	Chapter 17 of Dive into Deep Learning, Zhang et al.	Lecture 9, Lecture 9 (annotated)
3/12 Th	Project Presentation (Zoom)