Tentative Schedule

Date	Content	Reading	Slides and Notes
	Intro
1/3 Tu	Introduction, neural network basics (on Zoom)	Chapters 1-4 of Dive into Deep Learning, Zhang et al., https://playground.tensorflow.org/	Lecture 1, Lecture 1 (annotated)
	Approximation Theory
1/5 Th	1D and multivariate approximation (on Zoom)	Chapters 1, 2 of Matus Telgarsky's notes	Lecture 2, Lecture 2 (annotated), scribed notes on 1D, multivariate approximation, and Barron's theory
1/10 Tu	Barron's theory, depth separation	Chapters 3, 5 of Matus Telgarsky's notes	Lecture 3, Lecture 3 (annotated)
	Optimization
1/12 Th	Backpropagation, auto-differentiation, Clarke differential	Chapter 4 of Dive into Deep Learning, Zhang et al., Chapter 9 of Matus Telgarsky's notes	Lecture 4, Lecture 4 (annotated)
1/17 Tu	Auto-balancing, advanced optimizers	Chapter 9 of Matus Telgarsky's notes, Chapter 12 of Dive into Deep Learning, Zhang et al., Du et al. on auto-balancing, Optimizer visualization	Lecture 5, Lecture 5 (annotated), scribed notes on Clarke differential, positive homogeneity and auto-balancing
1/19 Th	Techniques for improving optimization, optimization landscape	Chapter 12 of Dive into Deep Learning, Zhang et al., He et al. on Kaiming initialization, blog on escaping saddle points, blog on how to escape saddle points efficiently	Lecture 6, Lecture 6 (annotated), scribed notes on Kaiming initialization
1/24 Tu	Optimization landscape, global convergence of gradient descent	blog on escaping saddle points, blog on how to escape saddle points efficiently, Du et al. on global convergence of gradient descent	Lecture 7, Lecture 7 (annotated), scribed notes on global convergence of gradient descent
1/26 Th	Finish the proof of global convergence of gradient descent, neural tangent kernel	Du et al. on global convergence of gradient descent, Jacot et al. on Neural Tangent Kernel, Arora et al. on Neural Tangent Kernel	Lecture 8 Part 2, Lecture 8 Part 1 (annotated), Lecture 8 Part 2 (annotated)
	Generalization
1/31 Tu	Measures of generalization, techniques for improving generalization, generalization bounds for deep learning, double descent, implicit bias	Zhang et al. on rethinking generalization in deep learning, Chapter 5 of Dive into Deep Learning, Zhang et al., Chapters 10-14 of Matus Telgarsky's notes, Jiang et al. on different generalization measures, Belkin et al. on double descent	Lecture 9, Lecture 9 (annotated)
2/2 Th	Separation between neural network and kernel, introduction to convolutional neural networks	Allen-Zhu and Li on separation between neural networks and kernels, Chapters 7, 8 of Dive into Deep Learning, Zhang et al.	Lecture 10, Lecture 10 (annotated), scribed notes on separation between NN and kernel
	Neural Network Architecture
2/7 Tu	Advanced convolutional neural networks, recurrent neural networks	Chapters 7, 8, 9 of Dive into Deep Learning, Zhang et al.	Lecture 11, Lecture 11 (annotated)
2/9 Th	LSTM, attention mechanism	Chapters 10, 11 of Dive into Deep Learning, Zhang et al.	Lecture 12, Lecture 12 (annotated)
	Generative Models
2/14 Tu	Transformer, desiderata for generative models, GAN	Chapter 20 of Dive into Deep Learning, Zhang et al.	Lecture 13 part 1, Lecture 13 part 1 (annotated), Lecture 13 part 2, Lecture 13 part 2 (annotated)
2/16 Th	Math behind GAN, variational autoencoder, energy models	Chapter 20 of Dive into Deep Learning, Zhang et al.	Lecture 14, Lecture 14 (annotated)
2/21 Tu	Normalizing flows, score-based models, diffusion models	Yang Song's blog on score-based models, Lilian Weng's blog on diffusion models.	Lecture 15, Lecture 15 (annotated)
	Representation Learning
2/23 Th	Desiderata for representation learning, word embedding, auto-encoder, self-supervised learning, contrastive learning	Bengio et al. on representation learning, Chapter 11 of Dive into Deep Learning, Zhang et al.	Lecture 16 part 1 (annotated), Lecture 16 part 2, Lecture 16 part 2 (annotated)
2/28 Tu	Multi-task representation learning, active representation learning (guest lecture by Yifang Chen, on Zoom)		Lecture 17
3/2 Th	Understanding neural network training from the perspective of feature learning (guest lecture by Ruoqi Shen, on Zoom)		Lecture 18
	Course Presentations
3/7 Tu	Project Presentation (on Zoom)
3/9 Th	Project Presentation (on Zoom)