September | ||||
---|---|---|---|---|
Monday | Tuesday | Wednesday | Thursday | Friday |
22 | 23 | 24
10:00-11:20 Lecture
MGH 295 Intro and Transformers. Paper (no discussion): Attention Is All You Need. Slides; Recording |
25
16:00-17:00 OH (Stephanie)
CSE1 580 |
26
11:30-12:30 OH (Frank)
Allen 3rd floor breakout
23:59 HW0 due
|
29 | 30 | 01 | 02
16:00-17:00 OH (Stephanie)
CSE1 580 |
03
Guest lecture: Kan Zhu
10:00-11:20 Lecture
MGH 295 GPU architecture and programming
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
October | ||||
---|---|---|---|---|
Monday | Tuesday | Wednesday | Thursday | Friday |
06 | 07 | 08
23:59 Groups and paper signup due
|
09
16:00-17:00 OH (Stephanie)
CSE1 580 |
10
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
13 | 14 | 15
Guest lecture: Frank Zhao
Guest lecture: Aashaka Shah, Roshan Dathathri
23:59 Project plan due
|
16
16:00-17:00 OH (Stephanie)
CSE1 580 |
17
HW2 out
10:00-11:20 Lecture
MGH 295 Scaling I: Memory optimizations and ZeRO. Paper 1: ZeRO; Paper 2: PyTorch FSDP; Optional: Activation checkpointing; Optional: ZeRO-Infinity
11:30-12:30 OH (Frank)
Allen 3rd floor breakout
23:59 HW1 due
|
20 | 21 | 22
10:00-11:20 Lecture
MGH 295 Scaling I: Model parallelism. Paper 1: Megatron-LM; Paper 2: Scaling Megatron-LM; Optional: GPipe; Optional: Ring attention; Optional: Zero Bubble Pipeline Parallelism |
23
16:00-17:00 OH (Stephanie)
CSE1 580 |
24
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
27 | 28 | 29
10:00-11:20 Lecture
MGH 295 Scaling I: Foundation model case studies. Paper 1: PaLM, sections 1-5; Paper 2: DeepSeek-V3, sections 1-4 |
30
16:00-17:00 OH (Stephanie)
CSE1 580 |
31
10:00-11:20 Lecture
MGH 295 Scaling I: Foundation model case studies. Paper 1: Llama3; Paper 2: TorchTitan
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
November | ||||
---|---|---|---|---|
Monday | Tuesday | Wednesday | Thursday | Friday |
03 | 04 | 05
10:00-11:20 Lecture
MGH 295 Post-training: Intro. Paper 1: Tulu 3, sections 1-5; Paper 2: Tulu 3, sections 6-10 |
06
16:00-17:00 OH (Stephanie)
CSE1 580 |
07
Guest lecture: Eric Liang
HW3 out
10:00-11:20 Lecture
MGH 295 Post-training: Systems for RL for LLMs. Paper 1: RLlib; Paper 2: OpenRLHF; Optional: Ray
11:30-12:30 OH (Frank)
Allen 3rd floor breakout
23:59 HW2 due
|
10 | 11
Veterans Day
|
12
Guest lecture: Shishir Patil
|
13
16:00-17:00 OH (Stephanie)
CSE1 580 |
14
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
17 | 18 | 19 | 20
16:00-17:00 OH (Stephanie)
CSE1 580 |
21
10:00-11:20 Lecture
MGH 295 Scaling II: Hardware trends. Paper 1: Power stabilization; Paper 2: Semianalysis: H100 vs GB200 (link TBA)
11:30-12:30 OH (Frank)
Allen 3rd floor breakout |
24 | 25 | 26
Project time, NO CLASS
|
27
Thanksgiving
|
28
Native American Heritage Day
|
December | ||||
---|---|---|---|---|
Monday | Tuesday | Wednesday | Thursday | Friday |
01 | 02 | 03
10:00-11:20 Lecture
MGH 295 Scaling II: Multimodal systems. Paper 1: Diffusion transformers; Paper 2: Chameleon |
04
16:00-17:00 OH (Stephanie)
CSE1 580 |
05
10:00-11:20 Lecture
MGH 295 Final project presentations
11:30-12:30 OH (Frank)
Allen 3rd floor breakout
23:59 HW3 due
|
08 | 09 | 10
23:59 Final project writeup due
23:59 All assignments due (no grace period)
|
11 | 12 |