About the Course
One of the questions I get most often from students is how to get started in research as an undergraduate. This course is designed to be a first step towards in-depth understanding and rigorous analyses in both theoretical and empirical machine learning.
This course will cover advanced machine learning, from VC dimension to Generative AI. It is divided into two parts: theoretical and empirical. In the first part, we will cover topics such as VC dimension, Rademacher complexity, ERM, generalization bounds, and the basics of optimization. In the second, we will cover the components and development of advanced ML systems.
Prerequisites: Students entering the class should be comfortable with programming and should have a working knowledge of linear algebra (MATH 308), vector calculus (MATH 126), probability and statistics (CSE 312 / STAT 390), and algorithms. Knowledge of machine learning at the level of CSE 446 is highly recommended.
A past offering of this course by Ludwig Schmidt: link
Useful resources: Understanding Machine Learning by Shai Shalev-Shwartz and Shai Ben-David -- free PDF
Staff: See the Staff Info page.
Lectures
Lecture time and place: Tuesdays and Thursdays, 10:00 -- 11:20am, CSE2 G10. (The topics below are tentative.)
- Lecture 1 (4/1): Introduction
- Lecture 2 (4/3): PAC learning and ERM (lecture notes)
- Understanding Machine Learning, Chapters 2 and 3
- Kevin Jamieson's amazing class on interactive learning covers similar proofs, but with slides
- Lecture 3 (4/8): Uniform convergence, agnostic PAC learnability, Hoeffding's inequality (lecture notes)
- Understanding Machine Learning, Chapter 4
- Lecture 4 (4/10): Concentration of measure (lecture notes)
- Understanding Machine Learning, Appendix B
- Lecture 5 (4/15): No-free-lunch theorem, bias-complexity trade-off, and VC dimension (lecture notes)
- Understanding Machine Learning, Chapters 5, 6
- Lecture 6 (4/17): VC dimension (lecture notes)
- Understanding Machine Learning, Chapter 6
- Lecture 7 (4/22): VC dimension of linear predictors and perceptron algorithm (lecture notes)
- Understanding Machine Learning, Chapter 9
- Lecture 8 (4/24): Convexity, regularization, and stability analysis
- Understanding Machine Learning, Chapters 12 and 13
- Lecture 9 (4/29): Rademacher complexity and generalization bounds
- Understanding Machine Learning, Chapter 26
- Lecture 10 (5/1):
- Understanding Machine Learning, Chapter 26
- Lecture 11 (5/6): Introduction to language modelling
- Lecture 12 (5/8):
- Lecture 13 (5/13):
- Lecture 14 (5/15):
- Lecture 15 (5/20):
- Lecture 16 (5/22):
- Lecture 17 (5/27):
- Lecture 18 (5/29):
- Lecture 19 (6/3):
- Lecture 20 (6/5): Final project presentation and closing remarks
Office Hours
- Sewoong's Office Hours: Wednesdays 1:00-2:00, CSE2 207
- Project Office Hours: Thursdays 2:00-3:00, CSE2 276
- Theory part Office Hours (for both lectures and Homework 1): Tuesdays 4:00-5:00, CSE2 376 (from April 15th to April 29th)
- Additional OHs near the due date: Tuesday, April 29 will be 3:00-5:00 with two TAs
- Empirical part Office Hours (for both lectures and Homework 2): Tuesdays, time TBD (from May 6th to May 27th)
- Additional OHs near the due date: TBD
Assignments
We expect all assignments to be typeset (i.e., no photos or scans of handwritten work) and submitted on Gradescope (link).
Homework 1 must be typeset in LaTeX; Homework 2 may be typeset with any editor, such as Microsoft Word or LaTeX.
- Homework 1: Due on Thursday, May 1st at 11:59pm (PDF, source).
- Homework 2: Due on Thursday, May 29th at 11:59pm (PDF).
Only one person per group should submit.
Reading Assignments (CSE 599 S only)
CSE 599 S students will have an extra reading assignment worth 10% of the total grade. Instructions for this can be found
here.
- Reading 1: Due on Thursday, May 8th at 11:59pm. Instructions here.
- Reading 2: Due on Thursday, May 15th at 11:59pm. Instructions here.
- Reading 3: Due on Thursday, May 22nd at 11:59pm. Instructions here.
- Reading 4: Due on Thursday, May 29th at 11:59pm. Instructions here.
Projects
The project will be a replication of published research, original empirical research, or a summarization of a line of theoretical work (and a potential extension). There are three milestones: (1) a proposal describing what you will work on, (2) version 1, which checks that you are on track to finish the project in time, and (3) the final version, which includes the full report.
- Resources:
- Deadlines:
- Proposal: Thursday, April 24th at 11:59 PM (submit on Gradescope)
- Version 1: Thursday, May 15th at 11:59 PM (submit on Gradescope)
- Final version: The final project presentation is on Thursday, June 5th during class. The final report is due on Friday, June 6th at 11:59 PM (submit on Gradescope).
- Grading for the project is distributed as follows: 10% for the proposal, 25% for version 1, and 65% for the final version. The project is 50% of the total course grade.
Grading
For students enrolled in CSE 493S, your grade will be determined by:
- 25% homework 1
- 25% homework 2
- 50% final project
For students enrolled in CSE 599, your grade will be determined by:
- 20% homework 1
- 20% homework 2
- 10% reading assignment
- 50% final project
Where to get help
- EdStem discussion board:
- Public/Anonymous Posts
- Questions like, "Is there a typo in the homework?", "What does this notation mean?", "Is this an accurate description of how this works?".
- Questions that are not of a personal nature should be posted to the discussion board.
- Private Posts
- Questions involving your own code should be posted privately to the EdStem discussion board, not office hours.
- Personal concerns (like "I was in the hospital", "Laptop was stolen").
- Course staff email: cse493s-staff@cs.washington.edu
- For personal concerns (like "I was in the hospital", "My laptop was stolen") that you aren't comfortable sharing with the entire staff. If you are comfortable, we highly recommend posting privately on EdStem instead.
- Please direct all course-related inquiries to cse493s-staff@cs.washington.edu or EdStem. Please do not email the instructors or TAs individually.
- Submit anonymous feedback here.