Machine learning is the study and construction of algorithms that can learn from data, combining ideas from computer science and statistics. Learning from data plays an increasingly important role in numerous areas of science and technology.

This course is designed to provide a thorough grounding in the fundamental methodologies and algorithms of machine learning. Its topics draw from classical statistics, machine learning, data mining, Bayesian statistics, and optimization.

Prerequisites: Students entering the class should be comfortable with programming and should have a working knowledge of probability, statistics, and algorithms, though the class is designed to allow students with a strong mathematical background to catch up and fully participate.

IMPORTANT: All class announcements will be broadcast on the Canvas discussion board, as will discussion of homeworks, projects, and lectures. If you have a question about a personal matter, please email the instructors list: cse546-instructors@cs.washington.edu. Otherwise, please post all questions to the board: other students may have the same questions, and we need to be fair in how we interact with everyone. Also, please feel free to participate, answer each other's questions, etc.

- HW 1, 2, 4 (15% each)
- HW 3 (20%) - midterm
- Final project (35%)
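For concreteness, the weighting above corresponds to a final score computed as in the sketch below. This is an unofficial illustration only (the function name and the 0-100 score scale are assumptions; any rounding or curving is up to the instructors):

```python
def final_score(hw1, hw2, hw3, hw4, project):
    """Combine component scores (each assumed on a 0-100 scale) using the
    course weights: HW 1, 2, 4 count 15% each; HW 3 (the midterm) counts
    20%; the final project counts 35%."""
    return 0.15 * (hw1 + hw2 + hw4) + 0.20 * hw3 + 0.35 * project
```

For example, a student scoring 80, 90, 70, 100 on the homeworks and 85 on the project would receive 0.15·(80+90+100) + 0.20·70 + 0.35·85 = 84.25.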

Each homework assignment contains both theoretical questions and programming components.

COLLABORATION POLICY: Homework must be done individually: each student must hand in their own answers. In addition, each student must write and submit their own code in the programming part of the assignment (we may run your code). It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems (for HWs 1, 2, and 4). You must also indicate on each homework with whom you collaborated.

RE-GRADING POLICY: All grading-related requests must be submitted to the TA via email only. Office hours and in-person discussions are limited solely to knowledge-related questions, not grade-related questions. If you feel that we have made an error in grading your homework, please let us know with a written explanation, and we will consider the request. Please note that regrading may cause your grade on the entire homework to go up or down.

LATE POLICY: Homeworks must be submitted by the posted due date. You are allowed 2 LATE DAYS over the entire quarter, for homeworks only, so please plan accordingly. Any assignment turned in late will incur a 33% reduction in score for each day (or part thereof) that it is late: an assignment up to 24 hours late incurs a 33% penalty, an assignment up to 48 hours late incurs a 66% penalty, and anything later receives no credit. You must turn in all 4 homeworks, even if for zero credit, in order to pass the course. (Empty homeworks do not count.)
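The penalty arithmetic above can be sketched as follows. This is an unofficial illustration (the function name and 0-100 score scale are assumptions), and it deliberately ignores how your 2 free late days would offset the penalty:

```python
import math

def late_penalized(score, hours_late):
    """Apply the stated late policy: a 33% reduction per day (or part
    thereof) late, so 33% off for up to 24 hours, 66% off for up to
    48 hours, and no credit after that."""
    days_late = math.ceil(hours_late / 24) if hours_late > 0 else 0
    if days_late == 0:
        return score          # on time: no penalty
    if days_late > 2:
        return 0.0            # more than 48 hours late: no credit
    return score * (1 - 0.33 * days_late)
```

For example, a 90-point homework turned in 30 hours late counts as 2 days late and is reduced to 90 · (1 − 0.66) = 30.6 points.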

NO EXCEPTIONS WILL BE GIVEN TO THE GRADING POLICIES (unless based on university policies, e.g. medical reasons). IF YOU ARE NOT ABLE TO COMPLY WITH THE LATE HOMEWORK POLICY, DUE TO TRAVEL, CONFERENCES, OTHER DEADLINES, OR ANY OTHER REASON, DO NOT ENROLL IN THE COURSE.

HONOR CODE: We sometimes reuse problem set questions from previous years, which may be covered in papers and webpages, so we expect students not to copy, refer to, or look at those solutions in preparing their answers (referring to unauthorized material is considered a violation of the honor code). Similarly, we expect students not to google directly for answers. The homework is meant to help you think about the material, and we expect you to make an honest effort to solve the problems. If you do use other material, it must be acknowledged clearly with a citation in the submitted solution.

You are expected to complete a final project for the class. This will provide you with an opportunity to apply the machine learning concepts you have learned. We will update the project requirements and due dates during the quarter.

Recitations will occur only on some Wednesdays, depending on interest. The schedule will be posted on Canvas.

- Homework 2
- Due Oct 31st.
- homework pdf

- Homework 3
- Due Nov 18th.
- homework pdf

- Homework 4
- Due Dec 12th.
- homework pdf

- Lecture 1: [Sept 29] Intro and Regression
- Point estimation and MLE, Gaussians.
- Lectures: slides
- Required reading: Murphy Ch 1, 2.1-2.6, 2.8
- Extra good stuff:
- AlphaGo
- ConvNets
- Hoeffding bound
- Chernoff bound
- central limit theorem proof pdf

- Lecture 2: [Oct 4] Linear regression, bias-variance tradeoff.
- Lectures: slides annotated slides
- Required reading: Murphy 7.1-7.3, 6.4, 7.5.1, 7.6

- Lecture 3: [Oct 6] Overfitting, regularization, ridge regression, cross-validation.
- Lectures: slides annotated slides
- Required reading: Murphy 7.5, 6.1-6.5

- Lecture 4: [Oct 11] Variable selection, sparsity, LASSO.
- Lectures: slides annotated slides
- Required reading: Murphy 13.1-13.4

- Lecture 5: [Oct 13] Logistic Regression, Convexity, and Gradient Descent
- Lectures: slides annotated slides
- Required reading: Murphy 8.1 - 8.3

- Lecture 6: Gradient Descent
- Lectures: slides annotated slides
- Required reading: Murphy 8.5

- Lecture 7: Gradient Descent and Optimization
- Lecture 8: Online Learning: Stochastic Gradient Descent
- Lectures: notes
- Required reading: 14.5

- Lecture 9: The perceptron algorithm and online learning
- Lectures: notes
- Required reading: 14.1-14.4

- Lecture 10: Kernels and SVMs
- Lectures: slides annotated slides
- Required reading: 14.5

- Lecture 11: SVMs (continued); Generalization
- Logistic regression; convex optimization; generalization
- Lectures: slides annotated slides
- Required reading: 14.6

- Lecture 12: Review (SGD + practical considerations), dimensionality reduction, SVD, PCA, and k-Means
- Lectures: slides
- Required reading: 12.2, 12.5.3 (CCA)

- Lecture 13: PCA and k-Means
- Lectures: slides annotated slides
- Required reading: 11.1-11.3, 11.4

- Lecture 14: Mixture of Gaussians and EM
- Lectures: slides
- Required reading: 11.1-11.3, 11.4
- Optional:
- Clustering + PCA (seems reasonable in practice)
- Provable learning of spherical Gaussians. (the sample size is polynomial, though larger than is information theoretically needed)
- SDCA paper (a very nice optimization idea, connecting coordinate ascent with gradient descent)

- Lecture 15: The EM algorithm
- Lectures: slides
- Required reading: 11.1-11.3, 11.4
- Optional:
- EM has local minima (a nice paper showing that EM could realistically get stuck, even when: k=3; the model is correct; and you have lots of data)
- EM for learning a mixture of 2 Gaussians (Proving that EM works for a mixture of 2 Gaussians is tricky...)

- Lecture 16: Deep Learning, Neural Nets, and Backprop
- Lectures: slides
- Reading: Ch 5 of Pattern Recognition and Machine Learning, Chris Bishop.
- Optional:
- Backprop tutorial (there are lots of other tutorials)
- Deep Learning Book (Ideas from the Deep Learning Community)

- Lecture 17: Deep Learning (continued); Structured Networks; Best Practices
- Lectures: slides annotated slides
- Required reading: Convolutional Neural Networks
- Optional:
- First order dropout (Dropout and Ridge are pretty similar)
- Simpler convolutional methods (this works well for smaller datasets)

- Lecture 18: MusicNet Case Study; Reinforcement Learning, Policy Gradients
- Lectures: MusicNet case study slides
- Required reading:
- Algorithms in RL (read Chapters 1 and 2)
- RL wikipage
- wiki page on MDPs
- Optional:
- MusicNet is out!
- David Silver's notes
- RL in robotics (a nice survey)
- Policy search survey