Machine learning is the study and construction of algorithms that can learn from data. It combines ideas from computer science and statistics, and learning from data is playing an increasingly important role in numerous areas of science and technology.

This course is designed to provide a thorough grounding in the fundamental methodologies, statistics, mathematics, and algorithms of machine learning. The topics draw from classical statistics, machine learning, data mining, Bayesian statistics, and statistical algorithms.

Prerequisites: Students entering the class should have a working knowledge of probability, statistics, and algorithms, though the class has been designed to allow students with a mathematical background to catch up and fully participate.

IMPORTANT: All class announcements will be broadcast on the Catalyst discussion board, as will answers to questions about homeworks, projects, and lectures. If you have a question about a personal matter, please email the instructors list: cse546-instructors@cs.washington.edu. Otherwise, please post all questions to the board, since other students may have the same questions and we want to be fair in how we interact with everyone. Also, please feel free to participate, answer each other's questions, etc.

Important Note: We sometimes reuse problem set questions from previous years, whose solutions may be available in papers or on webpages. We expect students not to copy, refer to, or look at these solutions when preparing their answers; referring to unauthorized material is considered a violation of the honor code. Similarly, we expect you not to search the web directly for answers. The homework is meant to help you think about the material, and we expect you to make an honest effort to solve the problems. If you do use other material, it must be acknowledged clearly with a citation in the submitted solution.

Homeworks will be done individually: each student must hand in their own answers, and each student must write their own code for the programming part of the assignment. It is acceptable, however, for students to collaborate in figuring out answers and to help each other solve the problems. You must also indicate on each homework with whom you collaborated.

If you feel that we have made an error in grading your homework, please return it with a written explanation, and we will consider your request. Please note that regrading may cause your grade to go up or down.

You are expected to complete a final project for the class. This will provide you with an opportunity to apply the machine learning concepts you have learned. We will update the project requirements and due dates during the quarter.

Recitations will be held only on some Wednesdays, depending on interest. The schedule is below:

- Oct 7: Python tutorial
- Oct 28: HW review

- Lecture 1: Introduction
- Lecture 2: Linear Algebra Review and Regression
- Least squares
- SVD and the pseudo-inverse
- lecture notes pdf
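
As a quick illustration of the least-squares / pseudo-inverse connection from this lecture (a minimal NumPy sketch, not course code — the variable names are mine):

```python
import numpy as np

# Least squares via the SVD-based pseudo-inverse: w_hat = X^+ y
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))      # 50 examples, 3 features
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                        # noiseless targets

w_hat = np.linalg.pinv(X) @ y         # pinv computes X^+ from the SVD
print(np.allclose(w_hat, w_true))     # True: exact recovery in the noiseless case
```

The same solution comes from `np.linalg.lstsq(X, y, rcond=None)`; the pseudo-inverse also handles rank-deficient `X`, returning the minimum-norm solution.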

- Lecture 3: Bias-Variance Tradeoff (& Optimization Implications)
- Lecture 4: Decision Trees
- Lecture 5: Feature selection 1
- Lecture 6: Feature selection 2
- Theory (the orthogonal case)
- Theory (RIP and the near-orthogonal case)
- lecture notes pdf

- Lecture 7: Feature construction
- boosting
- more tricks: kernels
- even more tricks: random features
- lecture notes pdf
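
As a small illustration of the random-features idea (a sketch assuming the RBF kernel exp(-gamma * ||x - y||^2); the function name `rff` and all parameters are illustrative, not from the course):

```python
import numpy as np

def rff(X, D=5000, gamma=1.0, seed=0):
    """Random Fourier features: z(x) @ z(y) ~= exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))  # samples from the kernel's spectral density
    b = rng.uniform(0, 2 * np.pi, D)                       # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

x = np.array([[0.0, 0.0], [0.3, 0.4]])
Z = rff(x)
approx = Z[0] @ Z[1]
exact = np.exp(-0.25)        # ||x0 - x1||^2 = 0.09 + 0.16 = 0.25
print(abs(approx - exact))   # small, and shrinks as D grows
```

The point is that an explicit D-dimensional feature map approximates the kernel, so a linear method on z(x) behaves like a kernel method without forming the n-by-n Gram matrix.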

- Lecture 8: Loss functions
- Binary classification
- Convexity
- Gradient descent
- lecture notes pdf
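
A minimal gradient-descent sketch on a convex loss (average squared error; the step size and names are illustrative assumptions, not taken from the lecture notes):

```python
import numpy as np

def grad_descent(X, y, lr=0.1, steps=500):
    """Minimize f(w) = ||Xw - y||^2 / (2n) by full-batch gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n   # gradient of the average squared error
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 2))
y = X @ np.array([3.0, -1.0])
print(grad_descent(X, y))  # converges toward [3, -1]
```

Because the loss is convex (here even quadratic), a small enough constant step size drives the iterates to the global minimizer.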

- Lecture 9: Gradient descent and Stochastic Gradient Descent
- optimization issues
- stochastic gradient descent
- lecture notes pdf

- Lecture 10: Stochastic Gradient Descent
- Stochastic Gradient Descent
- non-smooth optimization
- lecture notes pdf
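
A stochastic-gradient-descent sketch for comparison with the full-batch version (squared error on one random example per update; step size and names are illustrative assumptions):

```python
import numpy as np

def sgd(X, y, lr=0.01, epochs=50, seed=0):
    """SGD for squared error: one randomly ordered example per update."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):           # reshuffle each epoch
            grad = (X[i] @ w - y[i]) * X[i]    # gradient on a single example
            w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 2))
y = X @ np.array([3.0, -1.0])
print(sgd(X, y))  # noise-free data, so SGD converges toward [3, -1]
```

Each update costs O(d) instead of O(nd); with noisy data, a decaying step size is needed to average out the gradient noise.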

- Midterm (in class)
- Lecture 11: SGD and generalization
- Stochastic Gradient Descent
- generalization
- lecture notes pdf

- Lecture 12: binary classification
- the perceptron algorithm
- SVMs
- lecture notes pdf
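
A sketch of the perceptron algorithm on linearly separable data (the data generation and margin threshold are my own illustrative choices):

```python
import numpy as np

def perceptron(X, y, max_epochs=500):
    """The perceptron: add y_i * x_i whenever example i is misclassified."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:   # wrong side of the boundary (or on it)
                w += yi * xi
                mistakes += 1
        if mistakes == 0:            # separated all points: stop early
            break
    return w

# Linearly separable data: keep only points with a margin around the true boundary
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 2))
scores = X @ np.array([2.0, 1.0])
keep = np.abs(scores) > 0.5
X, y = X[keep], np.sign(scores[keep])

w = perceptron(X, y)
print(np.all(np.sign(X @ w) == y))   # True once the algorithm has converged
```

The classical mistake bound says the total number of updates is at most (R / gamma)^2, where R bounds the example norms and gamma is the margin, which is why convergence here is fast.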

- Lecture 13: Dimensionality reduction
- Lecture 14: PCA and Clustering
- k-means
- PCA and learning and clustering
- learning with missing data
- Extra reading Bishop Ch 9
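
A minimal sketch of Lloyd's algorithm for k-means (random initialization and the empty-cluster handling are my own illustrative choices):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random init
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)            # assignment step
        for j in range(k):
            if np.any(labels == j):              # keep old center if cluster empty
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated blobs: k-means recovers the grouping
rng = np.random.default_rng(3)
A = rng.standard_normal((30, 2))
B = rng.standard_normal((30, 2)) + 10.0
centers, labels = kmeans(np.vstack([A, B]), k=2)
```

Each step decreases the within-cluster sum of squares, so the algorithm always converges, though only to a local optimum that depends on the initialization.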

- Lecture 15: Expectation maximization
- Gaussian mixture models
- learning with missing data
- Extra reading Bishop Ch 9
- Wikipedia page on the EM algorithm

- Thanksgiving (no class)
- Lecture 16: Sequence modeling
- extra reading: Bishop Ch 13
- Hidden Markov models
- Conditional random fields
- Structured prediction

- Lecture 17: Deep Learning 1
- neural nets
- back prop
- extra reading: Bishop Ch 5

- Lecture 18: Learning theory
- Concentration and the union bound
- lecture notes pdf
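
A quick simulation of the kind of concentration bound covered here (Hoeffding's inequality for a fair coin; purely illustrative, not course code):

```python
import numpy as np

# Hoeffding: P(|mean_n - p| >= t) <= 2 * exp(-2 n t^2) for [0, 1]-valued draws
rng = np.random.default_rng(0)
n, t, trials = 100, 0.1, 10_000
flips = (rng.random((trials, n)) < 0.5).astype(float)  # fair coin, p = 0.5
dev = np.abs(flips.mean(axis=1) - 0.5)
freq = (dev >= t).mean()                 # empirical tail frequency
bound = 2 * np.exp(-2 * n * t ** 2)      # Hoeffding bound ~= 0.271
print(freq <= bound)                     # True: the bound holds
```

The empirical frequency is well below the bound, as expected: Hoeffding is distribution-free and therefore loose for any particular distribution.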

- Lecture 19: RNNs and LSTMs
- guest lecture: Antoine Bosselut