About the Course and Prerequisites
The traditional approach to machine learning uses a training set of labeled examples to learn a prediction rule that will predict the labels of future examples. Collecting such training sets can be expensive and time-consuming. This course explores methods that leverage already-collected data to guide future measurements, in a closed loop, to best serve the task at hand. We focus on two paradigms: (i) pure exploration, where we want algorithms that identify or learn a good-enough model using as few measurements as possible (e.g., classification, drug discovery, science), and (ii) regret minimization, where we want algorithms that balance taking measurements to learn a model with taking measurements to exploit that model for high-reward outcomes (e.g., content recommendation, medical treatment design, ad serving).
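To make the regret-minimization loop concrete, here is a minimal sketch of the classic UCB1 strategy for a stochastic multi-armed bandit (covered properly in Lecture 4 and [SzepesvariLattimore] Chapter 7). The arm means, horizon, and Gaussian reward noise below are illustrative assumptions, not taken from any assignment.

```python
import numpy as np

def ucb1_regret(means, T, seed=0):
    """Minimal UCB1 sketch: pull each arm once, then repeatedly pull the
    arm with the highest upper confidence bound. Rewards are Gaussian
    with unit variance (a 1-sub-Gaussian choice, for illustration)."""
    rng = np.random.default_rng(seed)
    K = len(means)
    counts = np.zeros(K)   # number of pulls of each arm
    sums = np.zeros(K)     # total reward observed per arm
    regret = 0.0           # cumulative (pseudo-)regret
    for t in range(1, T + 1):
        if t <= K:
            arm = t - 1    # initialization: pull each arm once
        else:
            ucb = sums / counts + np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(ucb))
        reward = rng.normal(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        regret += max(means) - means[arm]
    return regret

# Regret should grow roughly logarithmically in T for fixed gaps.
print(ucb1_regret([0.2, 0.5, 0.9], T=10_000))
```

A pure-exploration variant would reuse the same confidence bounds but stop once one arm's lower bound dominates the other arms' upper bounds, rather than tracking cumulative reward.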
The literature on adaptive methods for machine learning has exploded in the past few years and can be overwhelming.
This course will classify different adaptive machine learning problems by characteristics such as the hypothesis space, the available actions, the measurement model, and the available side information.
We will identify general adaptive strategies and cover common proof techniques.
List of topics:
- Stochastic Multi-armed Bandits
  - Regret minimization
  - Pure exploration
- Stochastic Linear Bandits
  - Regret minimization
  - Pure exploration
  - Experimental design
- Stochastic Nonparametric Bandits
  - Kernels, Gaussian processes
  - Lipschitz, convex
- Stochastic Contextual Bandits
  - Model-free and model-based
- Binary Classification
  - Disagreement-based methods
  - Greedy information gain methods
Prerequisites: The course will make frequent references to introductory concepts of machine learning (e.g., CSE 546), but that course is not a prerequisite. However, fluency in basic concepts from linear algebra, statistics, and calculus will be assumed (see HW0).
The course will be analysis-heavy, with a focus on methods that work well in practice.
You are strongly encouraged to complete the self-test of fundamental prerequisites on your own (not to be turned in or graded). You should be able to complete most of these problems in your head or with minimal computation.
Class materials
The course will pull from textbooks, papers, and course notes that will evolve with the course. The reading list will be updated as the course progresses.
Discussion Forum and Email Communication
We will use Ed as a discussion board (you should have received an invite if you are registered for the course; otherwise, email the instructor). We will not be using the Canvas discussion board. Ed is your first resource for questions. For private or confidential questions, email the instructor directly.
You may also send messages to the instructor through anonymous course feedback (though I cannot respond to you personally, so this channel is imperfect).
Grading and Evaluation
There will be three homeworks (each worth 20%) and one cumulative take-home final exam (worth 40%).
Submission guidelines
Each homework assignment will be submitted as a single PDF to Gradescope. Any code for a programming problem should come at the end of that problem, after any requested figures.
We expect all assignments to be typeset (i.e., no photos or scans of written work). This can be done in an editor like Microsoft Word or in LaTeX (highly recommended).
There exist convenient packages for listing Python code in LaTeX.
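For example, here is a minimal sketch using the standard `listings` package (`minted` is another common choice); the styling options shown are just one reasonable configuration.

```latex
\documentclass{article}
\usepackage{listings}  % standard package for typesetting code
\usepackage{xcolor}
\lstset{
  language=Python,
  basicstyle=\ttfamily\small,
  commentstyle=\color{gray},
  frame=single
}
\begin{document}
\begin{lstlisting}
import numpy as np
x = np.random.randn(100)
print(x.mean())
\end{lstlisting}
% Or pull in a whole file (hypothetical filename):
% \lstinputlisting{my_solution.py}
\end{document}
```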
- Regrades: If you feel that we have made an error in grading your homework, please submit a regrade request via Gradescope and we will consider your request. Please note that regrading may cause your grade to go up or down.
- Here is Gradescope help.
- You will automatically be enrolled in Gradescope.
Collaboration Policy
Homeworks must be done individually: each student must hand in their own answers. In addition, each student must write their own code for any programming part of an assignment. It is acceptable, however, for students to collaborate in figuring out answers and to help each other solve the problems. You must also indicate on each homework with whom you collaborated.
The homework problems have been carefully chosen for their pedagogical value and hence might be similar or identical to those given out in past offerings of this course at UW, or in similar courses at other schools. Using any pre-existing solutions from these sources, from the Web, or from other textbooks constitutes a violation of the academic integrity expected of you and is strictly prohibited.
Late Policy
If you need an extra 24 hours on any homework assignment, that's no big deal and you don't need permission. If you need multiple days due to personal reasons, please email the instructor. I will try to be accommodating, but please don't abuse this; I will evaluate requests as I receive them.
You will be given 24 hours for the take-home exam, which is expected to take no more than two hours to complete. It cannot be turned in late, so plan accordingly.
Regrading requests
All requests for regrading should be submitted through Gradescope directly. Office hours and in-person discussions are reserved for questions about the course material, not about grades. If you feel that we have made an error in grading your homework, please let us know with a written explanation, and we will consider the request. Please note that a regrade request may trigger regrading of the entire assignment, which may cause your grade on the whole homework to go up or down.
Regrade requests must be submitted within 7 days (168 hours) of the time grades are released.
Homeworks
- Homework 0: Self-assessment; not due, but we recommend you complete it within the first week. PDF
- Homework 1: Due 11:59 PM on January 22, 2021. PDF
- Homework 2: Due 11:59 PM on February 17, 2021. PDF
- Homework 3: Due 11:59 PM on March 12, 2021. PDF
Schedule
- Lecture 1: Jan. 5
- Welcome, logistics, overview of course topics
- Review the prerequisites and complete the "self-test" above on your own (not to be turned in)
- Introductions
- Regret minimization introduction
- Lecture notes PDF
- Lecture 2: Jan. 7
- Chernoff bound, sub-Gaussian random variables, elimination algorithms for multi-armed bandits
- Reading: [SzepesvariLattimore] Chapters 5 and 6
- Lecture notes PDF
- Lecture 3: Jan. 12
- Elimination algorithms for multi-armed bandits continued
- Reading: [SzepesvariLattimore] Chapter 6
- Lecture notes PDF
- Lecture 4: Jan. 14
- Finish elimination algorithms, Optimism and UCB
- Reading: [SzepesvariLattimore] Chapter 7
- Lecture notes PDF
- Lecture 5: Jan. 19
- Lower bounds
- Reading: [SzepesvariLattimore] Chapters 13-16
- Lecture notes PDF
- Lecture 6: Jan. 21
- Linear experimental design
- Reading: [SzepesvariLattimore] Chapter 21
- Lecture notes PDF
- Lecture 7: Jan. 26
- Linear bandits with finite arms. Regret bound for elimination algorithm
- Reading: [SzepesvariLattimore] Chapter 22
- Lecture notes PDF
- Lecture 8: Jan. 28
- Linear bandits with finite arms. Model misspecification and infinite arm sets
- Reading: [SzepesvariLattimore] Chapter 19
- Lecture notes PDF
- Lecture 9: Feb. 2
- Self-normalized bounds, Regret bound for UCB
- Reading: [SzepesvariLattimore] Chapter 20
- Lecture notes PDF
- Lecture 10: Feb. 4 (Cancelled; make-up lecture posted to Zoom)
- Martingales, method of mixtures
- Reading: [SzepesvariLattimore] Section 3.3, Chapter 20
- Lecture notes PDF
- Lecture 11: Feb. 9
- Finish method of mixtures
- Reading: [SzepesvariLattimore] Section 3.3, Chapter 20
- Lecture notes PDF
- Lecture 12: Feb. 11
- Optimal sequential testing
- Reading: See course notes
- Lecture notes PDF
- Lecture 13: Feb. 16
- Contextual bandits, policy evaluation
- Reading: [SzepesvariLattimore] Chapter 18
- Lecture notes PDF
- Lecture 14: Feb. 18
- Contextual bandits, linear value function approximation
- Reading: [SzepesvariLattimore] Chapter 19
- Lecture notes PDF
- Lecture 15: Feb. 23
- Contextual bandits for arbitrary policy class
- Reading: See course notes
- Lecture notes PDF
- Lecture 16: Feb. 25
- Computationally efficient contextual bandits for arbitrary policy class
- Reading: See course notes
- Lecture notes PDF
- Lecture 17: Mar. 2
- Introduction to active learning; Separable, pool-based setting; Halving algorithm
- Reading: See course notes
- Lecture notes PDF
- Lecture 18: Mar. 4
- Separable, pool-based and sampling-oracle setting; generalized binary search, CAL
- Reading: See course notes
- Lecture notes PDF
- Lecture 19: Mar. 9
- Sampling-oracle setting; CAL and splitting index
- Reading: See course notes
- Lecture notes PDF
- Lecture 20: Mar. 11
- Agnostic setting; Robust CAL and reduction to linear bandits
- Course review
- Reading: See course notes
- Lecture notes