Instructor: Abhishek Gupta (abhgupta at cs)
TAs:
Jacob Berg (jacob33 at cs)
Lecture: Gates 271: (Monday, Wednesday) 11:30-12:50
Office Hours:
Abhishek (Gates 215): Fri 4-5pm, Tue 12-1pm
Office Hours (Gates center 153):
Jacob: Wednesday, CSE 2 274, 2:00-3:00 PM, in-person/Zoom
Ed discussion board (link): All questions that are not of a personal nature should be posted to the discussion board.
Submit anonymous feedback here.
In this course, we study the science of reinforcement learning: algorithms for learning how to act in sequential decision-making problems through trial-and-error experience in an environment. These methods have shown promise in domains including video game AI, graphics, robotics, and even large language models. Reinforcement learning methods constitute an entirely different range of algorithmic techniques from the typical supervised or unsupervised machine learning paradigms, and require a new set of tools to study properly. In this class, we will start from first principles to derive and describe a variety of reinforcement learning algorithms, and show how they become a useful, practical tool when combined with rich function approximation. We plan to cover a wide range of methods: policy gradients, off-policy reinforcement learning, model-based reinforcement learning, imitation learning and inverse reinforcement learning, offline reinforcement learning, and multi-task learning.
The expected course outcomes are:
This course is largely directed towards Ph.D. students with advanced knowledge of linear algebra, probability, and optimization, although undergraduate and master's students with the requisite background are welcome as well.
Reinforcement Learning: An Introduction, Sutton and Barto
Conduct an open-ended research project on any topic related to reinforcement learning or its applications. Projects will be completed in groups of 1 to 2. More details soon!
Tentatively, the grading is as follows:
40% final project
10% paper discussions/readings
45% for HWs – 15% for each of 3 HWs
5% participation
Linear Algebra, Multivariate Calculus, Probability Theory, Optimization, Machine Learning
Students of all backgrounds and experiences are welcome in this class. You are entitled to be treated respectfully by your classmates and the course staff.
If at any time you are made to feel uncomfortable, disrespected, or excluded, please contact the instructors or a TA to report the incident. If you feel uncomfortable bringing up an issue with the course staff directly, you may also consider sending anonymous course feedback or meeting with the CSE academic advisors or the UW Office of the Ombud.
Programming projects are to be completed individually. Each student should write their own writeup and code.
We encourage you to discuss all course activities with your friends and classmates as you work through them. Feel free to talk through struggles with your peers as long as you follow the academic misconduct warnings that have been relayed in every course you’ve taken thus far. It’s okay to look at online resources as long as sources are cited and code isn’t copied.
Here’s a reference in case you need a refresher.
It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law. If you have already established accommodations with Disability Resources for Students (DRS), please activate your accommodations via myDRS so we can discuss how they will be implemented in this course. If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations, contact DRS directly to set up an Access Plan.
Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy. Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form.
I reserve the right to modify any of these plans as need be during the course of the class; however, I won’t do anything capriciously, anything I do change won’t be too drastic, and you’ll be informed as far in advance as possible.