Please communicate with the instructors and TA only through this account. Course-related emails not sent to this list will not receive a timely response.
Please send all questions about homeworks, lectures, and policies to the Piazza discussion board.
Please monitor announcements on Piazza. It is important that you receive these announcements in a timely manner.
Krishna Pillutla: Wednesdays, 3-4 pm, CSE2 152
This course focuses on several theoretical foundations of sequential decision making. The course is concerned with the general problem of reinforcement learning and sequential decision making, going from algorithms for small-state Markov decision processes to methods that handle large state spaces. We also cover sequential decision making in the multi-armed bandit framework and proceed to the more general contextual bandit problem. We focus on the main algorithms, performance guarantees, lower bounds, and applications.
Prerequisites: Students entering the class should have a commanding grasp of machine learning, probability and statistics, optimization, and linear algebra. Students who are weak in these areas should take the course at a later date.
Grades will be based on three assignments (50%) and a course project (50%). ALL HOMEWORK MUST BE SUBMITTED, EVEN IF IT IS FOR 0 CREDIT, IN ORDER TO PASS THE CLASS. (Empty homeworks do not count.) There will be a poster session on Thursday, June 6th from 10 am to noon. YOU MUST BE PRESENT AT THE POSTER SESSION TO PASS THE CLASS.
Homework must be done individually: each student must hand in their own answers. It is acceptable for students to discuss problems with each other; it is not acceptable for students to look at another student's written answers. You must also indicate on each homework with whom you collaborated.
HW LATE POLICY: Homeworks must be submitted by the posted due date. You are allowed up to 4 total LATE DAYs for the homeworks throughout the entire quarter and up to 2 late days PER HOMEWORK assignment; these will be automatically deducted if your assignment is late. For example, each 24-hour period (or part of one) by which an assignment is late uses one late day (up to two late days per assignment). After your late days are used up, late penalties will be applied: any assignment turned in late will incur a reduction in score of 33% for each late day. If an assignment is up to 24 hours late, it incurs a penalty of 33%; if it is up to 48 hours late, it incurs a penalty of 66%; any later, it will receive no credit. We will track all your late days, and any deductions will be applied in computing the final grades. If you are unable to turn in HWs on time, aside from permitted late days, then do not enroll in the course.
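As an unofficial illustration, the late policy above can be sketched in a few lines of Python. The function name `apply_late_policy` and its arguments are hypothetical and are not part of any course tooling. The sketch also assumes one particular reading of the policy: free late days are consumed first (at most 2 per homework, limited by what remains of the 4-day quarter budget), and the 33% per-day penalty applies only to the days not covered by late days. The course staff's actual accounting may differ.

```python
import math

# Values taken from the stated policy.
MAX_LATE_DAYS_TOTAL = 4     # late-day budget for the whole quarter
MAX_LATE_DAYS_PER_HW = 2    # at most this many late days on one homework
PENALTY_PER_DAY = 33        # percent deducted per uncovered late day


def apply_late_policy(hours_late, late_days_remaining):
    """Return (late_days_used, penalty_percent) for a single homework.

    ASSUMPTION: late days are used before any penalty is charged, and the
    penalty counts only the days that late days do not cover.
    """
    if hours_late <= 0:
        return 0, 0

    # Any part of a 24-hour period counts as a full late day.
    days_late = math.ceil(hours_late / 24)

    late_days_used = min(days_late, MAX_LATE_DAYS_PER_HW, late_days_remaining)
    uncovered_days = days_late - late_days_used

    if uncovered_days == 0:
        penalty = 0
    elif uncovered_days <= 2:
        penalty = PENALTY_PER_DAY * uncovered_days   # 33% or 66%
    else:
        penalty = 100                                # no credit
    return late_days_used, penalty


# Example: track the budget across three homeworks (hours late per homework).
remaining = MAX_LATE_DAYS_TOTAL
for hours in (10, 30, 30):
    used, penalty = apply_late_policy(hours, remaining)
    remaining -= used
    print(f"{hours}h late: used {used} late day(s), penalty {penalty}%")
```

Under these assumptions, the first homework uses one late day, the second uses two, and the third uses the last remaining late day and takes a 33% penalty for the uncovered day.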
All re-grading requests (for the homework and the midterm) must be submitted (on Gradescope) within seven days after any grades are released. This policy is to ensure that we can address any concerns in a timely and fair manner. Office hours and in-person discussions are limited to knowledge-related questions.
The course will follow lecture notes. Helpful books are:
Here are some related courses, with relevant material available online:
The instructors expect (and believe) that each student will conduct himself or herself with academic (and personal) integrity. While the TA will follow the course and university policies with regard to grading (see CSE conduct policy), it is ultimately up to you to conduct yourself with academic and personal integrity for a number of reasons that go beyond the scope of just this class.
While many academic disciplines have historically been dominated by one cross section of society, the study of and participation in STEM disciplines is a joy that the instructor hopes that everyone can pursue, regardless of their socio-economic background, race, gender, etc. The instructor encourages students to both be mindful of these issues, and, in good faith, try to take steps to fix them. You are the next generation here.