
This schedule is tentative and subject to change.

January 4 Introduction: Why is Deep Learning a good tool for Robotics? [slides] [recording]
※ Intelligence without representation
The free-energy principle: a unified brain theory?
Computing Machinery and Intelligence
 Optional Readings
   On the measure of intelligence
From Socrates to expert systems: The limits of calculative rationality
A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
Reinforcement Learning in the brain
Does intelligence require a body
January 9 Reinforcement Learning - Policy Gradient [slides] [recording]
※ What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
Approximately Optimal Approximate RL
Mirage of Action Dependent Baselines
Scalable Trust-Region Method for Deep Reinforcement Learning using Kronecker-factored Approximation
 Optional Readings
   Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Sample Efficient Actor Critic with Experience Replay
Implementation Matters in Deep RL: A Case Study on PPO and TRPO
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
January 11 Student-led Discussion 1
※ A Closer Look at Deep Policy Gradients
Backpropagation Through the Void
January 16 Holiday: Martin Luther King Jr. Day
January 18 Reinforcement Learning - Off-policy Methods [slides] [recording]
※ QT-Opt
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Soft Actor Critic
 Optional Readings
   TD3: Addressing Function Approximation Error in Actor-Critic Methods
Deep Reinforcement Learning with Double Q-learning
MPO: Maximum a Posteriori Policy Optimisation
REDQ: Randomized Ensembled Double Q-Learning: Learning Fast Without a Model
A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning
Continuous Deep Q-Learning with Model-based Acceleration
January 23 Model-based Reinforcement Learning [slides]
※ Information Theoretic MPC for Model-Based Reinforcement Learning
Generative Temporal Difference Learning for Infinite-Horizon Prediction
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Blending MPC and Value Function Approximation for Efficient RL
 Optional Readings
Successor features
Guided Policy Search
 Optional Readings (Application)
Iterative Residual Policy for Goal-Conditioned Dynamic Manipulation of Deformable Objects
Deep Dynamics Models for Learning Dexterous Manipulation
January 25 Student-led Discussion 2
※ Diagnosing Bottlenecks in Deep Q-learning Algorithms
When to Trust Your Model: Model-Based Policy Optimization
January 30 Imitation Learning [slides] Project Proposal
※ Feedback in Imitation Learning: The Three Regimes of Covariate Shift
Towards the Fundamental Limits of Imitation Learning
Discriminator Actor Critic
 Optional Readings
   An Invitation to Imitation Learning
An Algorithmic Perspective on Imitation Learning
Imitation Learning as F-Divergence Minimization
Of Moments and Matching
Provably Efficient Imitation Learning from Observations Alone
DART: Noise Injection for Robust Imitation Learning
Zero Shot Visual Imitation
February 1 Inverse Reinforcement Learning [slides]
※ Adversarial Inverse Reinforcement Learning
Bayesian IRL (Ramachandran & Amir)
A Connection Between Max Entropy IRL and GAN
 Optional Readings
   Guided Cost Learning
Deep Imitative Models
Max Margin Planning
Max Entropy Deep IRL
February 6 Inverse RL and other forms of supervision [slides]
February 8 Student-led Discussion 3
※ Causal Confusion in Imitation Learning
Cooperative Inverse Reinforcement Learning
February 13 Learning from Prior Data and Offline Reinforcement Learning [slides]
※ Conservative Q-Learning
Learning Latent Plans from Play
Decision Transformer
Implicit Q-Learning
 Optional Readings
   BEAR: Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
BRAC: Behavior Regularized Offline Reinforcement Learning
GenDICE: Generalized Offline Estimation of Stationary Values
Doubly Robust Off-Policy Value Estimation
Trajectory Transformer: Offline Reinforcement Learning as One Big Sequence Modeling Problem
BCQ: Off-Policy Deep Reinforcement Learning without Exploration
TD3 + BC
MOReL: Model-Based Offline Reinforcement Learning
February 15 Multi-task and Meta Learning [slides] Project Milestone
※ RL²: Fast Reinforcement Learning via Slow Reinforcement Learning
Model-Agnostic Meta-Learning
Gradient Surgery for Multi-Task RL
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale
 Optional Readings
   Human-Timescale Adaptation in an Open-Ended Task Space
MELD: Meta-Reinforcement Learning from Images via Latent State Models
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
DREAM: Decoupling Exploration and Exploitation in Meta-Reinforcement Learning without Sacrifices
VIMA: General Robot Manipulation with Multimodal Prompts
February 20 Holiday: President's Day
February 22 Student-led Discussion 4
※ Implicit Q-Learning
Ray Interference: A Source of Plateaus in Deep RL
February 27 Simulator and Domain Transfer [slides]
March 1 Guest Lecture: Deep Learning in Robot Perception by Pete Florence
March 6 Frontiers and Perspectives
March 8 Student-led Discussion 5
※ VariBAD
EPOpt: Learning Robust Neural Network Policies Using Model Ensembles
March 13 Final Project Presentation Project Report