Schedule

This schedule is tentative and subject to change.

Date | Topic | Due
January 4 Introduction: Why is Deep Learning a good tool for Robotics? [slides] [recording]
※ Intelligence without representation
※ The free-energy principle: a unified brain theory?
※ Computing Machinery and Intelligence
 Optional Readings
   On the measure of intelligence
   From Socrates to expert systems: The limits of calculative rationality
   A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
   Reinforcement Learning in the brain
   Does intelligence require a body?
January 9 Reinforcement Learning - Policy Gradient [slides] [recording]
※ What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
※ Approximately Optimal Approximate RL
※ Mirage of Action Dependent Baselines
※ Scalable Trust-Region Method for Deep Reinforcement Learning using Kronecker-factored Approximation
 Optional Readings
   Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
   Sample Efficient Actor Critic with Experience Replay
   Implementation Matters in Deep RL: A Case Study on PPO and TRPO
   Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
   Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
January 11 Student-led Discussion 1
※ A Closer Look at Deep Policy Gradients
※ Backpropagation Through the Void
January 16 Holiday: Martin Luther King Jr. Day
January 18 Reinforcement Learning - Off-policy Methods [slides] [recording]
※ QT-Opt
※ Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
※ DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
※ Soft Actor Critic
 Optional Readings
   TD3: Addressing Function Approximation Error in Actor-Critic Methods
   Deep Reinforcement Learning with Double Q-learning
   MPO: Maximum a Posteriori Policy Optimisation
   REDQ: Randomized Ensembled Double Q-Learning: Learning Fast Without a Model
   A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning
   Continuous Deep Q-Learning with Model-based Acceleration
January 23 Model-based Reinforcement Learning [slides]
※ Information Theoretic MPC for Model-Based Reinforcement Learning
※ Generative Temporal Difference Learning for Infinite-Horizon Prediction
※ Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
※ Blending MPC and Value Function Approximation for Efficient RL
 Optional Readings
   PILCO
   Dreamer
   TD-models
   Successor features
   STEVE
   Guided Policy Search
 Optional Readings (Application)
   AlphaGo
   Iterative Residual Policy for Goal-Conditioned Dynamic Manipulation of Deformable Objects
   Deep Dynamics Models for Learning Dexterous Manipulation
   DayDreamer
January 25 Student-led Discussion 2
※ Diagnosing Bottlenecks in Deep Q-learning Algorithms
※ When to Trust Your Model: Model-Based Policy Optimization
January 30 Imitation Learning (due: Project Proposal)
February 1 Reward Inference and Specification
February 6 Student-led Discussion 3
February 8 Learning from Prior Data and Offline Reinforcement Learning
February 13 Student-led Discussion 4
February 15 Multi-task and Meta Learning (due: Project Milestone)
February 20 Holiday: Presidents' Day
February 22 Simulator and Domain Transfer
February 27 Student-led Discussion 5
March 1 Deep Learning for Perception
March 6 Frontiers and Perspectives
March 8 Student-led Discussion 6
March 13 Final Project Presentation (due: Project Report)