Learning Objectives

By the end of the course, students will be able to:

  • Define the key features of RL that distinguish it from standard ML;
  • Identify the strengths and limitations of various reinforcement learning algorithms;
  • Formulate and solve sequential decision-making problems by applying relevant reinforcement learning tools;
  • Recognize the common ground connecting optimization and RL;
  • Generalize or discover “new” applications, algorithms, or theories of reinforcement learning, as a step toward conducting independent research on the topic.

Course Content

  • Week 1
    Introduction
    • Learning objectives and course logistics
    • An overview of RL
    • A primer on optimization
  • Week 2
    Dynamic Programming and Linear Programming
    • Markov Decision Processes (MDPs)
    • Bellman Equations and Bellman Optimality
    • Value/Policy Iteration (see the sketch after this outline)
    • Linear Programming
  • Week 3
    Value-based RL
    • From Planning to Reinforcement Learning
    • Model-free Prediction
    • Model-free Control (see the sketch after this outline)
    • Function Approximation
    • Convergence Analysis
  • Week 4
    Policy-based RL I (Algorithms)
    • Overview of Policy-based RL
    • Policy Gradient Estimation (see the sketch after this outline)
    • Policy Gradient Methods (PG)
    • Natural Policy Gradient (NPG)
    • Beyond PG: TRPO, PPO, etc.
  • Week 5
    Policy-based RL II (Theory)
    • Performance Difference Lemma (see the statement after this outline)
    • Global Convergence of Policy Gradient Methods
    • Global Convergence of Natural Policy Gradient Methods
    • Remarks on Sample Efficiency
  • Week 6
    Multi-agent RL and Markov Games
    • RL From Single Agent to Multiple Agents
    • Preliminaries: Normal Form Games and Repeated Games
    • Markov Games and Algorithms
    • Zero-Sum Markov Games and Algorithms
  • Week 7
    Imitation Learning
    • Offline Imitation Learning: Behavior Cloning (see the sketch after this outline)
    • Online Interactive Imitation Learning: DAgger, AggreVaTe
    • Inverse Reinforcement Learning: Feature Expectation Matching, Max-Ent IRL
    • Generative Adversarial Imitation Learning (GAIL)
  • Week 8
    Deep RL
    • Algorithms:
      • Actor-Critic Methods
      • Overview of Deep RL
      • Value-based Deep RL
      • Policy-based/Actor-Critic Deep RL
    • Theory:
      • From DL Theory to Deep RL Theory
      • Convergence Analysis of Neural TD-learning
      • Convergence Analysis of Neural Actor-Critic
  • Week 9
    Going Beyond: Model-based RL, Offline RL, Many-agent RL
    • Model-based RL
    • Offline RL
    • Many-agent RL
    • Summary and Outlook
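
To make a few of the topics above concrete, the sketches below give minimal, illustrative implementations; the environments, rewards, data, and hyperparameters in them are hypothetical placeholders, not part of the course materials.

Week 2 (Value/Policy Iteration): a value-iteration sketch on a made-up two-state MDP, repeatedly applying the Bellman optimality backup until the values reach an (approximate) fixed point.

    # Minimal value iteration on a hypothetical two-state, two-action MDP.
    # P, R, and gamma are illustrative assumptions.
    import numpy as np

    gamma = 0.9
    # P[s, a, s'] = transition probability; R[s, a] = expected reward.
    P = np.array([[[0.8, 0.2], [0.1, 0.9]],
                  [[0.5, 0.5], [0.3, 0.7]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])

    V = np.zeros(2)
    for _ in range(1000):
        # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
        Q = R + gamma * P @ V
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-8:  # stop near the fixed point
            break
        V = V_new

    print("V* ≈", V, "| greedy policy:", Q.argmax(axis=1))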
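
Week 3 (Model-free Control): a tabular Q-learning sketch on the same hypothetical MDP, replacing the known model with sampled transitions and an epsilon-greedy behavior policy.

    # Minimal tabular Q-learning; the environment and hyperparameters are
    # the same illustrative assumptions as in the sketch above.
    import numpy as np

    rng = np.random.default_rng(0)
    P = np.array([[[0.8, 0.2], [0.1, 0.9]],
                  [[0.5, 0.5], [0.3, 0.7]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    gamma, alpha, eps = 0.9, 0.1, 0.1

    Q = np.zeros((2, 2))
    s = 0
    for _ in range(20000):
        # epsilon-greedy action selection
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        s_next = rng.choice(2, p=P[s, a])  # sample a transition from the MDP
        # Q-learning update: bootstrap from the greedy value at s_next
        Q[s, a] += alpha * (R[s, a] + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

    print("Q ≈", Q.round(2), "| greedy policy:", Q.argmax(axis=1))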
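
Week 4 (Policy Gradient Estimation): a REINFORCE-style score-function estimator for a softmax policy on a made-up three-armed bandit, using the identity grad log pi(a) = e_a − pi for the softmax parameterization.

    # Minimal REINFORCE sketch on a hypothetical 3-armed bandit;
    # mean rewards and step size are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    mean_rewards = np.array([1.0, 0.5, -0.5])  # hypothetical arm means
    theta = np.zeros(3)                        # softmax logits

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    for _ in range(2000):
        pi = softmax(theta)
        a = rng.choice(3, p=pi)
        r = mean_rewards[a] + rng.normal(scale=0.1)  # noisy reward sample
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0            # grad log pi(a) = e_a - pi
        theta += 0.05 * r * grad_log_pi  # stochastic gradient ascent step

    print("learned policy:", softmax(theta).round(3))  # should favor arm 0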
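
Week 5 (Performance Difference Lemma): the version of the lemma usually covered (e.g., in [AJK20]) states that, for any two policies π and π′ in a γ-discounted MDP,

    V^π(s₀) − V^{π′}(s₀) = (1 / (1 − γ)) · E_{s ~ d^π_{s₀}, a ~ π(·|s)} [ A^{π′}(s, a) ],

where d^π_{s₀} is the normalized discounted state-visitation distribution of π started from s₀ and A^{π′} is the advantage function of π′. This identity is the workhorse behind the global-convergence results for PG and NPG covered that week.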
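
Week 7 (Behavior Cloning): offline imitation learning reduced to supervised learning, fitting a policy to expert state-action pairs. The "expert" below is a hypothetical threshold rule on a 1-D state; the logistic policy and data are illustrative only.

    # Minimal behavior-cloning sketch: logistic policy fit to fake expert data.
    import numpy as np

    rng = np.random.default_rng(1)
    states = rng.uniform(-1, 1, size=500)          # expert-visited states
    expert_actions = (states > 0).astype(float)    # hypothetical expert rule

    w, b = 0.0, 0.0  # policy: pi(a=1|s) = sigmoid(w*s + b)
    for _ in range(500):
        p = 1.0 / (1.0 + np.exp(-(w * states + b)))
        err = p - expert_actions     # gradient of the cross-entropy loss
        w -= 0.5 * np.mean(err * states)
        b -= 0.5 * np.mean(err)

    acc = np.mean((p > 0.5) == (expert_actions > 0.5))
    print(f"cloned-policy training accuracy: {acc:.2f}")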

Recommended References

There is no required textbook. Lectures and class discussions are based primarily on classic and recent papers on the topic.

RL textbooks:

  • [S09] Algorithms for Reinforcement Learning, by Csaba Szepesvári, 2009.
  • [SB18] Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto, 2018.
  • [B19] Reinforcement Learning and Optimal Control, by Dimitri P. Bertsekas, 2019.
  • [AJK20] Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, and Sham M. Kakade, 2020.
  • [M21] Control Systems and Reinforcement Learning, by Sean Meyn, Cambridge University Press, 2021.
  • [KWM22] Algorithms for Decision Making, by Mykel J. Kochenderfer, Tim A. Wheeler, and Kyle H. Wray, MIT Press, 2022.

Optimization foundations:

ML/AI foundations:

Conferences and Workshop Proceedings

Useful Links