Policy gradient methods are a class of reinforcement learning algorithms that optimize a parametric policy by maximizing an objective function that directly measures the policy's performance. Although these methods appear in many high-profile applications of reinforcement learning, the update rules they use in practice are not well understood. Furthermore, under conditions such as partial observability, these update rules can be highly suboptimal from the perspective of variance analysis, producing gradient estimates with unnecessarily high variance. This thesis presents a comprehensive mathematical analysis of policy gradient methods, uncovering misconceptions and proposing novel solutions to improve their performance.
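For concreteness, the objective these methods maximize can be formalized as follows (this notation is standard in the literature and is added here for illustration; it is not quoted from the thesis). A policy $\pi_\theta$ with parameters $\theta$ is evaluated by its expected discounted return,
\[
J(\theta) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R_t \,\middle|\, \pi_\theta\right],
\]
and the policy gradient theorem expresses the update direction as
\[
\nabla_\theta J(\theta) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, G_t\, \nabla_\theta \log \pi_\theta(A_t \mid S_t)\right],
\qquad
G_t = \sum_{k=t}^{\infty} \gamma^{\,k-t} R_k .
\]
Notably, practical implementations often drop the leading $\gamma^{t}$ factor, one well-known example of the gap between the theoretical gradient and the update rules actually used in practice.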
Advisor: Philip S. Thomas