
Learning Structure to Support Autonomous Control Decisions

Thursday, 02/01/2024 2:00pm to 4:00pm
LGRC A215
PhD Dissertation Proposal Defense
Speaker: Khoshrav Doctor

There is a great deal of consequential research and development on autonomous systems, with applications ranging from space exploration to self-driving vehicles. These systems must make critical control decisions in the absence of complete state information, a condition that stems from noisy sensors, noisy actuators, and environmental dynamics. Intelligent biological agents can acquire and exploit background knowledge that summarizes past experience of situated control dynamics to help address this challenge, and probability distributions are attractive representations for the behavior of systems with incomplete information. Probabilistic structure in the environment is also useful for model-based learning and planning systems: it supports agents (biological or otherwise) in sequential decision making and active information gathering that reduce uncertainty, leading them toward desired states while avoiding unrecoverable failure. The desire to use models for decision making raises questions about what these models should represent and how they can be acquired autonomously.

In embodied systems, structure is revealed by reliable patterns of flow in control transitions. This proposal examines learning models of that structure to be leveraged by model-based control, and it provides mechanisms for autonomous model building under uncertainty. The goal is a means of learning control transitions autonomously that applies in general to learning and planning systems formulated as a Partially Observable Markov Decision Process (POMDP, a common framework for sequential decision making when complete information is not always available), and to measure the impact of the acquired models on representative learning and planning tasks. The proposal aims to support the learning of models that predict interaction dynamics from recognizable environmental structure, a form of cognitive artifact that describes probabilistic roadmaps in state-action spaces.
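In a POMDP, the agent cannot observe the state directly and instead maintains a belief, a probability distribution over states that is updated after each action and observation using the learned transition and observation models. The following is a minimal sketch of that update in Python; the two-state world and the particular models T and O are hypothetical illustrations, not the models studied in the proposal.

```python
from collections import defaultdict

def belief_update(belief, action, observation, T, O):
    """One step of discrete Bayesian filtering:
    b'(s') is proportional to O(o | s') * sum over s of T(s' | s, a) * b(s)."""
    new_belief = defaultdict(float)
    for s, p in belief.items():
        for s_next, p_trans in T[(s, action)].items():
            new_belief[s_next] += p * p_trans * O[(s_next, observation)]
    total = sum(new_belief.values())
    if total == 0:
        raise ValueError("observation has zero likelihood under current belief")
    return {s: p / total for s, p in new_belief.items()}

# Hypothetical two-state world: action "go" tends to swap states,
# and observations are correct 80% of the time.
T = {("A", "go"): {"A": 0.1, "B": 0.9},
     ("B", "go"): {"A": 0.9, "B": 0.1}}
O = {("A", "seeA"): 0.8, ("A", "seeB"): 0.2,
     ("B", "seeA"): 0.2, ("B", "seeB"): 0.8}

# Starting from total uncertainty, acting then seeing "seeB"
# shifts most of the belief mass onto state B.
b = belief_update({"A": 0.5, "B": 0.5}, "go", "seeB", T, O)
```

Sequential decision making under uncertainty amounts to choosing actions that steer this belief, for example toward low-entropy beliefs concentrated on desired states.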

I consider two forms of models: 1) an empirical distribution (called a SEARCH prior) over goals for actions that can generate a given target state; and 2) the transition function T, which provides a probability distribution over the outcome states that occur when an action is performed from a given state. Together, these models support the agent in reasoning over trajectories of (either open- or closed-loop) controlled state transitions in order to solve a task. I will study mechanisms to learn these models through autonomous exploration in a manner that addresses issues of salience, efficiency, coverage, and completeness. I will measure the effects of the resulting models on the quality of autonomous behavior using a POMDP framework for model-based planning, applied to the Kidnapped Robot Problem in a sparse feature space and to a multi-object scene recognition problem. Both tasks are examples in which an agent faces incomplete information and must actively choose actions, guided by prior models, to mitigate uncertainty.
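One simple way such empirical models could be acquired from experience is a count-based maximum-likelihood estimate over logged transitions. The sketch below is a hypothetical illustration, not the proposal's learning mechanism: `observe` logs a transition, `T` returns the empirical transition distribution for a state-action pair, and `action_prior` returns an empirical distribution over (state, action) pairs seen to produce a target state, loosely analogous in spirit to the SEARCH prior described above.

```python
from collections import Counter, defaultdict

class EmpiricalTransitionModel:
    """Count-based estimates of T(s' | s, a) and of which (s, a)
    pairs lead to a target state. Hypothetical sketch."""

    def __init__(self):
        # counts[(s, a)][s_next] = number of times (s, a) produced s_next
        self.counts = defaultdict(Counter)

    def observe(self, s, a, s_next):
        """Log one experienced transition."""
        self.counts[(s, a)][s_next] += 1

    def T(self, s, a):
        """Empirical outcome distribution for (s, a); empty if unseen."""
        c = self.counts[(s, a)]
        total = sum(c.values())
        return {s2: n / total for s2, n in c.items()} if total else {}

    def action_prior(self, target):
        """Empirical distribution over (s, a) pairs observed to reach
        `target` -- a count-based stand-in for a prior over actions
        that can generate a given target state."""
        c = Counter()
        for (s, a), outcomes in self.counts.items():
            c[(s, a)] += outcomes[target]
        total = sum(c.values())
        return {sa: n / total for sa, n in c.items()} if total else {}

# Example: four experienced transitions from state "A" under action "go".
m = EmpiricalTransitionModel()
for _ in range(3):
    m.observe("A", "go", "B")
m.observe("A", "go", "A")
```

In practice an exploration strategy decides which transitions get experienced, which is where the stated concerns of salience, efficiency, coverage, and completeness enter.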

Advisor: Rod Grupen