Reinventing Partially Observable Reinforcement Learning
In this presentation I will describe a new framework for decision making, learning, and tracking in partially observable domains. This framework builds on our results on updating logical formulas (belief states) after deterministic actions. It includes algorithms that represent and update the set of possible action models and world states compactly and tractably. It makes a decision using this set, and updates the set after taking the chosen action. Most importantly, and somewhat surprisingly, the number of actions our framework takes to achieve a goal is bounded polynomially by the length of an optimal plan in a fully observable, fully known domain, under mild conditions. Finally, our framework leads to a new stochastic-filtering approach that achieves better accuracy than previous techniques.
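To make the tracking loop concrete, the sketch below is an illustrative, enumerative version of belief-state update after a deterministic action followed by observation filtering. It is not the authors' implementation: the framework described above keeps the set of possible states as a compact logical formula rather than an explicit set, and all names here are hypothetical.

```python
# Illustrative sketch only: belief-state tracking with deterministic actions,
# using an explicit set of possible world states. The actual framework
# represents this set compactly as a logical formula.

def progress(belief, action):
    """Apply a deterministic action (a state -> state function) to every
    state the agent considers possible."""
    return {action(state) for state in belief}

def filter_observation(belief, observation):
    """Keep only the states consistent with the observation, where
    `observation` is a predicate over states."""
    return {state for state in belief if observation(state)}

# Example: states are frozensets of true propositions.
belief = {frozenset({"door_closed"}), frozenset({"door_open"})}

def open_door(state):
    """A deterministic 'open the door' action."""
    return frozenset((state - {"door_closed"}) | {"door_open"})

belief = progress(belief, open_door)                      # both states map to door_open
belief = filter_observation(belief, lambda s: "door_open" in s)
print(belief)                                             # {frozenset({'door_open'})}
```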
* Joint work with Allen Chang, Hannaneh Hajishirzi, Stuart Russell, Dafna Shahaf, and Afsaneh Shirazi (IJCAI'03, IJCAI'05, AAAI'06, ICAPS'06, IJCAI'07, AAAI'07).