Acting Optimally in Partially Observable Stochastic Domains

From: Alan L. Liu (aliu_at_cs.washington.edu)
Date: Mon Nov 24 2003 - 02:38:01 PST

    Paper reviewed: Acting Optimally in Partially Observable Stochastic Domains
    Authors: Anthony R. Cassandra, Leslie Pack Kaelbling, and Michael L. Littman

    Summary:
    This paper gives an overview of MDPs, POMDPs, and the authors' Witness
    algorithm for approximating the optimal value function.

    Big ideas:
    A POMDP can be recast as an MDP over belief states rather than world
    states: given an action and the resulting observation, the belief
    update is deterministic, so the belief state captures everything the
    agent knows about the underlying state.
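
    To make the belief-MDP idea concrete, here is a minimal sketch of the
    Bayes-filter belief update in Python. The array layout (T[a, s, s']
    for transitions, O[a, s', o] for observations) is an assumption made
    for illustration, not notation from the paper:

        import numpy as np

        def belief_update(b, a, o, T, O):
            # Predict: push belief b through the transition model for
            # action a, then correct: weight each successor state by the
            # probability of the observation o actually received.
            b_next = O[a, :, o] * (b @ T[a])
            total = b_next.sum()
            if total == 0.0:
                raise ValueError("observation has zero probability")
            return b_next / total  # renormalize to a distribution

        # Tiny 2-state, 1-action, 2-observation example (made-up numbers).
        T = np.array([[[0.9, 0.1],
                       [0.2, 0.8]]])   # T[a, s, s'] = P(s' | s, a)
        O = np.array([[[0.7, 0.3],
                       [0.1, 0.9]]])   # O[a, s', o] = P(o | s', a)
        b = np.array([0.5, 0.5])
        print(belief_update(b, a=0, o=1, T=T, O=O))  # -> [0.289, 0.711]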

    The Witness algorithm they propose, used with value iteration, can
    generate policies arbitrarily close to optimal. It finds a "witness":
    a belief state demonstrating that the current approximate value
    function differs from the value function implied by the previous
    iteration's values.
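
    The witness test in the paper is exact (a linear program over the
    belief simplex); the sketch below is only a rough illustration of the
    idea, using randomly sampled beliefs to look for a point where one set
    of alpha-vectors undervalues another. The names and the sampling
    scheme are assumptions for illustration, not the authors' method:

        import numpy as np

        def value(b, alphas):
            # Piecewise-linear convex value function: V(b) = max_a a . b
            return max(float(alpha @ b) for alpha in alphas)

        def find_witness(approx, exact, n_states, n_samples=10000, tol=1e-9):
            # Sample beliefs from the simplex; return one where the
            # approximate vector set undervalues the exact set, or None.
            rng = np.random.default_rng(0)
            for _ in range(n_samples):
                b = rng.dirichlet(np.ones(n_states))
                if value(b, exact) > value(b, approx) + tol:
                    return b
            return None

        approx = [np.array([1.0, 0.0])]   # representation missing a vector
        exact = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
        print(find_witness(approx, exact, n_states=2))  # belief near state 2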

    Flaw:
    The paper mentions that the Witness algorithm is guaranteed to run in
    non-exponential time when the number of vectors needed to represent
    the value function is not exponential. However, there is no discussion
    of which kinds of problems require an exponential representation and
    which do not, or whether there are useful problems that do.

    Open research questions:
    The authors mention trying this on real problems. It would be
    interesting to see how large a POMDP this algorithm can handle while
    still producing a good approximate policy in a reasonable amount of
    time.

    One recurring theme is that it would be helpful to have a standard
    corpus of benchmark problems (in this case, a set of POMDPs) for
    comparing algorithms.

