From: Daniel Lowd (lowd_at_cs.washington.edu)
Date: Mon Nov 24 2003 - 01:28:16 PST
"Acting Optimally in Partially Observable Stochastic Domains,"
Cassandra et al.
This paper gave a rough overview of a new algorithm to solve POMDPs using
linear programming, by finding an inconsistent point in belief space and
augmenting the set of vectors that imply the solution to correct that
inconsistency.
The authors gave a good overview of POMDPs, including how they relate to
fully observable MDPs and various techniques to solve them. They then
discussed an algorithm they had developed to solve them much more
efficiently by using certain linear programming techniques.
The content of this paper, as far as it goes, is fairly accessible and
well-written. Due to space limitations, however, not only is much of the
algorithm's background omitted, the algorithm itself is not present! The
reader is referred to a technical report for the details. The empirical
results are also rather vague: the authors claim it works on a number of
small datasets, but only show a few examples. These are not critical
flaws, but suggest the research requires more space to present in full and
is still in its early stages.
In addition to wondering how the algorithm empirically compares to others
on the same datasets, I wonder if additional optimizations could be made
for factored MDPs, similar to SPUDD.
This archive was generated by hypermail 2.1.6 : Mon Nov 24 2003 - 01:28:17 PST