POMDP Paper Review

From: Julie Letchner (letchner_at_cs.washington.edu)
Date: Sun Nov 23 2003 - 18:59:19 PST

Next message: Raphael Hoffmann: ""Acting Optimally in Partially Observable Stochastic Domains""

Previous message: Patrick Haluptzok: "Paper Review #6 - by Patrick Haluptzok - CSE 573 Fall 2003"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

         This paper presents a value-iteration "witness" algorithm over an
MDP of continuous belief states that encapsulate an underlying POMDP
defined over finite world states.
         One fundamental contribution of this paper is the use of belief
states as a mechanism for formulating POMDPs as fully observable (though
continuous) MDPs. This representation fully encapsulates the POMDP problem
in that an optimal solution to the belief state problem will be an optimal
solution to the underlying POMDP. This is important, since even the best
algorithms can produce solutions only as good as the problem formulation
they work upon.
         The major contribution that forms the focus of this paper is the
"witness" algorithm that copes with the continuous-spaced belief state by
iteratively refining a partitioning of this space into regions that share
the same optimal value and optimal choice of action. I like this algorithm
for its cleanliness and relative simplicity, but--no thanks to the
authors--I have little knowledge of other belief-state algorithms to use as
a basis for comparison.
On a related vein, evaluation of the witness algorithm is nonexistent
here. The authors state that its advantage is that it does not require
exponential time to run if the number of vectors required to partition the
belief space is not exponential--meaning that the running time is
proportional to the number of vectors required. It is not clear to me how
this property is not met by the existing POMDP algorithms mentioned (but
not named or explained) by the authors. Despite the fact that it's clear
the authors have a good idea, an explicit high-level comparison with other
algorithms or a concrete comparison of their respective running times on
various problems would go a long way in improving the credibility of their
paper.
         Another evaluation I would have liked to see is
time-to-convergence analysis for the witness algorithm and/or heuristics
for choosing the value of delta that determines termination. The
practicality of this algorithm will depend heavily on the convergence
characteristics. How frequently are these POMDP problems solvable with a
non-exponential number of vectors?
         Areas for future work are consideration of the questions above, as
well as concrete evaluation of the witness algorithm on non-trivial
problems and with respect to existing techniques.

~*~*~*~*~*~*~*~*~*~*~
Julie Letchner
(206) 890-8733
University of Washington
Computer Science & Engineering
letchner_at_cs.washington.edu

Next message: Raphael Hoffmann: ""Acting Optimally in Partially Observable Stochastic Domains""

Previous message: Patrick Haluptzok: "Paper Review #6 - by Patrick Haluptzok - CSE 573 Fall 2003"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

This archive was generated by hypermail 2.1.6 : Sun Nov 23 2003 - 18:59:45 PST