From: Alexander Yates (ayates_at_cs.washington.edu)
Date: Fri May 09 2003 - 00:05:26 PDT
Review of "Incremental Pruning: A Simple, Fast, Exact Method for Partially
Observable Markov Decision Processes," by A. Cassandra, M. Littman, and N.
Zhang.
This paper presents a comparison of two methods for solving POMDPs exactly,
a standard witness algorithm and an improvement on it called incremental
pruning. By using an improved method for "purging", the incremental pruning
method reduces the number of linear programs the algorithm needs to solve
and improves the best-case performance of the algorithm.
The main idea of the paper is that the purge routine for calculating a
certain kind of value function is inefficient in the standard witness
algorithm. This calculation requires purging a set of vectors that is the
cross-sum of n other sets of vectors, and the standard algorithm simply
takes the cross-sum and then performs the purge. The incremental pruning
method first purges the first set, then cross-sums it with the second set
and purges the result, and repeats up to the nth set. A generalization of
this method calls the linear programming subroutine with a still smaller
subset of the purged vectors.
Not having any background in this area, I found the paper pretty tough going
at first. Even things that seemed very innocuous at first confused me as I
read along. For example, the value functions are defined as functions of
information states, which are themselves functions. It took me a while to
realize that the information states could be represented as vectors, which
is exactly what they are doing. Other confusions abounded; this was just
one example.
I wasn't quite sure how well-motivated the problem itself was. The solution
they find is for a special case of POMDPs (ones that have piecewise linear
and convex optimal value functions), and I couldn't really tell how
important this class of problems was. They gave no intuition or explanation
for how or where such problems are found.
As for future research -- I first need to understand this area a little
better.
Alex
This archive was generated by hypermail 2.1.6 : Fri May 09 2003 - 00:05:33 PDT