From: Kevin Sikorski (kws_at_cs.washington.edu)
Date: Thu May 01 2003 - 21:38:29 PDT
SPUDD: Stochastic Planning using Decision Diagrams
Jesse Hoey, Robert St-Aubin, Alan Hu and Craig Boutilier
This paper presents SPUDD, a version of value iteration that uses ADDs to
solve binary-valued MDPs.
The use of Algebraic Decision Diagrams (ADDs) is the main idea of this
paper. Through this, the solver no longer has to enumerate all the states
in the search space, which yields performance improvements in time (a
faster planner) and in space (larger problems can be represented in main
memory).
A related contribution is the actual variation on classical value
iteration to arrive at an epsilon-optimal policy. The ADD form allows all
expected value computations to be performed at the leaves of the "tree"
rather than at each state as in the classical version.
It's not very clear what SPI (Structured Policy Iteration) is and how it
was implemented. Is this just plain, classical Policy Iteration? If so,
then it's not surprising that SPUDD was so much faster. It would be nice
to see a comparison of how well SPUDD does compared to one of the flavors
of Graphplan. Also, it would be nice to see graphs of how it does on the
worst-case EXPON problem compared to classical value iteration and
graphplan.
Since I reviewed SPUDD two weeks ago, I'll mention again one possible
avenue of future work for SPUDD and ADDs:
In terms of future work, I'd like to see how a relatively simple
improvement to the ADD form would help out SPUDD. In its current form, ADD
does not exploit symmetry - it just fuses together subtrees (subgraphs,
actually) iff they are exactly the same. An even more compact
representation would be a "parameterized ADD", where we also fuse subtrees
when they have the same shape, but different variables. We would then
label the fused subtree with the possible sets of variables that could be
used in those fused nodes. This should further reduce the size of the
graph, resulting in a more compact representation. Something similar to
this may even greatly simplify SPUDDS worst case execution in the EXPON
problem, though you may have to add one more layer of abstraction to make
this work (sort of a second-order ADD). Clearly, this would only work for
certain domains that exhibit a particular type of symmetry, but I think
this would be fairly common in real-life STRIPS problems. I haven't put
any thought into how badly this would affect their version of value
iteration.
This archive was generated by hypermail 2.1.6 : Thu May 01 2003 - 21:38:31 PDT