From: Nan Li (annli_at_cs.washington.edu)
Date: Fri May 02 2003 - 02:16:05 PDT
SPUDD: Stochastic Planning using Decision Diagrams /
Jesse Hoey, Robert St-Aubin, Alan Hu, Craig Boutilier
This paper proposes and examine a method for solving MDPs with large
spaces. Value functions and policies in MDPs are represented compactly in
ADD, yet classical value iteraction algorithm can still be used to
construct optimal policies.
The authors make their motivation very clear. In order to solve large
scale MDPs, they want to use some abstraction technique: ADD, which is a
generalization of ordered BDD. Unlike the standard tabular representation
of CPTs, which describes each defined conditional distribution separately,
the geometrical structures make it possible for states that agree on
partial value to share a same sub-structure. As such, the dynamic
programming process (to construct optimal policies) can potentially avoid
the explicit enumeration of the state space, saving both space and
computational time.
The authors then suggest methods to optimize the cost of space and time,
respectively. For space cost, the authors notice that during the process,
there are some intermediate steps that need large space. They use a kind
of "divide-and-conquer" method: instead of manipulate two large diagrams
and sum one final result, an alternative is to each time manipulate one
large diagram with only a sub-diagram of the other large one, and sum the
one-step result at once. For time cost, the authors explain that some
repeated computation can be pre-computed. Unfortunately, the two saving
methods are in conflict. Thus, they further implement a method to control
the trade-off, simply speaking, by choosing the size of the sub-diagram.
I don't have much to complain about this paper, although I feel some
parts are not easy to understand. One place left for extension is the
dynamic reordering of variables.
--
This archive was generated by hypermail 2.1.6 : Fri May 02 2003 - 02:16:06 PDT