SPUDD: Review #4

From: Karthik Gopalratnam (karthikg_at_cs.washington.edu)
Date: Mon Nov 17 2003 - 02:04:29 PST

  • Next message: Lincoln Ritter: "SPUDD: Stochastic Planning Using Decision Diagrams"

    PAPER REVIEW #4: "SPUDD: Stochastic Planning using Decision Diagrams" - Hoey
    et. al.

            This paper presents a different perspective on solving MDP's without
    having to explicitly enumerate the entire state-space, by introducing a
    dynamic abstraction method using ADD's.

            The key premise of the paper is that ADD's induce a method of
    representing value functions and policies by means of a compact decision
    graph, that can then be used to learn the optimal policy. ADD's can make use
    of the structure inherent in the representation of the actions as a DBN to
    represent the functions that denote the conditional distributions defined
    over the actions. Solving the MDP is done subsequently by implementing value
    iteration over the ADD representations for the value functions and the
    CPT's, which have implicitly encoded the structure inherent to the problem.
    The authors present the VI algorithm over the ADD's and also present certain
    optimizations to leverage time-space tradeoffs in implementing the basic
    SPUDD algorithm.

            
            The patently glaring drawback of this method is the fact that ADD's
    can only represent boolean valued functions. Like the authors point out,
    though multi-valued functions can be represented as multiple boolean
    variables, changing the representation seldom preserves the original
    structure that this method was capable of utilizing. Therefore, the blow-up
    in the number of variables that such an abstraction is bound to entail,
    would imply that for general purpose planning problems with multi-valued
    functions, this method might actually perform poorly.
            Though the authors' tests on the FACTORY data present compelling
    evidence that this algorithm is capable of performing very well relative to
    SPI on at least one domain, it remains to be seen, especially in the light
    of the worst-case analysis graphs, how the algorithm compares to other
    methods, on a broader set of planning problems.
            

            It would be interesting to see, given the significantly faster
    machines that are available today, how the time-space tradeoffs suggested by
    the authors in their optimizations affect the run-time of the algorithm, and
    whether larger problems may be feasibly solved using this method. Also, it
    would be very useful to address the drawback of the algorithm in its not
    being able to represent multi-valued functions, in a more efficient manner.
    The effect of implementing dynamic variable reorderings on the effectiveness
    of this method would also be a useful study.
            


  • Next message: Lincoln Ritter: "SPUDD: Stochastic Planning Using Decision Diagrams"

    This archive was generated by hypermail 2.1.6 : Mon Nov 17 2003 - 02:03:50 PST