SPUDD: Stochastic Planning Using Decision Diagrams

From: Raphael Hoffmann (raphaelh_at_u.washington.edu)
Date: Sun Nov 16 2003 - 22:59:27 PST

  • Next message: Julie Letchner: "SPUDD Review"

    "SPUDD: Stochastic Planning Using Decision Diagrams"

    Summary:
    Due to the need of complete state enumeration, solving large MDPs by the
    value iteration algorithm is often infeasible. The authors of this paper
    propose a different (reduced) representation of value functions and show
    that some large MDP problems can be computed efficiently.

    Pros:
    The idea of finding a technique to efficiently solve large MDPs is a very
    important one, since applications arise in many contexts and current
    algorithms often prohibit practical usage. Using ADDs seems to be a
    promising solution. The space complexity can be reduced substantially and
    many problems that could not be solved previously are computable now.
    The authors also invested a lot of effort in further optimizing their
    algorithm and the results in a series of tests look very promising.
    Certainly, not for all MDP problems can states be easily represented by
    merely boolean variables. The authors therefore addressed the possibility of
    transforming multivalued variables into several boolean variables. This
    feature makes ADD also usable when a few variables are multivalued.

    Cons:
    However, the way the authors cope with multivalued variables appears to be
    rather simple and obviously has some flaws: The state space could grow
    quickly - depending on the number of multivalued variables and the size of
    their domains. The effects haven't been examined thoroughly.
    At the beginning of Chapter 4, the authors explain briefly why the ADD
    representation can be so efficient: it exploits the regularities int the
    actions and reward networks. It would be great if the efficiency gain -
    based on a measure of the regularities in the data - could be assessed more
    precisely.

    Future work:
    In my opinion, improvements can be made in how to deal with multivalued
    variables. The authors indicated that it is possible to allow ADDs to work
    directly with multivalued variables. Further research could evaluate the
    performance issues and compare the results with the performance of doing the
    simple boolean conversion of the multivalued variables. It could be possible
    that in some cases the expansion of the state space still has advantages
    over the direct usage of multivalued variables and vice versa.
    The authors suggest that the performance could be significantly improved by
    implementing dynamic variable reordering at runtime. To make this possible,
    algorithms that allow dynamic variable reordering will have to be developed
    and the performance increases analysed.
    As mentioned above, it would be great to calculate the efficiency gain based
    on a measure of the regularities in the data. Perhaps it would then be
    possible to make a priori estimates of the runtime. This could be a very
    helpful information in practical applications.


  • Next message: Julie Letchner: "SPUDD Review"

    This archive was generated by hypermail 2.1.6 : Sun Nov 16 2003 - 22:59:34 PST