SPUDD

From: Nan Li (annli_at_cs.washington.edu)
Date: Fri May 02 2003 - 02:16:05 PDT

  • Next message: MAUSAM: "SPUDD"

    SPUDD: Stochastic Planning using Decision Diagrams /
    Jesse Hoey, Robert St-Aubin, Alan Hu, Craig Boutilier

    This paper proposes and examine a method for solving MDPs with large
    spaces. Value functions and policies in MDPs are represented compactly in
    ADD, yet classical value iteraction algorithm can still be used to
    construct optimal policies.

    The authors make their motivation very clear. In order to solve large
    scale MDPs, they want to use some abstraction technique: ADD, which is a
    generalization of ordered BDD. Unlike the standard tabular representation
    of CPTs, which describes each defined conditional distribution separately,
    the geometrical structures make it possible for states that agree on
    partial value to share a same sub-structure. As such, the dynamic
    programming process (to construct optimal policies) can potentially avoid
    the explicit enumeration of the state space, saving both space and
    computational time.

    The authors then suggest methods to optimize the cost of space and time,
    respectively. For space cost, the authors notice that during the process,
    there are some intermediate steps that need large space. They use a kind
    of "divide-and-conquer" method: instead of manipulate two large diagrams
    and sum one final result, an alternative is to each time manipulate one
    large diagram with only a sub-diagram of the other large one, and sum the
    one-step result at once. For time cost, the authors explain that some
    repeated computation can be pre-computed. Unfortunately, the two saving
    methods are in conflict. Thus, they further implement a method to control
    the trade-off, simply speaking, by choosing the size of the sub-diagram.

    I don't have much to complain about this paper, although I feel some
    parts are not easy to understand. One place left for extension is the
    dynamic reordering of variables.

    --
    

  • Next message: MAUSAM: "SPUDD"

    This archive was generated by hypermail 2.1.6 : Fri May 02 2003 - 02:16:06 PDT