Review: Symbolic Heuristic Search for Factored MDPs

From: Julie Letchner (letchner_at_cs.washington.edu)
Date: Tue Nov 18 2003 - 16:52:32 PST

  • Next message: Lincoln Ritter: "$0.02"

    The authors introduce an MDP solver that uses pruning in the symbolic state
    space of ADDs to eliminate states not reachable from the initial state. The
    policy search is made more efficient by eliminating from it states that can
    never be reached.

    None of the techniques in this paper (A*/LAO* search, dynamic programming,
    ADD symbolic representation of MDPs, heuristic generation) are new; the
    contribution here is in combining them to create an algorithm more efficient
    than its components. The results show that this combination is indeed
    successful: factoring MDPs into ADDs generates a reasonably-sized symbolic
    state space, the grouped structure of which allows pruning to be effective
    and its guiding heuristics efficiently computable.

    The combining-old-tools approach taken in this paper possibly explains why
    the evaluation section has length equal to that of the algorithms'
    explanation. The experiments are well-reported; not only do they show
    positive results, but the accompanying explanations do a very nice job at
    high-level credit assignment. The authors' explanations of what features of
    a problem (for example, fringe size and # states evaluated) correlate with
    what types of reachability/size/timing results will prove far more valuable
    than strictly numeric results in suggesting and guiding future work. This
    speaks well to the authors' choice of evaluation metrics, though one
    evaluation I missed was an evaluation of the effects of heuristic accuracy
    on the reachability/size/timing results.

    Just as with the SPUDD paper, my major gripe is that the paper does nothing
    to convince me that the a* and f* problems bear any resemblance to
    real-world MDP problems. In particular, my intuition (though it may be
    wrong) tells me that most symbolic states in well-constructed real problems
    are reachable, and are furthermore the "image" of a state set will expand
    into the entire reachable set of states in a very small number of
    iterations-in which case the value of reachability pruning may be
    negligible.

    I think a good direction for future work would be the development of formal
    characterizations of the space of MDPs. Such a characterization would
    provide a set of characteristics by which various MDPs could be described,
    which would allow the performance of these various algorithms on various
    types of MDPs to be evaluated and compared, providing more concrete
    directions for future work than has been given by the two papers we've read
    on this subject.


  • Next message: Lincoln Ritter: "$0.02"

    This archive was generated by hypermail 2.1.6 : Tue Nov 18 2003 - 16:51:56 PST