Heuristic Search in Belief Space

From: Nan Li (annli_at_cs.washington.edu)
Date: Tue Apr 22 2003 - 04:41:17 PDT

  • Next message: Parag: "Planning with incomplete information - review"

    Planning with Incomplete Information as Heuristic Search in Belief Space /
    B. Bonet, H. Geffner

    This paper makes explicit the formulation of conformant planning and
    contigent planning as heuristic search in belief space, provides
    practial search algorithms for different cases, and tests them over a
    number of domains.

    The paper first analyses the mathematical models underlying three
    different planning tasks: classical planning, conformant planning, and
    contigent planning. It turns out all the three cases can be defined in a
    general formulation, a heuristic search problem in belief space. The idea
    behind might not be novel, but the authors efforts surely provide a neat
    result. Note that the concept "belief space", which is extended from
    "state space" in classical planning. It can be understood as a structured
    set of state spaces. This extension makes it possible to describe actions
    as transformations of states, even in non-deterministic and probabilistic
    domains.

    Then for each case, the paper suggests practical search algorithms to
    implement the heuristic search. In the most difficulty case, contingent
    planning, the authors suggest an approximat algorithm, for the sake of
    time and memory costs. The basic of real-time dynamic programming is, it's
    hopefully that using greedy policy possibly reduce search cost, when the
    heuristic function is close enough to the optimal cost function, which is
    improved iteratively by dynamic programming. Intuitively, compared with
    the optimal heuristic AND/OR graph search algorithms, the RTDP mighte be
    more effecient, since it updates ( and hopefully improves )its heuristic
    function; but it doesn't maintain a real cost for explored path, it can
    not be theoretically optimal. Though the authors claim that within certain
    limitations the algorithm will eventually terminate and approach the
    optimal solution.

    One thing I expect yet can not find the answer is, when the authors
    compare the empirical results of applying heuristic function with those
    of merely bf-search, they say that there is no notable difference, but
    they don't explain it. Is it because of the choice of heuristic function,
    or it's inherited in the RTDP algorithm? It would have been an interesting
    discussion.

    One open problem comes from a limitation of the RTDP. Only when the belief
    space is finite is the algorithm guaranteed to succeed. Currently, for
    infinite belief space domains, the solution is to choose a suitable
    heuristic function and focus the updates only on "relevant" states. The
    meaning of "relevent" here is unclear, and may not be able to apply in
    some domains.


  • Next message: Parag: "Planning with incomplete information - review"

    This archive was generated by hypermail 2.1.6 : Tue Apr 22 2003 - 04:41:19 PDT