Incremental Contingency Planning

From: Alexander Yates (ayates_at_cs.washington.edu)
Date: Mon May 05 2003 - 16:23:13 PDT

Review of "Incremental Contingency Planning," by R. Dearden, N. Meuleau, S.
Ramakrishnan, D. Smith, and R. Washington.

This paper proposes a method for incrementally building probabilistic plans
that reason over continuous properties. The basic method is to build a
straight-line plan in which each action has its most likely effect, and then
add branches to the plan at the places where a branch would have the
highest utility, taking into account the probability that the branch will
be taken.
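
    To make that loop concrete, here is a minimal Python sketch of the greedy
    branch-insertion strategy as I understand it. The toy action model and the
    helper names (most_likely_plan, expected_gain, add_branches) are my own
    inventions for illustration, not the paper's actual data structures:

        def most_likely_plan(actions):
            """Straight-line seed plan: pick each action's most probable outcome."""
            return [(a["name"], max(a["outcomes"], key=lambda o: o["prob"]))
                    for a in actions]

        def expected_gain(cand):
            """Value of a branch = P(branch taken) * (branch utility minus the
            utility of simply continuing the main plan)."""
            return cand["p_taken"] * (cand["u_branch"] - cand["u_continue"])

        def add_branches(plan, candidates, max_branches=2):
            """Greedily attach the highest-value branches to the seed plan."""
            branched = {"main": plan, "branches": []}
            for cand in sorted(candidates, key=expected_gain, reverse=True)[:max_branches]:
                if expected_gain(cand) > 0:
                    branched["branches"].append((cand["after_step"], cand["branch_plan"]))
            return branched

        # Toy rover example: branch after the drive in case it burns far more
        # power than expected.
        actions = [
            {"name": "drive",
             "outcomes": [{"prob": 0.8, "power": -10}, {"prob": 0.2, "power": -30}]},
            {"name": "sample", "outcomes": [{"prob": 1.0, "power": -5}]},
        ]
        candidates = [
            {"after_step": 0, "p_taken": 0.2, "u_branch": 6.0, "u_continue": 1.0,
             "branch_plan": ["recharge", "sample"]},
        ]
        print(add_branches(most_likely_plan(actions), candidates))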

The Mars rover domain motivates the idea of planning over continuous time
and resources. This kind of problem is something like an MDP with an
uncountable state set (they assume full observability of states), so
traditional MDP techniques that visit the entire state set to build a full
policy are inappropriate. This research focuses instead on building a plan
for a specific initial state and a set of goal states, rather than a
complete policy, somewhat like PGraphplan and TGraphplan. Also like
TGraphplan, it begins by finding the best trajectory to the goals (the main
branch). It then uses utility estimates to find the best places to add
branches to the plan. The utility estimates are propagated backwards from
the goal nodes through a planning graph, and they are represented as a
piecewise-constant function of the resources.
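
    The piecewise-constant representation is what keeps that backward
    propagation tractable: shifting such a function by an action's resource
    cost and averaging over the action's outcomes yields another
    piecewise-constant function, so the class is closed under backups. Below
    is a small sketch of one such backup over a single resource; the
    breakpoint representation and the toy numbers are my assumptions, not the
    paper's code:

        import bisect

        def evaluate(pwc, r):
            """pwc is a sorted list of (threshold, value) pairs: the function's
            value at r is the value at the largest threshold <= r, or 0 below
            all thresholds."""
            keys = [t for t, _ in pwc]
            i = bisect.bisect_right(keys, r) - 1
            return pwc[i][1] if i >= 0 else 0.0

        def back_up(pwc_after, outcomes):
            """One backup through an action with uncertain resource cost:
            V_before(r) = sum over outcomes of p * V_after(r - cost). The
            result is again piecewise constant, with breakpoints at every old
            breakpoint shifted by every possible cost."""
            thresholds = sorted({t + cost for t, _ in pwc_after for _, cost in outcomes})
            return [(t, sum(p * evaluate(pwc_after, t - cost) for p, cost in outcomes))
                    for t in thresholds]

        # Toy example: the goal is worth 10 if at least 5 units of power remain;
        # the preceding drive costs 10 units (p=0.8) or 30 units (p=0.2).
        goal_value = [(5.0, 10.0)]
        drive = [(0.8, 10.0), (0.2, 30.0)]
        print(back_up(goal_value, drive))  # [(15.0, 8.0), (35.0, 10.0)]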

I don't quite understand why this planning problem is called contingent
planning, since they assume full observability. That conflicts with the
notion of contingent planning we've seen previously. The branches in these
plans exist not because there is uncertainty about what state the planner
is in, but because it is impossible to enumerate all of the different
outcomes. Also, the paper never discusses probabilistic effects for state
transitions other than the ones affecting the continuous attributes. I
wasn't sure if this meant that all actions were deterministic except in
their resource consumption, or if it meant that any attributes which might
have probabilistic transitions would need to be treated as resources.

Another approach to this kind of problem would be to automatically
discretize the continuous attributes into bins. In a sense, what they are
doing is figuring out the best way of selecting a few such bins. A possible
direction for future research is to find other ways of automatically
discretizing the state space and then applying more standard MDP
approaches. Another possible direction is to look for better heuristics,
perhaps using different representations of the utility function (piecewise
linear or something).
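
    For contrast, here is what the naive binning alternative might look like:
    discretize the resource into a fixed set of levels and run ordinary value
    iteration on the resulting finite MDP. The rover-flavored model below
    (two actions with their costs and rewards) is entirely hypothetical:

        BINS = 10     # power levels 0..9, each bin standing for 10 units
        GAMMA = 0.95  # discount factor

        def transitions(level, action):
            """Hypothetical model: 'drive' usually costs 1 bin but sometimes 3;
            'sample' costs 1 bin and earns a reward if power remains.
            Returns (probability, next_level, reward) triples."""
            if action == "drive":
                return [(0.8, max(level - 1, 0), 0.0),
                        (0.2, max(level - 3, 0), 0.0)]
            nxt = max(level - 1, 0)
            return [(1.0, nxt, 10.0 if level > 1 else 0.0)]

        def value_iteration(eps=1e-6):
            """Standard value iteration over the discretized state space."""
            V = [0.0] * BINS
            while True:
                V_new = [max(sum(p * (r + GAMMA * V[s2])
                                 for p, s2, r in transitions(s, a))
                             for a in ("drive", "sample"))
                         for s in range(BINS)]
                if max(abs(a - b) for a, b in zip(V, V_new)) < eps:
                    return V_new
                V = V_new

        print([round(v, 2) for v in value_iteration()])

    The obvious downside, which their adaptive branch selection avoids, is
    that a fixed discretization either wastes states where the utility
    function is flat or loses precision where it changes quickly.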

    Alex

