Sapa - review

From: Tal Shaked (tshaked_at_u.washington.edu)
Date: Fri May 30 2003 - 09:06:32 PDT

  • Next message: Parag: "Sapa - review"

    Sapa: A Domain-Independent Heuristic Metric Temporal Planner – M. Do, R.
    Kambhampati

    Sapa is a forward chaining planner that handles durative actions, metric
    resource constraints, and deadline goals. It describes how states are
    defined to include time, events that will happen, and functions for metric
    resources. A relaxed temporal planning graph is constructed from which
    heuristics (some admissible) are derived that guide the search.

    An RTPG (relaxed temporal planning graph) is constructed to guide several
    heuristics. An RTPG is built from relaxed actions by eliminating delete
    effects and those that reduce the level of some resource. As a result there
    are no mutexes, meaning that the RTPG primarily recognizes the structure in
    terms of time and order of actions. Specifically the RTPG is used to prune
    states that cannot lead to a solution, use time points of goals as lower
    bounds (admissible), and build a relaxed plan that estimates distance to
    goals. Constructing an RTPG requires adding all actions applicable at some
    time point, adding the future events of those actions to a queue, and then
    advancing the time to execute events in the queue. Then the process repeats
    by adding possible actions, advancing time to the earliest event, and
    updating facts accordingly.

    There are many admissible heuristics (dependent on the objective function)
    that can be drawn from the RTPG which include max-span, min-slack,
    max-slack, sum-slack. Note that slack is the difference in time between when
    the goal is achieved in a plan, and the deadline specified for achievement.

    The above heuristics focus on time points at which goals are achieved rather
    than the length of the plan, which tends to work well in classical planning.
    Sum-action and sum-duration are two more heuristics for temporal planning,
    based on a relaxed plan built backward from the goal similar to the
    Graphplan algorithm. Note that these heuristics are not admissible because
    their values are based on the first relaxed plan found rather than the
    smallest real plan, and there is no guarantee that the relaxed plan is
    smaller than the real plan. I am not sure exactly why, but perhaps because
    the relaxed plan considers parallel actions rather than smallest number of
    actions, and does not have a notion of the sum of durations?

    None of the above heuristics take into account the metric resource
    constraints. Therefore actions that refuel or deposit money into an account
    might not be included in relaxed plans that involve summing actions or
    durations. A logical way around this is to find the total amount of
    resources consumed for a relaxed plan. If this exceeds the initial amount
    plus the produced amount for that plan, find the number of actions needed to
    satisfy the necessary consumption. Then incorporate these additions to the
    sum-actions and sum-durations.

    The experiments focused primarily on sum-action and sum-duration heuristics,
    perhaps because max-span and slack-value heuristics do not scale up well.
    It would be interesting to hear more about why this is the case.

    Generally the results are not conclusive relative to search/time. Sum-action
    seems to perform better as a whole, but not always, and adding the metric
    resource adjustment helps the sum-action a lot more than the sum-duration.
    For plan quality, results seemed to indicate good plans meaning few
    irrelevant actions. Heuristics or techniques that tend to disfavor
    advance-clock actions naturally lead to better makespan times, which led to
    some choices on how and when to calculate heuristics. There are some cases
    where the plan quality is still 3x longer than normal which requires future
    work to improve.

    It’s a little odd that several of the admissible heuristics are mentioned,
    and yet no results are shown since they do not scale up well. Perhaps more
    could be said about this, or maybe some analysis of the simple problems it
    can solve can indicate whether it looks promising in terms of plan quality
    and nodes expanded.

    Future work will most likely focus on fine-tuning the heuristics. In
    particular, it might make sense to use less relaxed problems to base the
    heuristics on, such as finding ways to include mutexes in the planning
    graphs. There might also be a way to have a better sense dynamically of when
    resources will need to be refilled, which will lead to more accurate graphs.
    Perhaps a certain structure can be found in the relaxed problems such as
    sequence of actions such as load/refuel-fly-unload/refuel which can speed up
    the search (i.e. some variation of compiling complex actions). It would also
    be interesting to see what exactly is causing the bad quality plans and
    figure out what heuristic is necessary to fix that problem, which ties back
    to how exactly the advance-clock action is considered in heuristics.


  • Next message: Parag: "Sapa - review"

    This archive was generated by hypermail 2.1.6 : Fri May 30 2003 - 09:06:33 PDT