From: Tal Shaked (tshaked_at_u.washington.edu)
Date: Fri May 30 2003 - 09:06:32 PDT
Sapa: A Domain-Independent Heuristic Metric Temporal Planner – M. Do, R.
Kambhampati
Sapa is a forward chaining planner that handles durative actions, metric
resource constraints, and deadline goals. It describes how states are
defined to include time, events that will happen, and functions for metric
resources. A relaxed temporal planning graph is constructed from which
heuristics (some admissible) are derived that guide the search.
An RTPG (relaxed temporal planning graph) is constructed to guide several
heuristics. An RTPG is built from relaxed actions by eliminating delete
effects and those that reduce the level of some resource. As a result there
are no mutexes, meaning that the RTPG primarily recognizes the structure in
terms of time and order of actions. Specifically the RTPG is used to prune
states that cannot lead to a solution, use time points of goals as lower
bounds (admissible), and build a relaxed plan that estimates distance to
goals. Constructing an RTPG requires adding all actions applicable at some
time point, adding the future events of those actions to a queue, and then
advancing the time to execute events in the queue. Then the process repeats
by adding possible actions, advancing time to the earliest event, and
updating facts accordingly.
There are many admissible heuristics (dependent on the objective function)
that can be drawn from the RTPG which include max-span, min-slack,
max-slack, sum-slack. Note that slack is the difference in time between when
the goal is achieved in a plan, and the deadline specified for achievement.
The above heuristics focus on time points at which goals are achieved rather
than the length of the plan, which tends to work well in classical planning.
Sum-action and sum-duration are two more heuristics for temporal planning,
based on a relaxed plan built backward from the goal similar to the
Graphplan algorithm. Note that these heuristics are not admissible because
their values are based on the first relaxed plan found rather than the
smallest real plan, and there is no guarantee that the relaxed plan is
smaller than the real plan. I am not sure exactly why, but perhaps because
the relaxed plan considers parallel actions rather than smallest number of
actions, and does not have a notion of the sum of durations?
None of the above heuristics take into account the metric resource
constraints. Therefore actions that refuel or deposit money into an account
might not be included in relaxed plans that involve summing actions or
durations. A logical way around this is to find the total amount of
resources consumed for a relaxed plan. If this exceeds the initial amount
plus the produced amount for that plan, find the number of actions needed to
satisfy the necessary consumption. Then incorporate these additions to the
sum-actions and sum-durations.
The experiments focused primarily on sum-action and sum-duration heuristics,
perhaps because max-span and slack-value heuristics do not scale up well.
It would be interesting to hear more about why this is the case.
Generally the results are not conclusive relative to search/time. Sum-action
seems to perform better as a whole, but not always, and adding the metric
resource adjustment helps the sum-action a lot more than the sum-duration.
For plan quality, results seemed to indicate good plans meaning few
irrelevant actions. Heuristics or techniques that tend to disfavor
advance-clock actions naturally lead to better makespan times, which led to
some choices on how and when to calculate heuristics. There are some cases
where the plan quality is still 3x longer than normal which requires future
work to improve.
It’s a little odd that several of the admissible heuristics are mentioned,
and yet no results are shown since they do not scale up well. Perhaps more
could be said about this, or maybe some analysis of the simple problems it
can solve can indicate whether it looks promising in terms of plan quality
and nodes expanded.
Future work will most likely focus on fine-tuning the heuristics. In
particular, it might make sense to use less relaxed problems to base the
heuristics on, such as finding ways to include mutexes in the planning
graphs. There might also be a way to have a better sense dynamically of when
resources will need to be refilled, which will lead to more accurate graphs.
Perhaps a certain structure can be found in the relaxed problems such as
sequence of actions such as load/refuel-fly-unload/refuel which can speed up
the search (i.e. some variation of compiling complex actions). It would also
be interesting to see what exactly is causing the bad quality plans and
figure out what heuristic is necessary to fix that problem, which ties back
to how exactly the advance-clock action is considered in heuristics.
This archive was generated by hypermail 2.1.6 : Fri May 30 2003 - 09:06:33 PDT