From: Tal Shaked (tshaked_at_u.washington.edu)
Date: Mon Apr 28 2003 - 22:49:21 PDT
Probabilistic Planning in the Graphplan Framework – A. Blum and J. Langford
This paper explores extending Graphplan by creating PGraphplan and
TGraphplan, which are planners (the first being optimal, the second being
faster) for solving probabilistic planning problems where the solution is a
contingent plan with the highest probability of success over some number of
steps.
PGraphplan works by adding two types of heuristics and then doing a forward
chaining search to compute a complete contingent plan. One heuristic
involves calculating needed nodes which results in early detection of
unreachable goals and propagating pairwise needed nodes where one
proposition depends on another to achieve some goal. A second heuristic
involves propagating backwards the value of the goal through actions and
propositions in such a way that some state can be immediately eliminated if
it does not sum to some minimum amount (this is legal, but may over-estimate
due to double-counting). Forward chaining is chosen to get around the
problem of dealing with goal sets that might make bottom-up dynamic
programming tricky.
TGraphplan runs at approximately the speed of Graphplan and only computes an
optimal trajectory and then is recursively called on other possibilities.
In this way it is fast but may not as a whole find the best solution since
it does not exhaustively explore the branching search space (it is greedy).
The experiments section is unclear. The tables show times, states, and
plans, but what about how close to optimal the plan is? How does one
compare a sub-optimal plan to an optimal one? Graphplan is mentioned, but I
did not see it anywhere in the tables.
It is not clear how much of the search space is expanded by relaxing the
mutexes for TGraphplan as described. This could have been tested.
Generally specific examples could have been constructed to test the
assumptions of the heuristics and relaxations to get a better idea of their
effects.
It would be interesting to get a better idea of how effect the two
heuristics for PGraphplan were. These can expand to other planning domains.
More work can be done about looking at what it means to be sub-optimal in
terms of TGraphplan. As described in the hvalue section, this heuristic
seems interesting and more can be explored in how to distribute the values
as well as the order in examining actions.
This archive was generated by hypermail 2.1.6 : Mon Apr 28 2003 - 22:49:25 PDT