MDP in Graphplan

From: Nan Li (annli_at_cs.washington.edu)
Date: Tue Apr 29 2003 - 03:44:59 PDT

Next message: Sumit Sanghai: "Probabilistic GraphPlan Review"

Previous message: Alexander Yates: "PGraphplan"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Probabilistic Planning in the Graphplan Framework / A.L. Blum and J.C.
Langford

This paper explores the extent of Graphplan classical STRIPS domains to
the framework of MDPs. It describes two (actually three) algorithms, the
basic dynamic programming algorithm, an optimal algorithm PGraphplan
(where by "optimal" it means maximizing the probability of success within
the horizon), and a fast yet potentially sub-optimal algorithm TGraphplan.

PGraphplan begins with the basic DP algorithm, and as in Graphplan, it
uses the planning graph to prune the search. The trim has several levels:
  detecting early the unreachable goals from a state, by maintaining a
vector for each node indicating those reachable goals from it and removing
those nodes that disconnect with goal literals;
  binding nodes with its necessary context, by propagating the pairwise
information of "neededness";
  applying an admissible heuristic function to estimate the fastest way
the goal can be achieved from a state, and pruning states that are not
fast enough to achieve a goal on time.

TGraphplan is a greedy algorithm, so it does not gurantee to produce the
optimal policies. It always chooses the current most likely trajectory,
and applies a Graphplan-like algorithm on it. The termination is
controlled by a threshold value of desired probability.

I think the authors should have give more analysis to TGraphplan,
theoratically or empirically. Because greedy algorithms always leave
doubts to people. And I think the top-down DP might cost space, yet no
comments about space consuming in the paper. Another thing is, I don't
completely agree that the algorithms are in Graphplan framework. They give
up a very important feature in Graphplan: they allow only one non-noop
action per time step. Though, I'm with the authors that the heuristic
functions for PGraphplan seem not informaive enough, leaving a space for
improvement.

Next message: Sumit Sanghai: "Probabilistic GraphPlan Review"

Previous message: Alexander Yates: "PGraphplan"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

This archive was generated by hypermail 2.1.6 : Tue Apr 29 2003 - 03:45:07 PDT