MDP in Graphplan

From: Nan Li (annli_at_cs.washington.edu)
Date: Tue Apr 29 2003 - 03:44:59 PDT

  • Next message: Sumit Sanghai: "Probabilistic GraphPlan Review"

    Probabilistic Planning in the Graphplan Framework / A.L. Blum and J.C.
    Langford

    This paper explores the extent of Graphplan classical STRIPS domains to
    the framework of MDPs. It describes two (actually three) algorithms, the
    basic dynamic programming algorithm, an optimal algorithm PGraphplan
    (where by "optimal" it means maximizing the probability of success within
    the horizon), and a fast yet potentially sub-optimal algorithm TGraphplan.

    PGraphplan begins with the basic DP algorithm, and as in Graphplan, it
    uses the planning graph to prune the search. The trim has several levels:
      detecting early the unreachable goals from a state, by maintaining a
    vector for each node indicating those reachable goals from it and removing
    those nodes that disconnect with goal literals;
      binding nodes with its necessary context, by propagating the pairwise
    information of "neededness";
      applying an admissible heuristic function to estimate the fastest way
    the goal can be achieved from a state, and pruning states that are not
    fast enough to achieve a goal on time.

    TGraphplan is a greedy algorithm, so it does not gurantee to produce the
    optimal policies. It always chooses the current most likely trajectory,
    and applies a Graphplan-like algorithm on it. The termination is
    controlled by a threshold value of desired probability.

    I think the authors should have give more analysis to TGraphplan,
    theoratically or empirically. Because greedy algorithms always leave
    doubts to people. And I think the top-down DP might cost space, yet no
    comments about space consuming in the paper. Another thing is, I don't
    completely agree that the algorithms are in Graphplan framework. They give
    up a very important feature in Graphplan: they allow only one non-noop
    action per time step. Though, I'm with the authors that the heuristic
    functions for PGraphplan seem not informaive enough, leaving a space for
    improvement.
     


  • Next message: Sumit Sanghai: "Probabilistic GraphPlan Review"

    This archive was generated by hypermail 2.1.6 : Tue Apr 29 2003 - 03:45:07 PDT