From: Tal Shaked (tshaked_at_u.washington.edu)
Date: Tue Apr 22 2003 - 00:31:53 PDT
Planning with Incomplete Information as Heuristic Search in Belief Space –
B. Bonet, H. Geffner
This paper discusses how to explicitly represent planning problems with
incomplete information as search problems in belief space, and then
describes a planner they built that uses a heuristic computed from a relaxed
problem with full state observability, which is tested on several domains.
It was interesting to see how conformant planning, contingent planning, and
probabilistic contingent planning were each represented as search problems
in belief space. Specifically it is shown how conformant planning can be
described as deterministic planning in belief space, while both contingent
planning and probabilistic contingent planning can be described as
non-deterministic planning in belief space. In this way the optimal cost
function is the Bellman equation that assigns a cost to a belief state based
the minimum cost of an action plus the resulting belief state.
The primary heuristic used is the optimal cost function over a relaxed
problem with full state observability which is computable in polynomial time
to the size of the state space.. In this way the value of a given belief
state can be estimated from the maximum value obtained over the relaxed
problem for each possible state in the belief state. A greedy policy alone
is insufficient since it may result in long paths or looping, so instead a
combination is used call real time dynamic programming. Initially the
heuristic is used, but as the search progress the costs are updated with
more accurate values.
There were a few times when results showed little or no improvement from the
heuristic search. This was noted, but no explanation or guess was given
which seems a little dissatisfying. The tables contain a lot of numbers, so
it might have been better to graph them to see how fast they grow and
compare the slopes to other planners. Only toy problems were used, so it is
not clear how useful these techniques are for real-world problems,
considering that it may not scale well.
Future work could include exploring why some domains worked well (and not
well) for the given heuristic. As mentioned in the paper, developing
heuristics for other domains such as those that are purely information
gathering problems, could be looked at. Perhaps a cheaper heuristic can be
used that will scale to larger problems (10^5 does not seem that big for a
state space).
This archive was generated by hypermail 2.1.6 : Tue Apr 22 2003 - 00:31:56 PDT