From: Daniel Lowd (lowd_at_cs.washington.edu)
Date: Wed Nov 19 2003 - 02:38:33 PST
"Symbolic Heuristic Search for Factored Markov Decision Processes"
[Feng & Hansen, 2003]
This paper presented a new algorithm for solving factored MDPs that
combined the heuristic approach of LAO* with the symbolic gains of SPUDD.
The main contribution of this paper is an algorithm that may have all the
advantages of two MDP solution approaches: heuristic search and factoring.
The authors present in moderate detail how these approaches can be
combined and present empirical evidence suggesting that it really is a
good idea. As a second contribution, they provide some interesting
insight into the factory examples developed for testing SPUDD.
The weaknesses of this paper lie in the complexity of the algorithm
explanation and the limitations of the empirical results. In the
algorithm explanation, a few comments in the pseudocode or more helpful
variable names (E? F?) might have clarified it some. The empirical
results suggested that in some cases LAO* works better than SPUDD, in
other cases SPUDD works better than LAO*, but in all cases Symbolic LAO*
is competitive with or much better than the both of them. This is a very
nice result, but the datasets seem somewhat limited -- they admit to
discovering a lot of structure in the factory dataset, and the new dataset
they propose is entirely artificial. (Actually, the factory dataset was
made up, too...) Are there any other factored MDP problems appropriate
for these algorithms? I'd like to know if these results really hold
across a variety of types of data.
Open questions include the possibility of integrating other MDP techniques
into a still more complicated (and hard to explain in 6 pages) algorithm,
as well as in what domains various algorithms do best. We've already seen
from the factory examples vs. the a examples that the dataset structure
matters a great deal; what other dataset structures are there in the real
world, and do certain algorithms take advantage of them?
This archive was generated by hypermail 2.1.6 : Wed Nov 19 2003 - 02:38:35 PST