From: Masaharu Kobashi (mkbsh_at_cs.washington.edu)
Date: Mon Nov 17 2003 - 09:00:14 PST
Title: SPUDD: Stochastic Planning using Decision Diagrams
Authors: Jesse Hoey, Robert St-Aubin, Alan Hu, Craig Boutilier
[Summary]
The paper describes a new way of solving large-scale MDPs,
a dynamic abstraction method based on algebraic decision diagrams
(ADDs), and it also reports on the method's effectiveness and
its problems.
[Two most important ideas]
First, the implementation of the new method can substantially
improve both time and space efficiency in solving large-scale
MDPs, depending on the problem's structure. This is valuable,
since it is, presumably, the first implementation of this
approach to solving MDPs, and the efficiency gains can be
substantial.
Second, although it is somewhat ironic, the paper revealed
some inherent difficulties and limitations of their method
in achieving their goal of improved computational efficiency.
They faced the problem of exponential space usage and addressed
it with a space-time trade-off controlled by what they call a
"tuning knob". This is a compromise rather than a clean solution
for all cases, but the finding is important in that it reveals
a serious obstacle to using the novel method.
[Flaws in the paper]
First, in describing the effectiveness in terms of numbers
(i.e., they claim 30 times, 40 times, etc.), their claims are
based on a single type of problem. By the very nature of their
method, the effectiveness largely depends on the type and
structure of the problem.
If there are many identical sub-trees, their method can obviously
improve efficiency a great deal. On the other hand, some problems
can make their method extremely inefficient, depending on the
ordering of the variables and on how many states have unique values.
Therefore, to assess the effectiveness objectively in the general
case, they would have to characterize the problem domains much
more precisely.
What impressed me is not those numbers (30 times, 40 times, etc.)
but the way they improved the efficiency. It is clear enough that
the effectiveness depends on a few characteristics of the target
problem (i.e., how much repetition of equivalent subtrees there is,
the ordering of the variables, and the uniqueness of state values).
Speedups measured on a single type of problem have little
persuasive power.
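To make the sub-tree argument concrete, here is a minimal sketch
(in Python, not SPUDD's actual ADD package; all names are mine) of
how a decision diagram with a "unique table" stores identical
sub-trees only once, so a value function that ignores most state
variables collapses to a handful of nodes:

```python
# Illustrative sketch of ADD-style sharing, NOT the SPUDD implementation.
# A unique table maps each (variable, low-child, high-child) triple and
# each leaf value to a single node id, so equal sub-trees are shared.

unique = {}  # key -> node id

def leaf(value):
    """Terminal node holding a real value (ADDs allow non-0/1 leaves)."""
    return unique.setdefault(('leaf', value), len(unique))

def node(var, low, high):
    """Internal node; a test whose branches agree is redundant and skipped."""
    if low == high:
        return low
    return unique.setdefault((var, low, high), len(unique))

def build(variables, value_fn, assignment=()):
    """Build a reduced diagram for value_fn over boolean variables."""
    if not variables:
        return leaf(value_fn(assignment))
    lo = build(variables[1:], value_fn, assignment + (0,))
    hi = build(variables[1:], value_fn, assignment + (1,))
    return node(variables[0], lo, hi)

# A value function over 5 boolean variables (32 states) that depends
# only on the first variable: the diagram shrinks to 3 unique nodes
# (two leaves and one test), instead of a 32-entry table.
build(['x1', 'x2', 'x3', 'x4', 'x5'],
      lambda a: 10.0 if a[0] else 0.0)
print(len(unique))  # → 3
```

The same mechanism explains the failure mode: when every state has
a distinct value, nothing is shared and the diagram degenerates to
the full table.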
Second, their analysis of the effect of variable ordering is
insufficient (in fact, they did not analyze it at all). Such an
analysis must be done, since ordering is a critical factor in
the effectiveness of their method.
[Important open research questions]
First, variable ordering must be studied more thoroughly, both
to understand the mechanism by which it affects the efficiency
of their method and to find a way to automate optimal reordering,
since, as stated above, it is one of the critical factors that
determine the value of their method.
Second, the problem of potentially exponential space usage,
which they overcame with a kludge (the "tuning knob"), must be
addressed more seriously and solved in a clean way. Otherwise
their method cannot be a practical, general solution to
large-scale MDPs.