From: Julie Letchner (letchner_at_cs.washington.edu)
Date: Sun Nov 16 2003 - 23:23:25 PST
This paper presents a representation (ADDs) of MDPs in which decision trees
are represented as algebraic formulas whose solutions are real-valued; it
also presents an algorithm, SPUDD, for calculating an optimal policy
efficiently by leveraging the ADD representation.
I think the main insight in this paper is the idea that decision tree
outcomes can be real-valued and the branches represented algebraically,
instead of in the traditional binary tabular format. This idea seems quite
powerful, and one can hope that it will inspire other non-traditional
representations with structures conducive to efficient policy
generation. The authors' optimizations, on the other hand, don't seem
particularly inspired; they seem to be basic observations on the well-known
tradeoff between memory usage and computation time.
Overall I think the evaluation in this paper is quite weak. I'm not sure
what the authors mean by "the algorithms do not lend themselves easily to
comparisons of running times, since implementation details could the
results;" since implementations of the algorithms are what users will be
concerned with, it seems to me that evaluation of the running times of
these implementations is appropriate for discussion. Empirical running
times are given in Table 1, but accompanying explanations would be nice
particularly as all of the given times are for the same type (Factory) of
problem.
The idea of exploiting structural independences in MDP structure seems to
be a very powerful one; however, an evaluation of how often these
independences arise in "typical" problems would be useful in determining
how excited we should get about algorithms that exploit them. From this
paper alone, it is unclear to me how useful such techniques are, although
on an intuitive level it seems that many problems will exhibit a reasonable
degree of independences. As the paper suggests, other open research areas
include work on choosing optimal variable orderings during iteration and
developing new policy generation algorithms that leverage the ADD structure.
~*~*~*~*~*~*~*~*~*~*~
Julie Letchner
(206) 890-8733
University of Washington
Computer Science & Engineering
letchner_at_cs.washington.edu
This archive was generated by hypermail 2.1.6 : Sun Nov 16 2003 - 23:23:47 PST