review

From: Lincoln Ritter (lritter_at_cs.washington.edu)
Date: Mon Nov 24 2003 - 00:02:32 PST


    Acting Optimally in Partially Observable Stochastic Domains
    Cassandra, Kaelbling, Littman
    Reviewed by Lincoln Ritter

    Summary:
    The authors reformulate POMDPs as MDPs over belief states and describe
    the "witness" algorithm, which performs value iteration to obtain
    optimal behavior in partially observable environments.

    The use of piecewise-linear representations of continuous-valued
    functions seems to be a key insight of this paper. This technique
    allows the authors to aim for approximations rather than the exact
    solutions pursued in previous work, and, as presented, it appears to
    account for a large part of the computational-efficiency gains
    reported in the paper.
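
    For concreteness, here is a minimal Python sketch (my own illustration,
    not code from the paper) of the two ingredients this rests on: Bayesian
    belief updating, and a value function represented as the maximum over a
    finite set of linear "alpha-vectors", which is exactly what makes it
    piecewise linear and convex over belief space. The array shapes are my
    own assumption.

        import numpy as np

        def belief_update(b, a, o, T, O):
            # T[a][s, s'] = P(s' | s, a);  O[a][s', o] = P(o | s', a)
            b_next = O[a][:, o] * (T[a].T @ b)  # P(o|s') * sum_s P(s'|s,a) b(s)
            return b_next / b_next.sum()        # renormalize to a distribution

        def value(b, alpha_vectors):
            # Piecewise-linear convex value function: V(b) = max_k <alpha_k, b>
            return max(float(np.dot(alpha, b)) for alpha in alpha_vectors)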

    The concept of the policy graph struck me as particularly interesting,
    especially in light of our recent reading on symbolic manipulation of
    state spaces. Here, the graph nodes represent aggregations of belief
    states, which leads to a simple, uniform treatment of the various
    kinds of actions within the policy. But it seems that the real lesson
    to be drawn, having seen this strategy be so effective in multiple
    areas, is that symbolic manipulation and representation is an exciting
    and useful technique.
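
    To make the policy-graph idea concrete, here is a small hypothetical
    sketch (again my own, not the authors' code) of a policy graph used as
    a finite-state controller: each node stores an action, and edges
    indexed by observation name the successor node, so acting requires no
    explicit belief update at run time. The node and observation names are
    invented for illustration.

        policy_graph = {
            "wait":   {"action": "listen",
                       "next": {"hear-left": "commit", "hear-right": "wait"}},
            "commit": {"action": "open-right",
                       "next": {"hear-left": "wait", "hear-right": "wait"}},
        }

        def act(node, get_observation, horizon):
            # Run the controller for `horizon` steps, yielding one action per step.
            for _ in range(horizon):
                yield policy_graph[node]["action"]
                node = policy_graph[node]["next"][get_observation()]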

    I have two major criticisms:

    The results of this paper are not presented in any quantitative way,
    and even the qualitative treatment is thin. Little formal analysis of
    the algorithms is given. How is a reader of this paper supposed to
    judge how important the ideas presented here are in a practical sense?

    Further, it was not clear to me while reading what work was actually
    new here. Perhaps I need to spend more time with the references, but
    at each point in the paper where a new idea was introduced, I was
    unsure whether it was built on the referenced work or was the authors'
    own contribution.

    I think an interesting avenue for research would be to apply some of
    the search techniques reviewed in last week's paper to POMDPs. Whether
    any speedup would be appreciable, given the amount of computation that
    must be devoted to computing conditional probabilities and solving
    linear programs regardless, is debatable, but the approach may still
    prove fruitful.

