paper 6

From: Lillie Kittredge (kittredl_at_u.washington.edu)
Date: Mon Nov 24 2003 - 09:49:57 PST

    /Acting Optimally in Partially Observable Stochastic Domains/ by
    Anthony R. Cassandra, Leslie Pack Kaelbling and Michael L. Littman

    The authors present a new algorithm for constructing optimal policies in
    partially observable MDPs.

    The first main idea is that a partially observable MDP can be recast
    as a regular MDP whose state space is made of beliefs (probability
    distributions over the underlying states). The problem with this
    is that such a space is continuous, which leads to their second main
    idea, the Witness algorithm for dealing with the space. The algorithm
    starts with a coarse-grained view of the space and iteratively refines
    it, approximating the optimal policy.
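
    To make the belief-space idea concrete, here is a minimal sketch of the
    Bayesian belief update that turns a POMDP into an MDP over beliefs. It
    is not the Witness algorithm, only the state-estimation step underneath
    it. The function name, the dictionary layout, and the numbers for the
    tiger example (a listen action that reports the correct side with
    probability 0.85) are assumptions made for illustration.

```python
def update_belief(belief, T, O, action, obs):
    """Return the new belief after taking `action` and seeing `obs`.

    Implements b'(s') proportional to O(obs | s', a) * sum_s T(s' | s, a) * b(s).
    belief: {state: probability}
    T:      {action: {state: {next_state: probability}}}
    O:      {action: {next_state: {observation: probability}}}
    All names and layouts here are hypothetical, chosen for this sketch.
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of landing in s_next before observing.
        predicted = sum(T[action][s][s_next] * belief[s] for s in states)
        # Correction step: weight by how likely the observation is there.
        unnormalized[s_next] = O[action][s_next][obs] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Illustrative tiger model: listening does not move the tiger,
# and the listener hears the correct side 85% of the time (assumed value).
T = {"listen": {
    "tiger-left": {"tiger-left": 1.0, "tiger-right": 0.0},
    "tiger-right": {"tiger-left": 0.0, "tiger-right": 1.0},
}}
O = {"listen": {
    "tiger-left": {"hear-left": 0.85, "hear-right": 0.15},
    "tiger-right": {"hear-left": 0.15, "hear-right": 0.85},
}}

b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
b1 = update_belief(b0, T, O, "listen", "hear-left")
b2 = update_belief(b1, T, O, "listen", "hear-left")
print(b1, b2)
```

    Each belief is a point in a continuous space (here, a single number
    between 0 and 1 for the probability that the tiger is on the left),
    which is why a policy over beliefs cannot simply be tabulated.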

    I'm not entirely clear what the original contribution of this paper is,
    as the Witness algorithm is credited to "Cassandra, Kaelbling & Littman,
    1994", which seems to be the authors themselves. Is this just a rehash
    of another of their own papers? That would explain the flaw everybody
    else is pointing out, that the algorithm is poorly explained. Other
    than that same flaw, I have no problem with this paper. I rather like
    the example with the tigers.

    Future research, as pointed out by the authors, will be to work on
    policy iteration rather than value iteration. They also mention using
    this to solve real-world problems - I'd be interested to see some actual
    physical example. Give me a robot using this algorithm, and then I'll
    be impressed.

