Paper Review

From: Xu Miao (xm_at_u.washington.edu)
Date: Mon Nov 24 2003 - 10:50:05 PST

  • Next message: Lucas Kreger-Stickles: "Acting Optimally in Partially Observable Stochastic Domains - Cassandra et. al."

    Title: Acting Optimally in Partially Observable Stochastic Domains
    Authors: Anthony R. Cassandra, Leslie Pack Kaelbling and Michael L. Littman

    Summary: This paper describes the POMDP model and a new algorithm for
    finding approximately optimal solutions.

    Ideas:
            1. The paper adds partial observability to the MDP model and
    uses a piecewise-linear convex function to represent the value function. It
    then develops the Witness algorithm, which finds an approximate solution
    that can be made arbitrarily close to the optimal value function computed
    by value iteration.
            2. After value iteration, the authors use a policy graph to
    represent the policy. The graph is generated from the partition of the
    belief space defined by the solution for the value function.
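
    To make the two ideas above concrete, here is a minimal sketch (not the
    authors' code) of a piecewise-linear convex value function stored as a set
    of vectors, and of the belief-space partition it induces. The two-state
    problem, the vectors, and the action names are hypothetical and chosen only
    for illustration.

```python
# Hypothetical two-state example: each entry pairs a linear piece of the
# value function (one coefficient per state) with the action it stands for.
# The numbers and action names are assumptions, not taken from the paper.
ALPHA_VECTORS = [
    ([1.0, 0.0], "act_A"),
    ([0.0, 1.0], "act_B"),
    ([0.6, 0.6], "gather_info"),
]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def value(belief):
    """Piecewise-linear convex value: the maximum over all linear pieces."""
    return max(dot(alpha, belief) for alpha, _ in ALPHA_VECTORS)


def region(belief):
    """Index of the maximizing vector, i.e. the partition cell containing belief."""
    return max(range(len(ALPHA_VECTORS)),
               key=lambda i: dot(ALPHA_VECTORS[i][0], belief))


def best_action(belief):
    """Action attached to the partition cell that contains the belief."""
    return ALPHA_VECTORS[region(belief)][1]
```

    Each belief falls into the cell of whichever vector is maximal there, so
    the cells partition the belief space. A policy graph in this spirit would
    have one node per cell; the transitions between nodes are not sketched here.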

    Flaws:
            1. The idea of partitioning the belief space is very impressive, but
    the paper would be more complete if the authors gave a brief proof and the
    detailed construction algorithm.
            2. The results section is too short and not convincing: there is
    too little description of the kinds of problems they solved, and there is
    no comparison.

    Open research:
            1. Adding a policy iteration algorithm. Although the policy graph is
    simple and effective on some small problems, it could be very large on
    bigger ones, so a policy iteration algorithm might solve such problems
    more effectively.
            2. Finding approximately optimal policies by searching only part of
    the space, for example with the methods of Dean et al. mentioned by the
    authors, or LAO*.

            



    This archive was generated by hypermail 2.1.6 : Mon Nov 24 2003 - 10:49:22 PST