Paper Review #4 - SPUDD - by Patrick Haluptzok - CSE 573 Fall 2003

From: Patrick Haluptzok (patrickh_at_windows.microsoft.com)
Date: Sun Nov 16 2003 - 23:44:36 PST

  • Next message: Lillie Kittredge: "SPUDD: Stochastic Planning using Decision Diagrams"

    Paper Review #4 - SPUDD - by Patrick Haluptzok - CSE 573 Fall 2003

     

    Paper title/Author:

     

    SPUDD: Stochastic Planning using Decision Diagrams

    by Jesse Holey, Robert St-Aubin, Alan Hu, Craig Boutilier

     

    One-line summary:

     

    This paper describes how algebraic decision diagrams (ADDs) are used to represent value functions in MDPs with state spaces that are too large to deal with by traditional value iteration methods.

     

    The (two) most important ideas in the paper, and why:

     

    MDPs are classically solved by explicit state space enumeration - but this is impractical as the number of states grows extremely large which occurs when a large number of features exist to specify the state. Since real world problems often have a large numbers of variables/features to describe the current state it is important to find a method to model the value function efficiently.

     

    ADDs are an alternative method for representing the value function and learning a policy that help overcome the curse of dimensionality. ADDs are a much more compact representation than the standard tabular form for a value function.

     

    The one or two largest flaws in the paper:

     

    Pet Peeve: The first sentence of the abstract uses the word "recently" twice in an awkward way - it makes me wonder how well the paper overall is put together.

     

    Section 4.1 I find very difficult to follow, even after reading it multiple times. I wish that part was longer and explained the approach better since it is the main point. I think the writer assumes that understanding the issues with SPI and tree-based value function representation are prerequisites to reading the paper, and I haven't developed that expertise over the weekend.

     

    At the end of section 4 the description for choosing the subsets is vague and important efficiency gains using variable elimination isn't done in their modified SPUDD algorithm in Figure 6 - but suggested for the reader to figure out.

     

    Identify two important, open research questions on the topic, and why they matter:

     

    It's not clear to me how well this method works in noisy stochastic environments, the values in a tabular representation will converge to their true value if trained appropriately, but its not clear to me that the ADD wouldn't degenerate to a very large mess because it would think that the difference in value wasn't due to noise, but a change in a variable value that is actually irrelevant. This is important because the real world is noisy and stochastic.

     

    I'm curious if just a plain Jane neural net could be used to represent the value functions, and if that wouldn't be more robust in the face of noise and handle "clustering" states that should have the same value together in a more compact size. This is important because efficient representation of a value function over the features that specify the state is critical to handling large real world problems.


  • Next message: Lillie Kittredge: "SPUDD: Stochastic Planning using Decision Diagrams"

    This archive was generated by hypermail 2.1.6 : Sun Nov 16 2003 - 23:44:38 PST