(no subject)

From: Ravi Kiran (kiran@cs.washington.edu)
Date: Sun Dec 05 2004 - 20:04:24 PST

  • Next message: Indriyati Atmosukarto: "PROVERB review"

                            "Proverb: The Probabilistic Cruciverbalist"
                    Greg.A.Keim, Noam.M.Shazeer, Michael L.Littman and others

    Summary:

    The authors attack the problem of solving crossword puzzles with a system
    consisting of "expert modules" that specialize in solving specific kinds
    of clues, drawing on ideas from information retrieval, database search,
    and machine learning and speculate that perhaps the time is ripe for using
    the now cheaply available computers with vast memory and disk storage to
    attack other problems previously deemed to be unsolvable in AI.

    Two primary ideas:

    This system demonstrates that complete systems in AI would probably
    involve ideas from more than one sub-field of AI merged in an intelligent
    manner. Also, the decentralized design was particularly helpful in
    assessing the performance of individual modules across the ensemble of
    puzzles.

    The idea of establishing probabilistic distributions over possible
    solutions enables merging of modules using information-theoretic
    techniques. This is important given the variability in the size of
    candidate lists and the differing nature of candidates returned by the
    individual modules.

    One or two largest flaws:

            The design and the architecture of the system are all geared towards solving
    American crossword puzzles. While this is a significant achievement, there were no
    experiments on applying the techniques developed here for puzzles with cryptic clues.
    The system uses clue-target pairs from previous puzzles. Adding the pairs involving
    cryptic clues would have been straightforward. In the very least, the authors could have
    evaluated the reasons for its poor performance, if at all, on cryptic clues.

            The authors claim to have used various machine learning and
    information retrieval techniques. However, expecting to see 91% of
    targets, 50% of clues, 34% of clue-target pairs and 96% of words hardly
    leaves any space for 'learning' or `thinking' on the part of the system.
    In other words, the inference problem has been made extremely simple by
    incorporating a huge amount of prior knowledge. In such a situation, using
    some of the machine learning and information retrieval techniques would
    probably be akin to cutting nails using a lathe machine. :-)

    Two important, open research questions on the topic and why they matter:

            As commented before, the solver would generate a greater deal of
    interest if it could be extended to solve cryptic puzzles. Humans tend to
    solve cryptic puzzles because they go beyond what is given in the clue. It
    would be interesting to ask humans who routinely solve cryptic crosswords
    about the techniques used to solve them and explore methods of
    incorporating this as a 'semantic module' in the architecture of Proverb.

            The probabilistic mechanism given in the paper provides a good
    mechanism to combine information from various modules in order to arrive
    at the correct target. However, the probability measures obtained can
    also be used to determine the extent to which a particular inference
    mechanism produces incorrect candidates. This would help in improving the
    performance of the modules.

            The paper compares the performance of the solver for puzzles from ACPT. Each
    of the puzzle has a time limit. An analysis of the time taken to solve the difficult
    puzzles and specifically, the clue-target pairs which consumed most of the time
    ( in arriving at an answer - correct or incorrect ), would be valuable in modifying
    the design of the solver to attempt such 'time-consumers' with greater speed and accuracy.


  • Next message: Indriyati Atmosukarto: "PROVERB review"

    This archive was generated by hypermail 2.1.6 : Sun Dec 05 2004 - 20:04:24 PST