From: jklink@u.washington.edu
Date: Mon Dec 06 2004 - 08:38:05 PST
PROVERB: The Probabilistic Cruciverbalist
Greg A. Keim, Noam M. Shazeer & Michael L. Littman (et al.)
One-line summary
The paper describes the construction of a CWP solver, using a variety of AI techniques, combined with high-performance hardware, in a highly modularized way.
Main ideas
The PROVERB system uses a highly modularized system for finding the optimal solution, given a set of clues. The clues are passed on to the modules, which return a list (possibly empty) of matching words (targets). Finding possible targets is done by using techniques of various sophistication, each suitable for different categories of clues.
The targets are return form each expert modules with an estimate of the likelihood (according to the modules beliefs) of the true solution being in its list, and also a probability score for each target to be the correct suggestion. Lists are composed in a variety of ways, ranging from database searches in word lists to information retrieval from online special-purpose dictionaries. These lists are then collected and re-weighted by a Merger, according to the estimated accuracy of each specific module finding the answer to the type of clue at hand.
Relying on high-performance hardware, the above solution is argued to have a fairly high degree (95.3% of the words correct) of correct solutions on a 370 puzzle sample from different sources, including highly regarded ones as the New York Times. These sample puzzles are also used for machine learning; to identify and solve puzzles humans consider challenging.
Flaws
A big flaw in the construction of PROVERB is that even though it uses advanced AI techniques (such as natural language processing, constraint satisfaction for example), it relies too heavily on the use of big and exhaustive searches in huge databases. The techniques described in this paper could be further refined, and by using more comparison to how humans reason while solving CWPs as a heuristic, the search can be further refined and made more effectively. Also, the true value at its fullest extent can be drawn from the more advanced AI techniques.
Another thing is that even though stated as a crucial factor, a description of the system and its hardware used is nowhere to be found. It is neither included in the discussion nor conclusions exactly why PROVERB performed so badly on the tournament/late week New York Times CWPs. A discussion on what to learn from these observations, and how to further improve the system, is absent in the paper.
Open questions and improvements
As noted in the Flaws section, one big question is how to further improve the performance of PROVERB on more advanced CWPs. Also, included in this optimization, is the above mentioned reduction in reliance on hardware for performing the costly searches needed to produce a solution.
Another part open for improvements in this implementation is the analogy with human reasoning while solving CWPs. What can be learned from how humans reason in solving the puzzles at hand, and how can we do it without the vast searches used in PROVERB? Can clues be further classified, to allow an even better prediction from expert modules, and be as finely tuned as the successful medical expert systems at classification? Or are humans reasoning in a totally different way when it comes to solving CWPs? The discussion on these questions are missing from the paper, and still open for research and further improvements and gains to the AI domain.
This archive was generated by hypermail 2.1.6 : Mon Dec 06 2004 - 08:38:06 PST