From: Gaurav Bhaya (gbhaya@cs.washington.edu)
Date: Sun Dec 05 2004 - 10:59:56 PST
PROVERB: The Probabilistic Cruciverbalist
Greg A. Keim et al
------------------------------------------------------------------------
-- One Line Summary: This paper describes PROVERB, a system to solve crossword puzzles using "expert" modules which achieves about 95% accuracy on daily puzzles such as those in the New York Times. Important Ideas: There are many useful ideas presented in this paper. Some of them are as follow: - Expert modules can be used to obtain a possible candidate set of words for a particular clue. The authors built expert modules contributing words from various different knowledge bases such as the geographic knowledge base, the movie database built from IMDB etc. Some of the expert modules guess words while some others return synonyms of the given clue. - The observation that most of the clues were repeated or the target words were repeated seems like an interesting one. The authors showed that a module which learnt past crosswords was indeed important in solving puzzles. - Just the idea of expert modules is not sufficient to solve the crossword puzzle. The authors showed in their experiments that the placement of these words was equally important. Their performance degraded substantially without the placement module. Flaws: The paper, I believe, does not have many flaws although it opens leaves a lot of questions unanswered for further research. - The paper provides an intuition for the need of learning old crossword puzzles by showing that the probability of repeat is substantial. Even though the authors measure the probability of target and a probability of clue-target pair, and important characteristic to measure given the fact that clue and clue-target curves are different would be: Clue - Target length. This (if not substantially different from Clue - Target itself) would suggest that when the clue - target length matches an old clue we know the answer with high certainty. - The paper does not explain the logic behind choosing alphabet probability in placement of words as opposed to say, choosing an order for layout of words based on the certainty of expert module etc. - The paper states that the computation time was of the order of 15 minutes. However, since the operation is computationally dependent, it would be good to know the specifications and the resource usage by the system. Open research question: - Can this be made quicker and more accurate? - Does every expert module need to be involved most of the time. Can't the clue provide a hint to assert the likelihood of the target to be found in a particular expert module. This would reduce the computation time by an order of magnitude. - Can this idea be applied to other problems of AI other than for solving crossword puzzle. Answering questions posed in natural language may be an interesting area. - Can this be used to solve cryptic clues as well. Can it be used to solve crossword puzzles in which every alphabet need to be a part of both a down clue and an across clue. The reason this is important is that the above constraints make it less likely for the crossword solver to make mistakes since it is tightly constrained.
This archive was generated by hypermail 2.1.6 : Sun Dec 05 2004 - 11:00:12 PST