From: Jon Froehlich (jfroehli@cs.washington.edu)
Date: Mon Dec 06 2004 - 12:14:48 PST
1. Paper title/Author
Proverb: The Probabilistic Cruciverbalist by Greg A Keim, Noam M Shazeer,
Michael L Littman,
Sushant Agarwal, Catherine M Cheves, Joseph Fitzgerald, Jason Grosland, Fan
Jiang, Shannon Pollard, Karl Weinmeister
2. One-line Summary
Paper introduces a novel crossword solver called Proverb that averages over
95% words correct on sample puzzles and involves many artificial
intelligence techniques including state-space search, probabilistic
optimization, constraint satisfaction, information retrieval, machine
learning, and natural language processing
3. Two Most Important Ideas and Why
(i) This paper was published at AAAI 1999; at the time, only one other
solver existed, which was intended as a solving aid for British-style
crosswords (which differ from American-style) and averaged under 5% words
correct (compared with 95.3% with Proverb). The very idea, then, of a broad
coverage computer system for solving crossword puzzles was unique. This is
probably because, as the authors point out, a successful solution depends on
a wide range of data sources - many of which, like the internet, simply were
not available before (or, at least, not available in such richness). The
authors' ability to effectively integrate multiple artificial intelligence
techniques together with extensive and varied "knowledge" sources is
encouraging and has implications for other challenging AI problems.
(ii) The authors designed a decentralized architecture partitioned into four
components: the Coordinator, the Expert Modules, the Merger, and the Solver.
Though the authors did not fully justify their architectural choices
(perhaps because of space limitations), their design was well-suited for
integrating multiple information sources (e.g. WordNet, imdb, Encyclopedias,
etc.) and multiple artificial intelligence techniques (e.g. state-space
search, probabilistic optimization, constraint satisfaction, information
retrieval, machine learning). The success of their system bodes well for
complex problem tasks that require information from diverse sources. It
would be interesting to see if a system analogous to Proverb could be used
in a non-toy problem.
4. Flaws in Paper
(i) After reading such a well thought out and written paper, I was surprised
that the authors did not include a future work section. Based on the amount
of self-citations referencing previous work that they had done on Proverb (2
of the 7 citations), I expected at least a small inclusion of future
directions.
(ii) A follow-on to (i) above: as successful as Proverb was at solving
crossword puzzles (which was quite remarkable, particularly given the sheer
amount of work needed to program the Expert Modules), I was expecting a
discussion of general applicability of their techniques to the general AI
world. Specifically, what type of problem domains do they see amenable to
this type of solution (i.e. combining AI methods with large, diverse data
sources)?
(iii) Lastly, and perhaps most importantly, I thought this paper needed a
larger analysis section covering details such as: what problems were
particularly difficult for Proverb to solve? Did these problems have an
overlapping structure? What type of new Expert Modules (or other solutions)
might be needed to tackle these difficult clue-target combinations? Of
course, the authors did do a little of this in their analysis of the New
York Times Crossword Puzzles (e.g. Proverb averaged 95.5% words correct on
easier puzzles - Monday through Wednesday - and 85.0% correct on harder
puzzles). Their explication of the "hard" tournament puzzles was also
lacking - what accounted for the disparity between the tournament champion
and Proverb.
5. Two or Three Important Open Research Questions on Topic and Why They
Matter
(i) It would be interesting to investigate whether Proverb's techniques
could be applied to other knowledge type problems. Perhaps not on the holy
grail of AI scale (e.g. common sense type problems), but maybe in a limited
domain (something analogous to crossword puzzles) like elementary school
science, math, English workbooks.
(ii) Perhaps out of sheer curiosity I would be interested in seeing how
Proverb faired in all 26 years of American Crossword Puzzle Tournaments. The
year they tested Proverb (1998) there were 251 contestants, this past year
(2003) there were 495 contestants. Would Proverb fair better in any one of
those tournaments? And, if so, what aspects of the puzzle structure made it
possible for Proverb's scores to improve? Could this data be used to improve
Proverb's algorithms? Would advances in computer speed automatically
increase Proverb's rating (given that solving time was rewarded)?
(iii) If we were to reevaluate Proverb five years later, what other advances
in AI or information retrieval could benefit the efficacy of the system?
This archive was generated by hypermail 2.1.6 : Mon Dec 06 2004 - 12:14:54 PST