From: Bhushan Mandhani (bhushan_at_cs.washington.edu)
Date: Mon Dec 08 2003 - 12:25:46 PST
Summary: The paper presents the design and performance of Proverb, a
crossword-solving system.
Main Ideas:
1. Crossword solving is now a tractable AI problem. However, Proverb comes
across as a fairly complex system built by a large number of people over a
period of time.
2. The system has a nice decoupled architecture. There are several
independent modules which return a probability-weighted list of candidates
for each word slot. These lists are then merged, and the problem reduces
to filling in each word slot from its candidate list, so as to maximize
the expected number of correct words.
3. The very nature of the problem requires several expert
candidate-generating modules based on dictionaries, thesauri, the CWDB,
IR, topical databases, etc. It is interesting to see all these components
combined successfully into a well-performing system.
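
To make idea 2 concrete, here is a minimal sketch of the merge-then-fill
view. It is not the paper's merging method: the per-module weights, the
function names and the toy candidate lists are all hypothetical, and the
crossing constraints between slots (the hard part of the fill) are
ignored. It only shows that, by linearity of expectation, the expected
number of correct words for a fill is the sum of the probabilities of
the words chosen for each slot.

```python
from collections import defaultdict


def merge_candidates(module_lists, module_weights):
    """Mix per-module candidate lists into one normalized distribution.

    module_lists:   {module name: {candidate word: probability weight}}
    module_weights: {module name: trust in that module} (an assumption,
                    not the paper's scale/length-scale/spread scheme)
    """
    merged = defaultdict(float)
    for name, candidates in module_lists.items():
        total = sum(candidates.values())
        if total == 0:
            continue  # module returned nothing useful for this slot
        for word, p in candidates.items():
            merged[word] += module_weights[name] * p / total
    norm = sum(merged.values())
    return {w: p / norm for w, p in merged.items()} if norm else {}


def expected_correct(fill, slot_distributions):
    """Expected number of correct words for a fill {slot: word}."""
    return sum(slot_distributions[slot].get(word, 0.0)
               for slot, word in fill.items())


# Toy data for a single four-letter slot (made up for illustration).
module_lists = {
    "dictionary": {"OREO": 0.5, "OLEO": 0.5},
    "cwdb": {"OREO": 0.75, "OSLO": 0.25},
}
module_weights = {"dictionary": 1.0, "cwdb": 3.0}

merged = merge_candidates(module_lists, module_weights)
best_word = max(merged, key=merged.get)
score = expected_correct({"1A": best_word}, {"1A": merged})
```

A real solver would have to maximize this sum subject to crossing words
agreeing on their shared letters, which the sketch leaves out.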
Flaws:
1. The candidate lists returned from the different modules were merged
using a method parameterized by scale, length-scale and spread values.
Little explanation or motivation was given for this method, even though
it is an important part of the system architecture.
2. It is clear that the good performance of the system is largely due to
its use of a large amount of domain-specific information (the CWDB). Given
the CWDB, the expected novelty for a clue-target pair is 66% (which means
34%, or about a third, of the targets for a new puzzle can simply be read
off the CWDB). The system seems to be tailored to perform well on the
kinds of crosswords it was evaluated on.
Future Work:
1. More exploration of methods to effectively combine the candidate lists
returned by the individual modules. This problem seems amenable to a
machine learning approach since during training we know which module was
the best predictor for a given target.
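
One very simple way to act on this suggestion is sketched below, purely
as an illustration. It assumes training data is available as pairs of
(per-module candidate lists, correct target); the function name, the
smoothing parameter and the toy examples are hypothetical and do not
come from the paper. Each module is weighted by how often it gave the
correct target the highest probability.

```python
from collections import Counter


def learn_module_weights(training_examples, smoothing=1.0):
    """Weight each module by how often it was the best predictor.

    training_examples: iterable of (module_lists, target), where
    module_lists is {module name: {candidate word: probability}}.
    The add-one style smoothing keeps rarely winning modules above zero.
    """
    modules = set()
    wins = Counter()
    for module_lists, target in training_examples:
        modules.update(module_lists)
        # Module that assigned the highest probability to the true target.
        best = max(module_lists,
                   key=lambda m: module_lists[m].get(target, 0.0))
        if module_lists[best].get(target, 0.0) > 0.0:
            wins[best] += 1  # count a win only if the target was proposed
    total = sum(wins.values()) + smoothing * len(modules)
    return {m: (wins[m] + smoothing) / total for m in sorted(modules)}


# Toy training set (made up for illustration).
training = [
    ({"dictionary": {"OREO": 0.2}, "cwdb": {"OREO": 0.9}}, "OREO"),
    ({"dictionary": {"AREA": 0.6}, "cwdb": {"AREA": 0.3}}, "AREA"),
    ({"dictionary": {"ERIE": 0.1}, "cwdb": {"ERIE": 0.8}}, "ERIE"),
]
weights = learn_module_weights(training)
```

A learned model could go further and condition the weights on features
of the clue, such as its length or the target length, rather than using
one global weight per module.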