From: Julie Letchner (letchner_at_cs.washington.edu)
Date: Sun Dec 07 2003 - 12:13:21 PST
This paper presents and briefly analyzes a crossword-solving system that
makes use of probabilistic techniques for pattern matching/learning (clue
solving), maximization (solution generation), and constraint satisfaction
(grid fitting).
The paper's introduction to the crossword domain gives some neat insights
about the structure of crossword puzzles and what makes them difficult or
not--the intro was fun to read. The authors did a good analysis of their
problem domain before they developed their system, and the effort clearly
paid off in their ability to generate candidate solutions with a set of
well-chosen "expert" modules. I wish I'd done as good a job of analyzing
the robocode domain before designing an architecture for my controller. :)
The major weakness of this paper is its lack of context or
generalization. What is the purpose/utility of this work? Why do we care
if we can solve crossword puzzles? I don't immediately see any broader,
more useful applications for this work, and I think authors have a
responsibility to point such applications out if they want their work to be
taken seriously. Additionally, the large amount of domain-specific
knowledge built into the "expert" modules and the training on the CWDB
make it all the more important for the authors to show the reader why and
how any of this work is worthwhile.
Another, more methodological, weakness is that the paper rarely explains
the motivation behind the techniques it uses. Where did the three
scale/spread parameters in the candidate-list merging come from, and why
is exponentiation the right way to apply them? If there is a mathematical
basis for this choice, it would be useful to know; if the choice was
essentially arbitrary, that too would be worth stating, as an indicator
that future work in this area is desirable.
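To make the question concrete, here is a rough sketch of what I imagine
the merge step looks like. The function and parameter names are my own
inventions, and the combination rule (raising each module's confidences
to a per-module exponent, then renormalizing) is a guess at the general
idea, not a reproduction of the paper's actual formula.

    from collections import defaultdict

    def merge_candidate_lists(module_lists, scales):
        """Combine per-module candidate lists into one distribution.

        module_lists: {module_name: {candidate_word: confidence}}
        scales:       {module_name: exponent} -- a stand-in for the
                      paper's scale/spread parameters (my guess).
        """
        merged = defaultdict(float)
        for name, candidates in module_lists.items():
            scale = scales.get(name, 1.0)
            # Exponentiation sharpens (scale > 1) or flattens (scale < 1)
            # how strongly this module's ranking is trusted.
            rescaled = {w: c ** scale for w, c in candidates.items()}
            total = sum(rescaled.values()) or 1.0
            for word, conf in rescaled.items():
                merged[word] += conf / total
        norm = sum(merged.values()) or 1.0
        return {w: p / norm for w, p in merged.items()}

    # Toy example: two modules proposing answers for one clue.
    lists = {"clue_db":    {"ERIE": 0.9, "ISLE": 0.1},
             "dictionary": {"ERIE": 0.6, "ACRE": 0.4}}
    print(merge_candidate_lists(lists, {"clue_db": 2.0, "dictionary": 0.5}))

Even a sketch like this in the paper, plus a sentence on how the
exponents were chosen (fit on held-out puzzles? hand-tuned?), would have
answered most of my question.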
One potential question for future work is: What is it about the last 5% of
accuracy that makes it so difficult? Is it unsolvable clues? Is it the
fact that the target solution's letter distribution is far from the learned
prior, as in the "hawaiianmuumuu" example? I wonder if there is some other
paradigm to which the solver could switch to solve the last 5%--for
example, use a completely new prior on letter distribution that emphasizes
uncommon pairings in the English language. Then again, this is assuming
that the solver knows when it's incorrect, which may well not be the case.
Another direction I'd like to see this work take is applying the
architecture to more meaningful problems. The concept of using various
"expert modules" to find candidate solutions and then merging those
solutions probabilistically is an appealing general architecture,
especially given the abundance of data available on the internet. I'd
like to see future work that analyzes applications of this design to
broader problems.
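For what it's worth, the pattern I have in mind when I call this a
general architecture is roughly the sketch below: a common interface for
candidate-generating modules and a solver that pools their weighted
suggestions. The class and function names are hypothetical, not anything
from the paper.

    from abc import ABC, abstractmethod

    class ExpertModule(ABC):
        """One candidate generator; concrete experts could wrap a
        database, a web search, a dictionary, etc."""
        @abstractmethod
        def candidates(self, query):
            """Return {candidate_answer: confidence} for the query."""

    class LookupExpert(ExpertModule):
        """Toy expert backed by a static table of known answers."""
        def __init__(self, table):
            self.table = table
        def candidates(self, query):
            return self.table.get(query, {})

    def answer(query, experts):
        """Pool every expert's candidates into one normalized ranking."""
        pooled = {}
        for expert in experts:
            for cand, score in expert.candidates(query).items():
                pooled[cand] = pooled.get(cand, 0.0) + score
        total = sum(pooled.values()) or 1.0
        return sorted(((c, s / total) for c, s in pooled.items()),
                      key=lambda pair: pair[1], reverse=True)

    # e.g. answer("Great Lake", [LookupExpert({"Great Lake": {"ERIE": 1.0}})])

Nothing in this skeleton is crossword-specific, which is exactly why the
design seems worth carrying over to other domains.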
~*~*~*~*~*~*~*~*~*~*~
Julie Letchner
(206) 890-8733
University of Washington
Computer Science & Engineering
letchner_at_cs.washington.edu