review 3

From: Katarzyna Wilamowska (kasiaw@washington.edu)
Date: Wed Dec 08 2004 - 11:56:10 PST


    Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL
    Peter D. Turney

     

    Summary

    The paper presents PMI-IR, a simple unsupervised learning algorithm for recognizing synonyms, and compares it against LSA on TOEFL synonym questions.

     

    Important Ideas

    The first important idea in this paper is that the Web is a large and valuable source of data that one can search through. Its size makes PMI-IR much less susceptible to sparse-data problems.

     

    I thought the point that PMI-IR is simpler than LSA was interesting. Because one can search such a large data source for an answer, a simpler program can get the same or better results.
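    To make the "simpler program" point concrete, here is a minimal sketch of the simplest PMI-IR score as I understand it: the number of hits for the problem word together with a choice, divided by the hits for the choice alone. The function names, the "AND" query syntax, and the hit counts are all assumptions for illustration; the counts are invented and stand in for a real search engine.

```python
from typing import Callable, Dict, List


def pmi_ir_score(problem: str, choice: str, hits: Callable[[str], int]) -> float:
    """Co-occurrence hits divided by hits for the choice alone."""
    denominator = hits(choice)
    if denominator == 0:
        # A choice the engine has never seen cannot be scored.
        return 0.0
    return hits(f"{problem} AND {choice}") / denominator


def best_choice(problem: str, choices: List[str], hits: Callable[[str], int]) -> str:
    """Pick the choice with the highest score for the problem word."""
    return max(choices, key=lambda c: pmi_ir_score(problem, c, hits))


# Invented hit counts (NOT real data), keyed by query string.
TOY_HITS: Dict[str, int] = {
    "large": 800,
    "green": 900,
    "big AND large": 200,
    "big AND green": 45,
}


def toy_hits(query: str) -> int:
    """Hypothetical stand-in for a search engine's hit count."""
    return TOY_HITS.get(query, 0)


if __name__ == "__main__":
    print(best_choice("big", ["large", "green"], toy_hits))
```

    The whole method is a handful of hit-count lookups and one division per choice, which is where the simplicity relative to LSA comes from.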

     

    The different types of IR methods were cool. I didn't think of the antonym problem until I got to score3.
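    As a rough sketch of how score3 deals with antonyms, as I read it: both the numerator and denominator queries exclude pages where the words appear near "not", since antonyms tend to show up in phrases like "big, not small". The helper below only builds the two query strings; the function name is hypothetical, and the NEAR/AND NOT syntax assumes a search engine that supports those operators.

```python
from typing import Tuple


def score3_queries(problem: str, choice: str) -> Tuple[str, str]:
    """Build (numerator, denominator) query strings for an antonym-aware score.

    The score itself would be hits(numerator) / hits(denominator).
    """
    # Co-occurrence near each other, excluding contexts where either word is near "not".
    numerator = (
        f'({problem} NEAR {choice}) AND NOT (({problem} OR {choice}) NEAR "not")'
    )
    # Occurrences of the choice alone, with the same "not" exclusion.
    denominator = f'{choice} AND NOT ({choice} NEAR "not")'
    return numerator, denominator


if __name__ == "__main__":
    num, den = score3_queries("big", "small")
    print(num)
    print(den)
```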

     

    Flaws

    Lack of experimentation. It would be nice to know whether chunk size really does matter.

     

    In the introduction the author hinted at "the expressive power of the search engine's query language" but didn't talk about it after that.

     

    Questions

    Experiment: LSA vs. PMI-IR with same chunk-size

    Experiment: LSA vs. PMI-IR with limited document number

    Experiment: LSA vs. PMI-IR with same chunk-size and limited document number.

    Increasing the performance of PMI-IR with a different IR method.

     


