- The "Google" paper:
The Anatomy of a Large-Scale Hypertextual Web Search Engine,
Sergey Brin and Lawrence Page, Stanford University, 1998. [Req]
- How to implement PageRank efficiently [Req]
- Basic IR textbook
Modern Information Retrieval,
R. Baeza-Yates and B. Ribeiro-Neto, Addison Wesley, 1999.
Covers the vector space model (section 2), precision/recall (3), inverted
files (8), and inverted file compression (7.4.5).
- Discussion of Latent Semantic Indexing
How LSI Works
Visual introduction to principal components analysis (used in LSI)
- The authority and hubs model:
Authoritative Sources in a Hyperlinked Environment,
Jon Kleinberg, Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46 (1999). Also appears as IBM Research Report RJ 10076, May 1997.
- On the stability of PageRank and HITS and the connection to LSI:
Link Analysis, Eigenvectors and Stability,
A. Ng, A. Zheng, and M. Jordan. IJCAI-01.
Requires some linear algebra and math bravery, but very good.
- The "search engine"-related web site:
Search Engine Watch,
Danny Sullivan.
- Question Answering on the Web