Search Engines
No one controls what’s published on
the WWW ... it is totally decentralized
To find out what is published, search engines crawl the Web
* Two parts
  * Crawler visits Web pages, building an index of the content
  * Query processor checks user requests against the index, reports on known pages
Only a fraction of the Web’s content is crawled
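
The two parts can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the "Web" here is a hypothetical in-memory dictionary (`TOY_WEB`), the URLs are made up, and the names `crawl` and `query` are assumptions chosen for this example. The crawler follows links from a seed page and builds an inverted index (word to set of pages); the query processor looks up request words in that index.

```python
from collections import deque

# Hypothetical toy "Web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("welcome to the home page",
                       ["a.example/news", "b.example/blog"]),
    "a.example/news": ("news about search engines", ["a.example/home"]),
    "b.example/blog": ("blog about web search", []),
    # No page links here, so a link-following crawler never finds it.
    "c.example/hidden": ("nobody links to this search page", []),
}


def crawl(seed, web):
    """Visit pages reachable from seed; return an index of word -> set of URLs."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to index
        text, links = web[url]
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def query(index, request):
    """Return the known pages that contain every word of the request."""
    words = request.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = crawl("a.example/home", TOY_WEB)
print(sorted(query(index, "search")))
# prints ['a.example/news', 'b.example/blog']
```

Note that `c.example/hidden` contains the word "search" but is never reported: the query processor only knows about pages the crawler reached, which is one reason only a fraction of the Web's content ends up in the index.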