The following outline is a tentative list of the topics we hope to
cover.
- Introduction [2 classes]
- Networking fundamentals
- Foundational protocols: HTTP, HTML, browser architecture
- Server basics, cookies, log files, dynamic page generation
- Website management, N-tier architecture, scalability
- Information Retrieval [6 classes]
- Traditional approaches
- Ranking, TF-IDF, precision/recall, stemming, stop words
- Latent Semantic Indexing
- Web-oriented techniques
- Hypertext analysis (PageRank, hubs and authorities, anchor text)
- Spamming: keyword stuffing, doorway/jump pages, cloaking, font tricks
- Spider search strategy, macro structure of the Web
- Implementation and scale-up issues
- Index structures, stemming
- Boolean processing
- Summarization and snippets
- Question answering
- Personalization and Adaptive Systems [5 classes]
- Learning, classification, and data mining
- Clustering of search engine results
- Collaborative filtering, user modeling, adaptive websites
- Information extraction
- Topic-specific crawling
- Web Services [3 classes]
- Services: XML, SOAP, .NET
- Brokers, UDDI, WSDL
- Database and XML processing
- Metasearch, hidden web, query routing, and data integration
- The semantic web
- Special Topics [4 classes]
- Cryptography, security, privacy, P3P
- Micropayments, digital cash, server-side wallets, and e-commerce
- Peer-to-peer systems and protocols
- Viruses, worms, denial of service (DoS), and defenses