University of Washington Department of Computer Science & Engineering
 CSE454 Course Overview
    The following outline is a tentative list of the topics we hope to cover. The actual ordering will differ so that crawling and search-engine design come earlier.
  • Introduction
  • Text Processing
    • Classification
      • Naive Bayes Classifier
      • Information Extraction
      • Hidden Markov Models
      • Conditional Random Fields
    • Similarity Measures & Information Retrieval
      • Ranking, TF/IDF, precision / recall, stemming, stop words
      • Latent Semantic Indexing
    • Clustering
      • Expectation Maximization
    • Syntactic Analysis
      • POS Tagging
      • Anaphora
      • Parsing
  • The Web
    • Foundations
      • HTTP, HTML, browser architecture
      • Server basics, cookies, log files, dynamic page generation
      • Web Programming: AJAX, FLEX, Silverlight
    • Fetching Pages, Spidering & Topic Specific Crawling
    • Web Ranking Techniques
      • Hypertext analysis (PageRank, hubs and authorities, anchor text)
      • Spamming: keyword stuffing, doorway/jump pages, cloaking, font tricks
    • Data Structures for Scaling Query Processing
      • Index structures
      • Boolean processing
    • Information Extraction from the Web (KnowItAll)
    • Interface Issues
      • Summarization and snippets
      • Clustering results
      • Collaborative filtering, user modeling, adaptive websites
  • Special Topics
    • Advertising
    • Meta-search, query routing
    • The Semantic Web, Semantic e-mail
    • Cryptography, security, privacy
    • Micropayments, digital cash, server-side wallets, and e-commerce
    • Scaling and clusters

     


Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX