University of Washington Department of Computer Science & Engineering
 CSE454 Project Part 5: PageRank and Final Writeup

Administrivia

Due Date: Tuesday, Dec 10, 12:00 noon.

Project Specifics

You should have a complete and functional search engine along with a detailed write-up of how it works and what you learned.

What to Hand In

Hand in the URL of a top-level web page that lists your team name and contact information for each member. The page should clearly explain how to start your server and run searches. Your index and repository should be ready for use. Be sure to also turn in a printed version of your project description.

Remember, you will be graded both on the quality of the artifact you have built and on how well it is described (the write-up is worth 25% of the grade). Make sure your explanations are clear and your experimental results are presented in a way that is easy to understand. Clarity matters more than length, and careful proofreading is important (it may help to have a friend look over your work for grammar, structure, organization, etc.). Your web page(s) should cover:

  • High level description of how your search engine works
    • Important components of your search engine/indexer (as described in the Google paper)
    • Diagram to show important classes and the flow (similar to papers from class)
    • Who did what
  • Implementation details for efficiency and use of memory/disk
    • Describe relevant/interesting data structures and how big they are as a function of vocabulary size, number of documents downloaded, etc.
    • Discuss which data structures reside in memory and which on disk (and state your assumptions about the size of memory)
    • Number of disk accesses for major API operations (big-O thinking is sufficient)
  • Statistics/Experimental results (e.g. speed of indexer, pages indexed, search speed, etc.)
    • Think hard about what interesting questions you can ask about your design, and include at least one that you explore and answer
    • Tables/Graphs to show results (graphs are best if you can think of a good format; see examples from class readings)
    • Clearly present results and comment on what they mean
  • Discussion
    • Extra features (and implementation if relevant/interesting)
    • What you are particularly proud of
    • Problems encountered (e.g. inaccurate parser, memory, quick sorting, etc.)
    • What you would improve if you had more time (future work)
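
For the statistics item in the list above, one possible way to collect search-speed numbers for a results table is sketched below. This is only an illustration, not a required method: the `search` callable, the toy in-memory index, and the choice of median and 95th-percentile latency are all assumptions, and you would substitute your own engine's search entry point and the measurements that best answer your question.

```python
import statistics
import time


def time_queries(search, queries, repeats=5):
    """Return latency samples in milliseconds, one per (query, repeat).

    `search` is any callable that takes a query string. It is a
    hypothetical stand-in for your engine's real search function.
    """
    samples = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Reduce latency samples to the figures a results table needs."""
    if not samples:
        raise ValueError("no samples to summarize")
    ordered = sorted(samples)
    # Nearest-rank 95th percentile, using integer arithmetic for the ceiling.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "n": len(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
    }


def format_row(label, summary):
    """Format one fixed-width row of a plain-text results table."""
    return (
        f"{label:<16}{summary['n']:>8}"
        f"{summary['median_ms']:>12.3f}"
        f"{summary['p95_ms']:>12.3f}"
        f"{summary['max_ms']:>12.3f}"
    )


if __name__ == "__main__":
    # Toy index used only so the sketch runs on its own.
    toy_index = {"steam": [1, 4], "turing": [2], "machine": [2, 3]}
    results = summarize(
        time_queries(lambda q: toy_index.get(q, []), list(toy_index))
    )
    print(f"{'configuration':<16}{'n':>8}{'median ms':>12}{'p95 ms':>12}{'max ms':>12}")
    print(format_row("toy index", results))
```

Running the same measurement for two configurations of your engine (for example, with and without a cache, if you have one) gives rows that can be compared directly, which is one way to turn a design question into a table or graph.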

Note: If you get stuck or can't complete every part of the assignment, do as much as you can. If you try an ambitious method for information extraction, we understand you may not have as much time for other parts and will take this into account. Partial credit (and extra credit!) will definitely be awarded. If a bug or software glitch gets you, let us know as soon as possible (we'll give credit for finding these, and even more credit for finding a solution or workaround) - but keep working on other parts of the assignment.

Good luck!

