Homework #3
out: Monday May 13th, 2012
due: Monday June 4th, 2012 by 9:00pm.
[
summary |
part a |
part b |
part c |
bonus: part d |
how to submit |
grading ]
In homework #3, you will finish an implementation of a multithreaded
Web server. There are three parts to homework #3. In part A, you
will read through and finish our implementation of several low-level
utilities that the web server will make use of. In part B, you will
implement the web server itself, including the ability to serve
files from the local file system. In part C, you will integrate
your Twitter search and Twitter word cloud routines from homework
#2, providing clients with a web form for searching Twitter.
Here is the code you should download and modify for this
assignment. As before, we've provided you with our hw1 and hw2
libraries, in case yours have bugs in them. Feel free, though, to
replace those libraries with your own libhw1.a and libhw2.a if you
want to build on top of your solutions rather than ours.
A Web server is fairly complex and depends upon a number of
lower-level abstractions. In this part of the homework, you will
read through the code of some of the abstractions that we built for
you, and you will build several of your own:
- Take a look at ThreadPool.h and ThreadPool.cc. This class
manages a pool of threads and a queue of tasks. When there is an
available thread in the pool and at least one task on the queue,
the next available thread picks up a task and performs it. If all
threads are busy performing tasks, then newly added tasks will
queue up waiting for a thread to finish. We'll use this
thread pool implementation to dispatch incoming connections, so
that our server can process multiple requests concurrently.
- Read through HttpUtils.h. This file defines a collection of
helper routines, many of which you need to implement. Next, read
through HttpUtils.cc and notice which of the routines are missing
(search for STEP XXX, as usual). Implement the missing routines,
and make sure that you pass the associated set of unit tests.
Note that if the documentation in HttpUtils.h isn't enough for
you to determine how to implement something, the unit tests also
serve as useful documentation!
- Read through ServerSocket.h. This file defines convenience
routines for opening up a listening socket and accepting an
incoming connection from a client. These routines should be
pretty familiar to you from the lecture coding and exercises.
Next, read through ServerSocket.cc and implement the missing code;
you should feel free to cut and paste liberally from our coding
examples in lecture. Make sure you pass the associated unit test.
- Read through FileReader.h. This file defines a convenience
routine for reading the contents of a file into memory. Next,
read through FileReader.cc and implement the missing code. Make
sure you pass the associated unit test.
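As a rough illustration of the kind of routine FileReader.cc asks
for, here is a minimal sketch of reading an entire file into a
std::string using POSIX open() and read(). The name
ReadFileToString and its signature are hypothetical, chosen only
for this example; they are not the interface declared in
FileReader.h, so treat this as a starting point rather than the
expected solution.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

#include <string>

// Hypothetical helper (NOT the FileReader.h interface): reads the whole
// file at `path` into `*contents`. Returns false if the file cannot be
// opened or a read fails.
bool ReadFileToString(const std::string& path, std::string* contents) {
  assert(contents != nullptr);

  int fd = open(path.c_str(), O_RDONLY);
  if (fd == -1) return false;

  contents->clear();
  char buf[4096];
  while (true) {
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n == 0) break;  // end of file
    if (n == -1) {
      if (errno == EINTR) continue;  // interrupted by a signal: retry
      close(fd);
      return false;
    }
    // read() may return fewer bytes than requested, so append exactly n.
    contents->append(buf, static_cast<size_t>(n));
  }

  close(fd);
  return true;
}
```

A caller would write something like
ReadFileToString("./hw3_htmldir/index.html", &body) and check the
return value before using body. Note that read() can return fewer
bytes than requested, which is why the loop continues until it sees
end of file.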
We have provided you with a fairly complete framework for a Web
server. Your job in this part of the assignment is to finish that
implementation, and to get your Web server to the point where it is
able to parse requests for files, read those files into memory, and
respond with the file contents.
- Start by reading through and understanding the code in
HttpRequest.h. This file (which does not have an associated .cc
file!) defines a class that represents an HTTP request. As you
have already learned, a basic HTTP request is fairly simple: it
contains a first line that specifies the URL the client is
requesting, and then it contains a sequence of lines that contain
"header" information provided by the browser. HttpRequest.h
doesn't contain code for parsing requests, but rather just
represents a fully parsed request.
- Next, read through and understand the code in HttpResponse.h.
This file defines a class that represents an HTTP response, and
also contains a method called GenerateResponseString() that
generates the text of an HTTP response based on the other fields
in the class. Callers build up the fields in the HttpResponse
structure, then invoke GenerateResponseString() to generate a
formatted HTTP response ready for writing to the client.
- Now, read through HttpConnection.h. Given a file descriptor
representing an active connection to a client, this class has two
methods. The first is responsible for reading data from the
socket, buffering that data in the "buffer_" instance variable,
detecting when a full HTTP request has been received, parsing that
request into an HttpRequest structure, and returning that through
an output parameter. The second is responsible for writing an
HttpResponse back to the client.
Read through HttpConnection.cc. This class is largely
unimplemented: implementing it is the largest piece of work you'll
do in this assignment. You should probably design some helper
private methods to get the job done (which will mean editing
HttpConnection.h as well). We recommend you follow the steps
in the comments. Once you're done, make sure you pass the
associated unit test.
- Read through HttpServer.h and HttpServer.cc. This class
implements the Web server itself, making use of all of the
building blocks you've implemented so far. The tricky part of
this class is how it uses the thread pool -- we've done that part
for you. Implement the missing piece inside HttpServer.cc,
namely the routine that tests a URL to see if it starts with the
substring "/static", and if so, extracts a filename from the
remaining part of the URL, reads that file into memory, and
builds an HTTP response from it.
- Test your web server by launching it and interacting with
it via your browser. For example, if you are on attu4.cs, pick
a port number to start your server on, such as 5488. (Don't pick
that one, or you'll collide with everybody else.) Then, give
this command to launch the server, reading files from the
hw3_htmldir/ directory:
./http333d 5488 ./hw3_htmldir
Next, launch your browser. You should be able to connect to
the following URL, replacing the port number with whichever
one you picked, and replacing attu4.cs.washington.edu with
the hostname of wherever you launched your server:
http://attu4.cs.washington.edu:5488/static/bikeapalooza_2011/index.html
Click around and make sure the gallery works as expected. If
not, figure out why not and fix it!
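To make the request format described in this part concrete, here
is a minimal sketch of parsing one fully buffered HTTP request: a
first line naming the URL, followed by "Name: value" header lines
and a terminating blank line. The ParsedRequest struct and the
ParseRequest and Trim functions are hypothetical names invented
for this example; they are not the HttpRequest class or the
HttpConnection methods from the starter code, and the sketch
assumes lines end in "\r\n".

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for a parsed request (NOT the HttpRequest class).
struct ParsedRequest {
  std::string method;
  std::string uri;
  std::map<std::string, std::string> headers;  // header names lowercased
};

// Strips leading and trailing spaces and tabs.
static std::string Trim(const std::string& s) {
  size_t b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  size_t e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Parses `text` into `*out`. Returns true only if a complete request
// (request line, headers, blank line) was found; false if the text is
// malformed or the terminating blank line has not arrived yet.
bool ParseRequest(const std::string& text, ParsedRequest* out) {
  assert(out != nullptr);
  size_t pos = 0;
  bool first = true;
  while (pos < text.size()) {
    size_t eol = text.find("\r\n", pos);
    if (eol == std::string::npos) return false;  // incomplete line
    std::string line = text.substr(pos, eol - pos);
    pos = eol + 2;

    if (first) {
      // Request line: METHOD SP URI SP VERSION
      size_t sp1 = line.find(' ');
      if (sp1 == std::string::npos) return false;
      size_t sp2 = line.find(' ', sp1 + 1);
      if (sp2 == std::string::npos) return false;
      out->method = line.substr(0, sp1);
      out->uri = line.substr(sp1 + 1, sp2 - sp1 - 1);
      first = false;
      continue;
    }

    if (line.empty()) return true;  // blank line ends the headers

    size_t colon = line.find(':');
    if (colon == std::string::npos) return false;
    std::string name = line.substr(0, colon);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) {
                     return static_cast<char>(std::tolower(c));
                   });
    out->headers[name] = Trim(line.substr(colon + 1));
  }
  return false;  // ran out of text before the blank line
}
```

Lowercasing header names is a design choice in this sketch: HTTP
header names are case-insensitive, so normalizing them once makes
later lookups simpler.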
Now that you have the basic web server running, it's time to have
a little fun and add in support for Twitter searching.
- Read through the bottom part of the ProcessRequest routine in
HttpServer.cc, and notice how it is testing URLs to see if they
start with "/post", and if so, invokes a class called FormReader
to handle that request.
- Read through FormReader.h and FormReader.cc. Note how
it is parsing URLs using your URLParser class, and then invoking
either ProcessTwitterSearch() in TwitterSearch.h/.cc or
ProcessTwitterCloudQuery() in TwitterCloud.h/.cc, depending on
the content of the URL.
- Visit the URL "/static/index.html" on your web server and
notice the form that it presents. Try typing in a query and
submitting it; notice that the server is not yet processing your
query. To satisfy your curiosity, read through the file
"hw3_files/index.html" to see how the form is actually implemented
in HTML.
- Read through TwitterSearch.h and TwitterSearch.cc and
complete the implementation, making use of your TwitterSearch
class from HW2. Try using our solution binary http333d to see
how we implemented support for Twitter searching, and see if you
can mimic it. Have fun with this; try creative ways of formatting
your search results.
- Read through TwitterCloud.h and TwitterCloud.cc and
complete the implementation, lifting code from your TwitterShell
implementation from HW2. Try using our solution binary http333d
to see how we implemented support for Twitter word clouds, and see
if you can mimic it.
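Handling a submitted form largely comes down to pulling key/value
pairs out of the query string of the requested URL. The sketch
below shows one way this might be done, including decoding of "+"
and "%XX" escapes. ParseQuery, UrlDecode, and HexValue are
hypothetical names for this example, not the URLParser class from
the starter code, and the parameter names in the usage note are
made up.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Converts one hex digit to its numeric value (hypothetical helper).
static int HexValue(char c) {
  assert(std::isxdigit(static_cast<unsigned char>(c)));
  if (c >= '0' && c <= '9') return c - '0';
  return std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
}

// Decodes form encoding: '+' becomes a space and "%XX" becomes the byte
// with hex value XX. A '%' not followed by two hex digits is kept as-is.
std::string UrlDecode(const std::string& in) {
  std::string out;
  for (size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '+') {
      out += ' ';
    } else if (in[i] == '%' && i + 2 < in.size() &&
               std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
               std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
      out += static_cast<char>(HexValue(in[i + 1]) * 16 +
                               HexValue(in[i + 2]));
      i += 2;
    } else {
      out += in[i];
    }
  }
  return out;
}

// Splits everything after the first '?' into decoded key/value pairs.
// Returns an empty map if the URL has no query string.
std::map<std::string, std::string> ParseQuery(const std::string& url) {
  std::map<std::string, std::string> args;
  size_t q = url.find('?');
  if (q == std::string::npos) return args;

  size_t pos = q + 1;
  while (pos < url.size()) {
    size_t amp = url.find('&', pos);
    if (amp == std::string::npos) amp = url.size();
    std::string pair = url.substr(pos, amp - pos);
    pos = amp + 1;
    if (pair.empty()) continue;

    size_t eq = pair.find('=');
    std::string key = UrlDecode(pair.substr(0, eq));
    std::string value =
        (eq == std::string::npos) ? "" : UrlDecode(pair.substr(eq + 1));
    args[key] = value;
  }
  return args;
}
```

For a URL such as "/post?type=search&q=hello+world%21", this
returns a map in which "type" is "search" and "q" is
"hello world!".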
Implement some other interesting feature in your web server. Some
examples could be:
- (very hard) the ability to search sites other than Twitter;
for example, learn the Facebook API, implement some support for
it, and allow people to search Facebook through your server.
- (medium) find some code that implements the Eliza
chatbot, and integrate support for it into your server.
- (medium) add code to the server that maintains
statistics about which URLs have been requested, which client
IP addresses have interacted with the server, or other interesting
diagnostics. Implement a web page that displays all of these
diagnostics.
- (hard) integrate your image histogram code from
HW1 into the server: allow a user to upload a picture, then
use your image histogram code to generate a histogram of the
picture. Display the histogram in a result page.
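If you go the statistics route, one possible shape for the
bookkeeping is sketched below: a small class that counts hits per
URL and per client IP address and renders them as an HTML
fragment. RequestStats, its methods, and EscapeHtml are
hypothetical names invented for this example, not part of the
starter code. Because the server handles requests on multiple
threads, the sketch guards its maps with a mutex, and because URLs
come from clients, it escapes them before placing them in HTML.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <sstream>
#include <string>

// Escapes characters that are special in HTML so that client-supplied
// text cannot inject markup into the diagnostics page.
static std::string EscapeHtml(const std::string& s) {
  std::string out;
  for (char c : s) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      case '"': out += "&quot;"; break;
      default: out += c;
    }
  }
  return out;
}

// Hypothetical statistics collector, safe to call from several threads.
class RequestStats {
 public:
  // Records one request for `url` made by `client_ip`.
  void Record(const std::string& url, const std::string& client_ip) {
    assert(!url.empty());  // a parsed request always names some URL
    std::lock_guard<std::mutex> lock(mu_);
    ++url_hits_[url];
    ++client_hits_[client_ip];
  }

  // Returns how many times `url` has been requested so far.
  int HitsForUrl(const std::string& url) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = url_hits_.find(url);
    return it == url_hits_.end() ? 0 : it->second;
  }

  // Renders both tables as a simple HTML fragment.
  std::string RenderHtml() const {
    std::lock_guard<std::mutex> lock(mu_);
    std::ostringstream html;
    html << "<h2>URLs</h2>\n<ul>\n";
    for (const auto& kv : url_hits_) {
      html << "<li>" << EscapeHtml(kv.first) << ": " << kv.second
           << "</li>\n";
    }
    html << "</ul>\n<h2>Clients</h2>\n<ul>\n";
    for (const auto& kv : client_hits_) {
      html << "<li>" << EscapeHtml(kv.first) << ": " << kv.second
           << "</li>\n";
    }
    html << "</ul>\n";
    return html.str();
  }

 private:
  mutable std::mutex mu_;  // protects both maps
  std::map<std::string, int> url_hits_;
  std::map<std::string, int> client_hits_;
};
```

A server could keep one RequestStats object, call Record() once
per request, and return RenderHtml() as the body of a diagnostics
page.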
When you're ready to turn in your assignment, do the following:
- In the hw3 directory, run "make clean" to clean out
any object files and emacs detritus; what should be left are your
source files.
- Create a TURNIN.TXT file in hw3 that contains your name,
student number, and UW email address.
- cd up a directory so that hw3 is a subdirectory of your
working directory, and run the following command to create your
submission tarball, replacing "UWEMAIL" with your uw.edu
email account name.
tar -cvzf hw3_submission_UWEMAIL.tar.gz hw3
For example, since my uw.edu email account is "gribble", I would run the command:
tar -cvzf hw3_submission_gribble.tar.gz hw3
- Use the course dropbox (there is a link on the course
homepage) to submit that tarball.
- Fill out the following survey to give us feedback on the
assignment:
https://catalyst.uw.edu/webq/survey/gribble/168394
We will be basing your grade on several elements:
- The degree to which your code passes the unit tests.
If your code fails a test, we won't attempt
to understand why: we're planning on just including the number of
points that the test drivers print out.
- We have some additional unit tests that test a few additional
cases that aren't in the supplied test driver. We'll be checking
to see if your code passes these as well.
- Whether you were able to successfully get the Web server
working, and whether you were able to get the Twitter search
features working.
- The quality of your code. We'll be judging this on several
qualitative aspects, including whether you've sufficiently
factored your code and whether there is any redundancy in your code
that could be eliminated.
- The readability of your code. For this assignment, we don't
have formal coding style guidelines that you must follow; instead,
attempt to mimic the style of code that we've provided you.
Aspects you should mimic are conventions you see for
capitalization and naming of variables, functions, and arguments,
the use of comments to document aspects of the code, and how code
is indented.