For homework #4, you will build on your homework #3 solution to implement a multithreaded Web server front-end to your query processor. In Part A, you will read through some of our code to learn about the infrastructure we have built for you. In Part B, you will complete some of our classes and routines to finish the implementation of a simple Web server. In Part C, you will fix some security problems in our Web server.

As before, please read through this entire document before beginning the assignment, and please start early!

In HW4, as with HWs 2 and 3, you don’t need to worry about propagating errors back to callers in all situations. You will use Verify333() calls to catch some kinds of errors and crash your program. However, your web server must handle anything a client does, no matter what; only internal issues (such as running out of memory) should cause your web server to crash.

To help you schedule your time, here’s a suggested order for the parts of this assignment. We’re not going to enforce a schedule; it’s up to you to manage your time.

Part A: read through our code

Our web server is a fairly straightforward multithreaded application. Every time a client connects to the server, the server dispatches a thread to handle all interactions with that client. Threads do not interact with each other at all, which greatly simplifies the design of the server.

The figure to the right shows the high-level architecture of the server. There is a main class called HttpServer that uses a ServerSocket class to create a listening socket, and then sits in a loop waiting to accept new connections from clients. For each new connection that the HttpServer receives, it dispatches a thread from a ThreadPool class to handle the connection. The dispatched thread springs to life in a function called HttpServer_ThrFn within the HttpServer.cc file.


The HttpServer_ThrFn function handles reading requests from one client. For each request that the client sends, the HttpServer_ThrFn invokes GetNextRequest on an HttpConnection object to read in the next request and parse it.

To read a request, the GetNextRequest method invokes WrappedRead() some number of times until it spots the end of the request. To parse a request, the method invokes the ParseRequest method (also within HttpConnection). At this point, the HttpServer_ThrFn has a fully parsed HttpRequest object (defined in HttpRequest.h).
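As a rough illustration of the "read until the end of the request" idea, the sketch below keeps appending bytes to a buffer until it contains the blank line ("\r\n\r\n") that ends an HTTP request's header section. It is not the actual GetNextRequest code: ReadFn, ReadOneRequest, and the way leftover bytes are kept are all hypothetical names and choices made for this example, and the reader callback merely stands in for a call such as WrappedRead().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for a socket read: appends some bytes to *buf and
// returns how many were appended (0 on EOF, negative on error).
using ReadFn = std::function<int(std::string* buf)>;

// Illustrative only; not the signature of HttpConnection::GetNextRequest.
// Reads until `buffer` holds a complete header block, moves that block
// (terminator included) into *request, and leaves any extra bytes in
// `buffer` so the next call can start from them.
bool ReadOneRequest(const ReadFn& read_more, std::string* buffer,
                    std::string* request) {
  assert(buffer != nullptr && request != nullptr);  // internal invariant
  static const std::string kHeaderEnd = "\r\n\r\n";

  size_t end = buffer->find(kHeaderEnd);
  while (end == std::string::npos) {
    int n = read_more(buffer);
    if (n <= 0) {
      return false;  // client closed the connection or the read failed
    }
    end = buffer->find(kHeaderEnd);
  }

  size_t len = end + kHeaderEnd.size();
  *request = buffer->substr(0, len);
  buffer->erase(0, len);
  return true;
}
```

Keeping the leftover bytes matters because a single read may return the tail of one request together with the start of the next one from the same client.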


The next job of HttpServer_ThrFn is to process the request. To do this, it invokes the ProcessRequest() function, which looks at the request URI to determine if this is a request for a static file, or if it is a request associated with the search functionality. Depending on what it discovers, it either invokes ProcessFileRequest() or ProcessSearchRequest().
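The dispatch decision described above might look roughly like the following sketch. The "/static/" prefix, the RequestKind enum, and ClassifyUri are assumptions invented for this illustration; check the provided ProcessRequest() code for the rule the server actually uses.

```cpp
#include <string>

// Hypothetical classification of a request; not a type from the HW4 code.
enum class RequestKind { kStaticFile, kSearch };

// ASSUMPTION: static files are requested under a "/static/" URI prefix and
// everything else is treated as a search request.
RequestKind ClassifyUri(const std::string& uri) {
  static const std::string kStaticPrefix = "/static/";
  // compare() with position 0 is safe even when uri is shorter than the
  // prefix; it simply reports a mismatch.
  if (uri.compare(0, kStaticPrefix.size(), kStaticPrefix) == 0) {
    return RequestKind::kStaticFile;
  }
  return RequestKind::kSearch;
}
```

A caller would then switch on the returned value to pick the file handler or the search handler.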

Once those functions return an HttpResponse, the HttpServer_ThrFn invokes the WriteResponse method on the HttpConnection object to write the response back to the client.
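To make the "write the response back" step concrete, here is a minimal sketch of turning a response into the bytes of an HTTP/1.1 reply: a status line, header lines, a blank line, then the body. SimpleResponse and Serialize are hypothetical names for this example and do not mirror the real HttpResponse class or WriteResponse method.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified response type; the real HttpResponse may differ.
struct SimpleResponse {
  int code = 200;
  std::string message = "OK";
  std::map<std::string, std::string> headers;
  std::string body;
};

// Builds the wire format: status line, headers, blank line, body.
// Content-Length is computed from the body rather than taken from `headers`.
std::string Serialize(const SimpleResponse& r) {
  std::string out =
      "HTTP/1.1 " + std::to_string(r.code) + " " + r.message + "\r\n";
  for (const auto& kv : r.headers) {
    out += kv.first + ": " + kv.second + "\r\n";
  }
  out += "Content-Length: " + std::to_string(r.body.size()) + "\r\n";
  out += "\r\n";  // blank line separates headers from the body
  out += r.body;
  return out;
}
```

The resulting string would then be written to the client socket in a loop until every byte has been sent.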


Our web server isn’t too complicated, but there is a fair amount of plumbing to get set up. In this part of the assignment, we want you to read through a bunch of lower-level code that we’ve provided for you. You need to understand how this code works to finish our web server implementation, but we won’t have you modify this plumbing.

What to do

Part B: get the basic web server working

You are now going to finish a basic implementation of the http333d web server. We’ll have you implement some of the event handling routines at different layers of abstraction in the web server, culminating with generating HTTP and HTML to send to the client.

What to do

At this point, your web server should run correctly, and everything should compile with no warnings. Try running your web server and connecting to it from a browser. Also try running the test_suite under valgrind to make sure there are no memory issues. Finally, launch the web server under valgrind to make sure there are no issues or leaks; after the web server has launched, exercise it by issuing a few queries, then kill the web server. (The supplied code does have some leaks, but your code should not make things significantly worse.)

Part C: fix security vulnerabilities

Now that the basic web server works, you will discover that your web server (probably) has two security vulnerabilities. We are going to point these out to you, and you will repair them.

What to do

We’ll bet that your implementation has two security flaws.

Fix these two security flaws, assuming they do in fact exist in your server. As a point of reference, in solution_binaries/, we’ve provided a version of our web server that has both of these flaws in place (http333d_withflaws). Feel free to try it out, but DO NOT leave this server running, as it will potentially expose all of your files to anybody that connects to it.

Congrats, you’re done with the HW4 project sequence!!

Bonus

There are two bonus tasks for this assignment. As before, you can do them, or not; if you don’t, there will be no negative impact on your grade. You should not attempt either bonus task unless and until the basic assignment is working properly. We will not award any bonus credit if the basic assignment is not substantially correct.

This part of the assignment is deliberately open-ended, with much less structure than earlier parts. The (small) amount of extra credit granted will depend on how interesting your extension is and how well it is implemented.

What to turn in

When you’re ready to turn in your assignment, do the following:

In the hw4 directory:

$ make clean
$ cd ..
$ tar czf hw4_<username>.tar.gz hw4
$ # make sure the tar file has no compiler output files in it, but
$ # does have all your source and other files you intend to submit
$ tar tzf hw4_<username>.tar.gz

Turn in hw4_<username>.tar.gz using the course dropbox linked on the main course webpage.

Grading

We will be basing your grade on several elements: