CSE 333 Homework 4
Out: Friday, February 27, 2026
Due: Thursday, March 12, 2026 by 11:59 PM
Closes: Sunday, March 15, 2026 by 11:59 PM
Goals
In this assignment, you will build on top of your Homework 3 implementation to complete a multithreaded web server front-end to your query processor.
- In Part A, you will read through some of our code to learn about the infrastructure we have built for you.
- In Part B, you will complete some of our classes and routines to finish the implementation of a simple web server.
- In Part C, you will fix some security problems in our web server.
As before, please read through this entire document before beginning the assignment, and please start early!
Multithreaded Web Server
General Implementation Notes
- You may not modify the Makefile distributed with the project. In particular, there are reasonable ways to do the necessary string handling without using the Boost Regex library.
- You may not modify any of the existing header files or class definitions distributed with the code. If you wish to add extra "helper" functions, you can do that by including additional static functions in the implementation (.cc) files.
- You don't need to worry about propagating errors back to callers in all situations. You will use Verify333() to spot errors and cause your program to crash out if they occur. However, no matter what a client does, or what input the web server reads, your web server must handle it; only internal issues (such as running out of memory) should cause your web server to crash out.
Suggested Work Schedule
To help you schedule your time, here's a suggested order for the parts of this assignment. We're not going to enforce a schedule; it's up to you to manage your time.
- Read over the project specifications and understand which code is responsible for what.
- Finish ServerSocket.cc. Make sure to cover all functionality, not just what is in the unit tests.
- Implement FileReader.cc, which should be very easy, and GetNextRequest() in HttpConnection.cc.
- Complete ParseRequest() in HttpConnection.cc. This can be tricky, as it involves both Boost and string parsing.
- Finish the code for http333d.cc.
- Implement HttpServer_ThrFn() in HttpServer.cc.
- Complete ProcessFileRequest() and ProcessQueryRequest() in HttpServer.cc. At this point, you should be able to search the "333gle" site and view the webpages available under /static/, e.g., http://localhost:5555/static/bikeapalooza_2011/index.html.
- Fix the security issues with the website, if you have any.
- Make sure everything works as it is supposed to.
Part A: Read Through Our Code
Our web server is a fairly straightforward multithreaded application. Every time a client connects to the server, the server dispatches a thread to handle all interactions with that client. Threads do not interact with each other at all, which greatly simplifies the design of the server.
The figure to the right shows the high-level architecture of the
server.
There is a main class called HttpServer that uses a
ServerSocket class to create a listening socket, and
then sits in a loop waiting to accept new connections from clients.
For each new connection that the HttpServer receives,
it dispatches a thread from a ThreadPool class to
handle the connection.
The dispatched thread springs to life in a function called
HttpServer_ThrFn() within the
HttpServer.cc file.
The HttpServer_ThrFn() function handles reading
requests from one client.
For each request that the client sends, the
HttpServer_ThrFn() invokes
GetNextRequest() on the HttpConnection
object to read in the next request and parse it.
To read a request, the GetNextRequest() method invokes
WrappedRead() some number of times until it spots the
end of the request.
To parse a request, the method invokes the
ParseRequest() method (also within
HttpConnection).
At this point, HttpServer_ThrFn() has a fully
parsed HttpRequest object (defined in
HttpRequest.h).
The next job of HttpServer_ThrFn() is to process the
request.
To do this, it invokes the ProcessRequest() function,
which looks at the request URI to determine if this is a request
for a static file, or if it is a request associated with the search
functionality.
Depending on what it discovers, it either invokes
ProcessFileRequest() or
ProcessQueryRequest().
Once those functions return an HttpResponse, the
HttpServer_ThrFn() invokes the
WriteResponse() method on the
HttpConnection object to write the response back to
the client.
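As a rough illustration of the read step described above, the sketch below pulls one complete request out of a buffer of accumulated bytes by searching for the blank line ("\r\n\r\n") that ends an HTTP request header. The name ExtractNextRequest and its buffer-passing style are assumptions made for this example, not the interface declared in HttpConnection.h; follow the comments in the provided files for the real contract.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the provided code): if `buffer` holds at
// least one complete HTTP request header, moves it (terminator included)
// into `request`, leaves any remaining bytes in `buffer`, and returns true.
// Returns false, leaving both arguments untouched, if more data is needed.
bool ExtractNextRequest(std::string* buffer, std::string* request) {
  static const std::string kHeaderEnd = "\r\n\r\n";
  const std::size_t pos = buffer->find(kHeaderEnd);
  if (pos == std::string::npos) {
    return false;  // the caller should read more bytes and try again
  }
  const std::size_t request_len = pos + kHeaderEnd.size();
  *request = buffer->substr(0, request_len);
  buffer->erase(0, request_len);
  return true;
}
```

Keeping the leftover bytes in the buffer matters because a single read may return the end of one request together with the start of the client's next one.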
Our web server isn't too complicated, but there is a fair amount of plumbing to get set up. In this part of the assignment, we want you to read through a bunch of lower-level code that we've provided for you. You need to understand how this code works to finish our web server implementation, but we won't have you modify this plumbing.
Part A Instructions
- Change to the directory containing your CSE333 GitLab repository. Use git pull to retrieve the new hw4/ folder for this assignment. You will need the hw1/, hw2/, hw3/, and projdocs/ directories in the same folder as your new hw4/ folder since hw4 links to files in those previous directories. Also, as with previous parts of the project, you can use the solution_binaries/ versions of the previous parts if you wish. If you decide to use our solution binaries, copy the libhw1.a file in hw1/solution_binaries to the main hw1 folder, copy the libhw2.a file in hw2/solution_binaries to the hw2 folder, and copy the libhw3.a file in hw3/solution_binaries to the hw3 folder.
- Look around inside of hw4/ to familiarize yourself with the structure. Note that there are libhw1/, libhw2/, and libhw3/ directories that contain symlinks to your libhw1.a, libhw2.a, and libhw3.a, respectively. You can replace your libraries with ours (from the appropriate solution_binaries directories) if you prefer.
- Next, run make to compile the two HW4 binaries. One of them is the usual unit test binary. Run it, and you'll see the unit tests fail and crash out; you won't yet earn the automated grading points tallied by the test suite.
- The second binary is the web server itself: http333d. Its usage message will reveal its command-line arguments; an example call looks like:
  $ ./http333d 5555 ../projdocs unit_test_indices/*
  In the meantime, start up a working web server using the provided solution binary:
  $ ./solution_binaries/http333d 5555 ../projdocs unit_test_indices/*
  You might need to pick a different port than 5555 if someone else is using that port on the same machine as you.
- Use a web browser to explore what the server should look like when it's finished:
  - If you are running the code on a lab computer or the CSE home VM, launch a browser on that machine and open http://localhost:5555/ and http://localhost:5555/static/bikeapalooza_2011/Bikeapalooza.html in different tabs, changing the 5555 to the port you specified when launching http333d.
  - If you are running the code on attu, note which specific machine you are running the web server on (e.g., attu4) and open http://attu4.cs.washington.edu:5555/ and http://attu4.cs.washington.edu:5555/static/bikeapalooza_2011/Bikeapalooza.html in different tabs, changing the attu number and port number as needed.
  Enter a few search queries in the first tab and then click around the Bikeapalooza gallery in the second tab; this is what your finished web server will be capable of!
- When you are done with the http333d server, the most graceful way to shut it down is to use the special URL /quitquitquit (for example, http://attu4.cs.washington.edu:5555/quitquitquit). You will implement support for this URL in your HttpServer. When the server receives a request for /quitquitquit, it should shut itself down cleanly. We strongly recommend using this method when running under Valgrind, since Valgrind requires a graceful shutdown to finalize and report accurate heap statistics.
  Alternatively, you may open another terminal window on the same machine and run:
  $ ps -u
  to find the server’s process id (pid), then run:
  $ kill pid
  You can also type Control-C in the terminal window where the web server is running, but this is less graceful and may prevent Valgrind from reporting accurate statistics.
- Read through ThreadPool.h and ThreadPool.cc. You don't need to implement anything in either, but several pieces of the project rely on this code. The header file is well-documented, so it ought to be clear how it's used. There's also a unit test file that you can peek at.
- Read through HttpUtils.h and HttpUtils.cc. These files define a number of utility functions that the rest of HW4 uses. You will have to implement some of these utilities while completing test_suite. Make sure that you understand what each of the utilities does, and why we may want it.
- Finally, read through HttpRequest.h and HttpResponse.h. These files define the HttpRequest and HttpResponse classes, which represent a parsed HTTP request and response, respectively.
Part B: Basic Web Server
You are now going to finish a basic implementation of the
http333d web server.
You will need to implement some of the event handling routines at
different layers of abstraction in the web server, culminating
with generating HTTP and HTML to send to the client.
Part B Instructions
- Take a look at ServerSocket.h. This file contains a helpful class for creating a server-side listening socket, and accepting a new connection from a client. We've provided you with the class declaration in ServerSocket.h but no implementation in ServerSocket.cc; your next job is to build it. You'll need to make the code handle both IPv4 and IPv6 clients. Run the test suite to see if you make it past the ServerSocket unit tests.
- Read through FileReader.h and FileReader.cc. Note that the implementation of FileReader.cc is missing; go ahead and implement it. See if you make it past the FileReader unit tests.
- Read through HttpConnection.h and HttpConnection.cc. The two major functions in HttpConnection.cc have their implementations missing, but have generous comments for you to follow. Implement the missing functions, and see if you make it past the HttpConnection unit tests.
- Read through HttpUtils.h and HttpUtils.cc. There are two functions in HttpUtils.cc that have their implementations missing, but have generous comments to help you figure out their implementation. Implement the missing functions, and see if you make it past the HttpUtils unit tests.
- Now comes the hardest part of the assignment. Read through HttpServer.cc, HttpServer.h, and http333d.cc. Note that some parts of HttpServer.cc and http333d.cc are missing. Go ahead and implement those missing functions. The only requirement here is that your web server mimics the behavior of the solution binaries (i.e., it has a search bar, processes files and queries correctly, and shows their results similarly). Although it is entirely optional, you are free to modify the look of your 333gle site:
  - If you just want to get the same "look and feel" as our server, you can use the solution binary and then view source to see the HTML to emulate.
  - In the past, some students implemented 333gle in "dark mode", had a Shrek theme, etc.
  - If you want to add features that are more complex than altering appearance, check out the Bonus below.
  Once you have the functions implemented, test your http333d binary to see if it works by running the web server and connecting to it from a browser (as described in Part A Step 5 above), exercising both the web search and static file serving functionalities.
- At this point, your web server should run correctly, and everything should compile with no warnings. Try running the test_suite under valgrind to make sure there are no memory issues. Finally, launch the web server under valgrind to make sure there are no issues or leaks: after the web server has launched, exercise it by issuing a few queries, then kill the web server. The supplied code DOES have some memory leaks, but your code should not make things significantly worse.
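To make the request-parsing step more concrete, here is a minimal sketch that splits a raw request into a URI and a map of headers using only the standard library. It is an illustration under stated assumptions, not the required approach: the ParsedRequest struct, the Trim helper, and the ParseRequestText function are hypothetical names invented for this example, and the real HttpRequest class and the comments in HttpConnection.cc define what you actually need to produce. Header names are lowercased here because HTTP header names are case-insensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a parsed request; the real HttpRequest class
// in HttpRequest.h may look different.
struct ParsedRequest {
  std::string uri;
  std::map<std::string, std::string> headers;  // names stored in lowercase
};

// Returns a copy of `s` with leading and trailing whitespace removed.
static std::string Trim(const std::string& s) {
  const char* kSpace = " \t\r\n";
  const std::size_t first = s.find_first_not_of(kSpace);
  if (first == std::string::npos) return "";
  const std::size_t last = s.find_last_not_of(kSpace);
  return s.substr(first, last - first + 1);
}

// Parses "GET <uri> HTTP/x.y" followed by "Name: value" lines, each ending
// in "\r\n". Returns false (instead of crashing) on a malformed first line;
// header lines without a colon are skipped.
bool ParseRequestText(const std::string& text, ParsedRequest* out) {
  std::size_t line_start = 0;
  bool first_line = true;
  while (line_start < text.size()) {
    std::size_t line_end = text.find("\r\n", line_start);
    if (line_end == std::string::npos) line_end = text.size();
    const std::string line = text.substr(line_start, line_end - line_start);
    line_start = line_end + 2;
    if (first_line) {
      first_line = false;
      std::istringstream iss(line);
      std::string method, uri, version;
      if (!(iss >> method >> uri >> version) || method != "GET") return false;
      out->uri = uri;
      continue;
    }
    if (line.empty()) break;  // blank line marks the end of the headers
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;  // skip malformed header line
    std::string name = Trim(line.substr(0, colon));
    std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    out->headers[name] = Trim(line.substr(colon + 1));
  }
  return !first_line;  // false if the text contained no request line at all
}
```

Whatever approach you take, the important property is the one shown in the early returns: malformed client input produces a failure result, never a crash.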
Part C: Fix Security Vulnerabilities
Now that the basic web server works, you will discover that your web server (probably) has two security vulnerabilities. We are going to point these out to you, and you will repair them. Of course, it IS possible that the way you implemented things above means you have already dealt with these flaws.
HttpUtils.cc will be very helpful in fixing these
security flaws in your web server.
Part C Instructions
Fix the following two security flaws, if currently found in your
server.
As a point of reference, we've provided a version of our web server
that has both of these flaws in place
(solution_binaries/http333d_withflaws).
Feel free to try it out, but DO NOT leave this server
running, as it will potentially expose all of your files to anybody
that connects to it.
- The first is called a cross-site scripting flaw (see http://en.wikipedia.org/wiki/Cross-site_scripting). Using Firefox or Safari (Chrome prevents this attack), try typing the following query into both your web server and the solution binary web server and compare the behavior of the two:
  hello <script>alert("Boo!");</script>
  To fix this flaw, you need to "escape" untrusted input from the client before you relay it to output.
- Use telnet to connect to your web server and manually send a request for the following URI. (Note: browsers are smart enough to help defend against this attack, so you can't just type it into a browser URL bar, but nothing prevents attackers from directly connecting to your server with a program of their own!)
  /static/../hw4/http333d.cc
  This is called a directory traversal attack. Instead of trusting the file pathname provided by a client, you need to normalize the path and verify that it names a file within your document subdirectory tree (which would be ../projdocs/ if the example command shown in Part A was used to start the server). If the provided path names something outside of that subdirectory, you should return an error message instead of the file contents.
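The two fixes can be sketched roughly as follows. Both function names (EscapeHtml and StaysWithinRoot) are hypothetical and chosen for this example only; the utilities in HttpUtils.cc are what you should actually use. The path check is a purely lexical approximation that assumes the path is relative to the document root and does not resolve symbolic links.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces characters that are special in HTML with
// entities, so client-supplied text is displayed rather than interpreted.
std::string EscapeHtml(const std::string& input) {
  std::string out;
  out.reserve(input.size());
  for (char c : input) {
    switch (c) {
      case '&':  out += "&amp;";  break;
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&#39;";  break;
      default:   out += c;
    }
  }
  return out;
}

// Hypothetical helper: returns true if `relative_path` (a path relative to
// the document root) never climbs above that root via ".." components.
// Lexical only: symbolic links are NOT resolved by this sketch.
bool StaysWithinRoot(const std::string& relative_path) {
  int depth = 0;  // number of directories currently below the root
  std::size_t start = 0;
  while (start <= relative_path.size()) {
    std::size_t end = relative_path.find('/', start);
    if (end == std::string::npos) end = relative_path.size();
    const std::string component = relative_path.substr(start, end - start);
    if (component == "..") {
      if (--depth < 0) return false;  // escaped the document root
    } else if (!component.empty() && component != ".") {
      ++depth;
    }
    start = end + 1;
  }
  return true;
}
```

With this sketch, the portion of the example URI after /static/ ("../hw4/http333d.cc") is rejected, while a path such as "bikeapalooza_2011/index.html" is accepted.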
Bonus Tasks
If you want to do any of the bonus parts, first create a
hw4-final tag in your repository to mark the version
of the assignment with the required parts of the project.
That will allow us to more easily evaluate how well you did on the
basic requirements of the assignment.
For HW4 bonus grading, create a file readme_bonus.txt
in your top-level hw4 directory for summarizing the
additions.
When you are done adding additional bonus parts and have committed
and pushed them to your GitLab repository, tag that commit
hw4-bonus.
If we find a hw4-bonus tag in your repository, we'll
grade the bonus parts; otherwise we'll assume that you just did the
required parts.
- Perform a performance analysis of your web server implementation, determining:
  - What throughput your server can handle (measured both in requests per second and bytes per second),
  - What latency clients experience (measured in seconds per request), and
  - What the performance bottleneck is.
  The httperf tool for Linux can generate synthetic load. You should conduct this performance analysis for a few different usage scenarios; e.g., you could vary the size of the web page you request, and see its impact on the number of pages per second your server can deliver. If you choose to do this bonus task, please include a PDF file in your submission containing relevant performance graphs and analysis.
- Figure out some interesting feature to add to your web server, and implement it! Here are some example ideas:
  - Find the implementation of a chatbot, such as ELIZA, and add it to your web server.
  - Implement logging functionality; every time your server serves content, write out some record with a timestamp to a log file; make the log file available through the web server itself.
  - Change the results page to show context from matching documents, similar to how Google shows excerpts from matching pages; specifically, make it so that each result in the result list shows:
    x words + <bold>hit word</bold> + y words
    for one or more of the query words that hit.
  If you choose to do this bonus task, describe your added feature(s) and how to use them in readme_bonus.txt. This part of the assignment is deliberately open-ended, with much less structure than earlier parts. The (small) amount of extra credit granted will depend on how interesting your extension is and how well it is implemented.
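For the excerpt idea, the "x words + hit word + y words" shape could be sketched as below. This is only one possible starting point: the MakeExcerpt name, the assumption that the document has already been split into a vector of words, and the use of the HTML <b> tag for bolding are all choices made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds "x words + <b>hit</b> + y words" around the
// word at index `hit`, clamping the window to the bounds of `words`.
// Returns an empty string if `hit` is out of range.
std::string MakeExcerpt(const std::vector<std::string>& words,
                        std::size_t hit, std::size_t x, std::size_t y) {
  if (hit >= words.size()) return "";
  const std::size_t begin = (hit > x) ? hit - x : 0;
  const std::size_t words_after = words.size() - hit - 1;
  const std::size_t end = hit + 1 + std::min(y, words_after);
  std::string out;
  for (std::size_t i = begin; i < end; ++i) {
    if (!out.empty()) out += ' ';
    if (i == hit) {
      out += "<b>" + words[i] + "</b>";
    } else {
      out += words[i];
    }
  }
  return out;
}
```

Since the words come from untrusted documents, a real implementation would also need to escape each word before wrapping it in markup, for the same reason discussed in Part C.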
Testing
As with the previous homework, you can compile your
implementation by using the make command.
This will result in several output files, including an executable
called test_suite.
After compiling your solution with make, you can run
all of the tests for the homework by running:
$ ./test_suite
You can also run only specific tests by passing command-line
arguments into test_suite.
This is extremely helpful for debugging specific parts of the
assignment, especially since test_suite can be run
with these settings through valgrind and
gdb!
Some examples:
- To only run the HttpConnection tests, enter:
  $ ./test_suite --gtest_filter=Test_HttpConnection.*
- To run all tests except the ServerSocket tests, enter:
  $ ./test_suite --gtest_filter=-Test_ServerSocket.*
You can specify which tests are run for any of the tests in the assignment — you just need to know the names of the tests! You can list them all out by running:
$ ./test_suite --gtest_list_tests