As before, your task is to build a basic web proxy capable of
accepting HTTP requests, making requests to remote servers, caching
results, and returning data to a client. Unlike before, your proxy
should be able to handle multiple client requests concurrently. You must
implement two different versions of the proxy: one that achieves
concurrency by forking a process for each new client request,
and one that uses the pthread library to spawn a new thread for
each new client request.
If you want, you can implement other optimizations, such as handling persistent connections from a client (see HTTP's Keep-Alive instructions) or creating a process or thread pool for faster processing. A process/thread pool starts by creating some fixed number of processes/threads at startup (say, 20). Then, when it receives a new request, it hands off the request to one of the existing processes/threads, removing it from the pool. (If none are available, which indicates a higher degree of concurrency, it can create a new one.) Upon completing a request, the process/thread is returned to the pool for future requests. Apache and most servers that adopt a multi-process/threaded style use such pools for lower latency and system load. But again, these optimizations are optional.
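For the optional thread pool, one common simplification is a fixed set of worker threads pulling jobs from a shared queue. The sketch below takes that approach rather than the check-out/return scheme described above, and it does not grow the pool when all workers are busy. Every name in it (struct pool, pool_start, pool_submit, pool_stop, demo_job, QUEUE_CAP) is hypothetical; in a proxy, the job value would be the accepted client socket.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define QUEUE_CAP 64                 /* assumed bound on pending jobs */

typedef void (*job_fn)(int job);

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
    int             jobs[QUEUE_CAP]; /* ring buffer of pending jobs */
    int             head, count, shutting_down;
    job_fn          run;
    pthread_t      *workers;
    int             nworkers;
};

/* Worker: wait for a job, run it, repeat; exit once drained and stopped. */
static void *worker_main(void *arg) {
    struct pool *p = (struct pool *)arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutting_down)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {         /* stopping and nothing left to do */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int job = p->jobs[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        p->run(job);                 /* run outside the lock */
    }
}

/* Create nworkers threads up front. Returns 0 if at least one started. */
int pool_start(struct pool *p, int nworkers, job_fn run) {
    assert(p != NULL && run != NULL && nworkers > 0);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    p->head = p->count = p->shutting_down = p->nworkers = 0;
    p->run = run;
    p->workers = (pthread_t *)malloc(sizeof(pthread_t) * (size_t)nworkers);
    if (p->workers == NULL) return -1;
    for (int i = 0; i < nworkers; i++) {
        if (pthread_create(&p->workers[i], NULL, worker_main, p) != 0) break;
        p->nworkers++;
    }
    if (p->nworkers == 0) { free(p->workers); return -1; }
    return 0;
}

/* Queue a job, blocking while the queue is full. */
void pool_submit(struct pool *p, int job) {
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->jobs[(p->head + p->count) % QUEUE_CAP] = job;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Let workers finish all queued jobs, then join them. */
void pool_stop(struct pool *p) {
    pthread_mutex_lock(&p->lock);
    p->shutting_down = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
    free(p->workers);
}

/* Stand-in job that sums its arguments; a real job would serve a client. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static long demo_total = 0;
void demo_job(int value) {
    pthread_mutex_lock(&demo_lock);
    demo_total += value;
    pthread_mutex_unlock(&demo_lock);
}
```

A queue keeps the accept loop simple: it only calls pool_submit and never tracks which worker is idle. The cost is that a burst larger than the pool waits in the queue instead of triggering new threads.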
This assignment can be completed in either C or C++. It should
compile and run (using g++) without errors or warnings on the penguin
servers, producing a binary called proxy that takes the port to listen
on as a command-line argument. Don't use a hard-coded port
number (e.g., port 80). As before, you shouldn't assume that your
server will be running on a particular IP address, or that clients
will be coming from a pre-determined IP.
Run your proxy with the following command:
./proxy [-p|-t] <port>
where port is the port number that the proxy should listen on. The
argument -p specifies that the proxy should run in multi-process mode,
while -t specifies that the proxy should run in multi-threaded
mode. You must implement both. As a basic test of functionality, try
requesting a page using telnet concurrently from two different shells.
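One possible way to validate this command line is sketched below. It assumes the mode switch is required and comes before the port, and it rejects anything that is not a decimal port in the range 1 to 65535. The names parse_args and enum proxy_mode are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum proxy_mode { MODE_PROCESS, MODE_THREAD };

/* Parse "./proxy [-p|-t] <port>". Returns 0 on success, -1 on bad usage. */
int parse_args(int argc, char **argv, enum proxy_mode *mode, int *port) {
    assert(mode != NULL && port != NULL);
    if (argc != 3) return -1;

    if (strcmp(argv[1], "-p") == 0)      *mode = MODE_PROCESS;
    else if (strcmp(argv[1], "-t") == 0) *mode = MODE_THREAD;
    else return -1;

    /* strtol lets us detect trailing junk such as "80x". */
    char *end = NULL;
    errno = 0;
    long value = strtol(argv[2], &end, 10);
    if (errno != 0 || end == argv[2] || *end != '\0') return -1;
    if (value < 1 || value > 65535) return -1;

    *port = (int)value;
    return 0;
}
```

On failure, main would typically print a usage message to stderr and exit with a non-zero status.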
Instructions for setting up your browser to access your proxy can be found in the instructions for the previous assignment.
Download the testing script. (Note: You should use 'Save Target As' in the browser or 'wget' to download this script. Copy and paste may not work, since Python scripts differentiate tabs from spaces.)
In addition to the Berkeley sockets library, there are some functions you will need to use for this assignment: fork, waitpid, pthread_create, pthread_exit, etc.
You can find the details of these functions in the Unix man pages:
man 2 fork
man pthread
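For the threaded version, a minimal sketch of spawning one detached thread per connection with pthread_create might look like the following. The names conn_handler, struct conn_job, serve_in_thread, thread_accept_loop, and demo_handler are hypothetical. Returning from the thread function has the same effect as calling pthread_exit.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical handler type: reads the request on client_fd and replies. */
typedef void (*conn_handler)(int client_fd);

/* Heap-allocated so it outlives the accept loop's stack frame. */
struct conn_job {
    int          client_fd;
    conn_handler handle;
};

static void *conn_thread(void *arg) {
    struct conn_job job = *(struct conn_job *)arg;
    free(arg);
    job.handle(job.client_fd);
    close(job.client_fd);
    return NULL;                     /* equivalent to pthread_exit(NULL) */
}

/* Serve client_fd on a new detached thread.
 * Returns 0 on success or an errno-style code (connection is dropped). */
int serve_in_thread(int client_fd, conn_handler handle) {
    assert(handle != NULL);
    /* The cast keeps this valid when compiled as C++ with g++. */
    struct conn_job *job = (struct conn_job *)malloc(sizeof *job);
    if (job == NULL) { close(client_fd); return ENOMEM; }
    job->client_fd = client_fd;
    job->handle = handle;

    pthread_t tid;
    int rc = pthread_create(&tid, NULL, conn_thread, job);
    if (rc != 0) { free(job); close(client_fd); return rc; }
    pthread_detach(tid);             /* nobody joins; free resources on exit */
    return 0;
}

/* Accept loop: one thread per connection. */
void thread_accept_loop(int listen_fd, conn_handler handle) {
    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd < 0) {
            if (errno == EINTR) continue;
            break;
        }
        serve_in_thread(client_fd, handle);
    }
}

/* Stand-in handler for experimentation; a real one would proxy the request. */
void demo_handler(int client_fd) {
    ssize_t n = write(client_fd, "hello\n", 6);
    (void)n;
}
```

Unlike the forked version, threads share one address space, so any shared state such as the cache needs a mutex around it.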
You should submit your completed proxy by the date posted on the course website. You will need to submit a tarball file containing the following:
Your tarball should be named cos518_proxy_USERNAME.tgz, where USERNAME is your username. The sample Makefile in the skeleton zip file we provide will make this tarball for you with the make tar command.
Your proxy will be graded with the following criteria:
When make is run on your assignment, it should compile without errors or warnings on the penguin cycle machines and produce a binary named proxy. The first command-line argument should be the -p or -t switch; the second should be the port that the proxy will listen on.
Writing code that will interact with other programs on the Internet is a little different from writing something for your own use. The general guideline often given for network programs is: be lenient about what you accept, but strict about what you send. This is often referred to as Postel's Law. That is, even if a client doesn't do exactly the right thing, you should make a best effort to process its request if it is possible to easily figure out the intent. On the other hand, you should ensure that anything you send out conforms to the published protocols as closely as possible. If an incoming request has a single field out of whack (such as sending you a request using HTTP 0.9 or 1.1), uses non-standard line terminators (some clients only send \r instead of the standard \r\n), or does something you don't quite expect with HTTP headers, you should still handle the request rather than dropping it. Pay attention to parts of the RFC that specify areas where not all clients may conform exactly to what you expect. We'll be looking for this kind of interoperability in both the second round of tests that we run and in the style portion of your grade.
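To make the "lenient in what you accept" idea concrete, here is a small sketch of a request-line parser. It tolerates \r\n, a bare \n, or a bare \r as the terminator, accepts any HTTP/x.y version, and treats a request line with no version as an HTTP 0.9 simple request. The names struct request_line, chomp_line, and parse_request_line and the buffer sizes are assumptions for illustration. It deliberately does not validate the method or URI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct request_line {
    char method[16];
    char uri[2048];
    int  major, minor;       /* HTTP version, e.g. 1 and 0 for HTTP/1.0 */
};

/* Strip any trailing mix of '\r' and '\n' in place; returns the new length. */
size_t chomp_line(char *line) {
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\r' || line[len - 1] == '\n'))
        line[--len] = '\0';
    return len;
}

/* Parse "METHOD URI [HTTP/x.y]" leniently. Returns 0 on success, -1 if the
 * line has no method and URI, or has a version token that cannot be read.
 * Any other trailing text after the URI is ignored. */
int parse_request_line(char *line, struct request_line *out) {
    assert(line != NULL && out != NULL);
    chomp_line(line);
    int n = sscanf(line, "%15s %2047s HTTP/%d.%d",
                   out->method, out->uri, &out->major, &out->minor);
    if (n == 4) return 0;
    if (n == 2) {            /* no version given: treat as HTTP 0.9 */
        out->major = 0;
        out->minor = 9;
        return 0;
    }
    return -1;
}
```

Whatever version the client used, the request the proxy sends upstream should still be well-formed, with \r\n terminators; that is the "strict in what you send" half.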
When in doubt, try to follow the behavior specified in RFC 1945. Also, check the FAQ for more specific guidelines.
Last updated: Wed Oct 07 11:09:28 -0400 2009