HW 3: Server Load

HW3

You may do this project individually or in 2-person teams. If you choose the latter, be sure to tell the TA who your partner is.

Introduction

A common task in evaluating computer systems is to experimentally determine the throughput and latency of the system. This homework exercises these skills (and addresses common pitfalls) for a simple test system.

Setup

Select a set of machines on which to run your tests. You will need one machine to act as a server and several to act as clients. Download and install a web server of your choice (e.g., apache, JO!, Jigsaw, ...) Write a script to generate F test files, each of size 10KB. F should be sufficient that total size of the test files is twice the size of the test machine's main memory. Generate the test files on the machine's local disk (for .cs machines, the /tmp partition is globally accessible and is usually big enough. If you use a shared machine, be sure to clean up after yourself when you are done!) Write a coordinator script/program that launches N workload generators on a set of M machines and have each workload generator put its output in a separate file:
        rm /u/userId/gradOS/hw4/out/N/*
        for(ii = 0; ii < N; ii++){
           ssh userId@machine[ii % M] /u/userId/gradOS/hw4/generator nRequsts avgDelay > /u/userId/gradOS/hw4/out/N/out.ii
Each workload generator is invoked with two paramaters: nRequests -- the number of requests to issue before exiting -- and delay -- the average delay between requests. Each workload generator has the following main loop
         for(ii = 0; ii < nRequests; ii++){
           start = currentTime
           send request for randomly selected object
           receive reply
           end = currentTime
           latency = end - start;
           print start + " " + latency + "\n" to stdout
           delay = random value using exponential distribution
                   with mean = avgDelay
           sleep until start + delay
Note that an HTTP server receives ASCII requests of the form
         GET /users/dahlin/index.html HTTP/1.1\n
      Host: www.cs.utexas.edu\n
      \n
(Note that there needs to be a blank line to tell the server that the request is done.) Try using the command "telnet www.cs.utexas.edu 80" and typing a request similar to the above to see how this works. Notice that the Content-Length field in the reply tells you how many bytes will appear after the blank line that separates the reply header from the reply body. Finally, you will want to write an analysis script to read these output files and generate the graphs specified below. (I typically use shell scripts, awk, and gnuplot to produce such graphs, but feel free to use whatever tools you are comfortable with. *Don't* try to calculate these graphs manually!)

Exercise 1

A key problem with a lot of experiments is that people run experiments, see a graph that has rougly the expected shape, and stop there even though the graphs don't make any sense. It is important to do back of the envelope calculations to make sure you understand the results and that they make sense. Before you run any tests, estimate the expected maximum troughput that the system should be able to achieve (show your work and explain your reasoning). Similarly, estimate expected the minimum latency that a client should observe when the system is lightly loaded (show your work and explain your reasoning).

Exercise 2

Run an experiment and produce the plots described below.

Set N to 100 and average delay to 0.1 second and plot

x axis: elapsed time y axis: throughput (requests completed per second)
x axis: elapsed time y axis: latency (time from when request is issued until it completes)

Discuss any start-up effects you observe. What is causing them and how much do they affect the results? Describe how you will account for these start-up/shut down effects in the remaining experiments.

Exercise 3

A common error is to report "average response time" for a sysetm without thinking too much about the system load. In reality, response time is highly dependent on load and rather than reporting just one latency number (at some carefully-or-not-so-carefully selected load), it is usually much better to plot the latency as a function of load.

Run a series of tests and produce the plots described below.

Set average think time to 0.1 second and vary N from 1 to a number large enough to maximize throughput. In your plots, be sure to only use data from the steady state operation of your system (avoiding start-up/shut down transients). Create the following plots:

x axis: N, y axis: avg throughput (requests completed per second in steady state)
x axis: N y axis: avg response time (steady state)

Notice in the first graph the difference between offered load (proportional to N * 1/avgDelay) and realized througput (the actual number of requests per second processed by the system). Some researcher will draw a graph like the second graph here and label the x axis "load". Why is that misleading?

The following graph shows the right way to plot load v. latency.

x axis: throughput y axis: avg response time (steady state)

What is minimum latency observed. How does this compare to your estimate?

What is maximum throughput observed. How does this compare to your estimate?

Exercise 4: Extra Credit

The shape of the load v. latency curve depends on how bursty the arriving requests are. The above experiment is designed to produce exponentially distributed arrival times. Change the load generators to (a) evenly space the requests in time or (b) bunch together requests so that M requests are all issued at about the same moment. How does the throughput v. latency curve change. Why?

This completes the lab.