CS 382M: Advanced Computer Architecture

Homework 5

Due: Part 1 of this homework is solely for your benefit and will not be graded. You should still complete it before the exam. Part 2 is due at the beginning of class on Friday, April 30, when we discuss the papers.

Part 1:

This question came from a previous exam. You should be able to
solve it without actually running the benchmark described. Of course,
you are welcome to run the experiments if you want. Note, however,
that the message primitives described here are more light-weight
than most network protocols give you, so be sure you are measuring
what you think you are measuring.

Consider the results of running a simple network micro-benchmark on
two machines. The machine under test runs the following procedure to
take a measurement:

/*
 * Send nPasses small messages, receiving at most 1 reply
 * message per message sent. After each send (and possibly
 * reply) delay the next send (and reply) by delta microseconds.
 */
Time 
bench(int nPasses, float delta)
{
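  Time totTime, avgTime;
  int i;
  
  StartTimer();
  for(i = 0; i < nPasses; i++){
    send();             /* send one small message */
    pollAndRecieve();   /* receive at most one reply, if any has arrived */
    spin(delta);        /* busy-wait for delta microseconds */
  }
  totTime = stopTimer();
  avgTime = totTime / nPasses;
  drainNetwork();       /* untimed: receive any replies still outstanding */
  return avgTime;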
}

The function send() sends a small message to the destination
machine. 

The function pollAndRecieve() polls the network for any messages that
have arrived at the network interface. If any messages have arrived,
the procedure processes one message from the network interface. If no
messages are detected when the procedure polls the interface, the
routine returns immediately without processing a message.

The function spin() busy-waits in an empty loop for delta
microseconds.

The destination machine sits in a tight loop polling the network
interface for messages. Whenever it receives a message, it immediately
sends a small message in reply.
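
The destination's behavior can be sketched in C as follows. This is only an illustration, not the code used on the real machines: the Queue type and the functions qPush(), qPop(), and echoStep() are hypothetical, and each direction of the network is replaced by a small in-memory FIFO so that the sketch can run on a single machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 64  /* hypothetical queue capacity */

/* Hypothetical stand-in for one direction of the network: a bounded FIFO. */
typedef struct {
  int msgs[QCAP];
  size_t head;  /* number of messages removed so far */
  size_t tail;  /* number of messages added so far */
} Queue;

/* Append a message; returns false if the queue is full. */
static bool qPush(Queue *q, int m)
{
  if (q->tail - q->head == QCAP) return false;
  q->msgs[q->tail % QCAP] = m;
  q->tail++;
  return true;
}

/* Remove the oldest message; returns false if the queue is empty. */
static bool qPop(Queue *q, int *m)
{
  if (q->head == q->tail) return false;
  *m = q->msgs[q->head % QCAP];
  q->head++;
  return true;
}

/*
 * One iteration of the destination's polling loop: if a message is
 * waiting, consume exactly one and immediately send a small reply.
 * Returns true if a reply was sent.
 */
static bool echoStep(Queue *fromTest, Queue *toTest)
{
  int m;

  assert(fromTest != NULL && toTest != NULL);
  if (!qPop(fromTest, &m)) return false;  /* nothing arrived: keep polling */
  return qPush(toTest, m);                /* reply right away */
}
```

The destination machine would call echoStep() forever in a tight loop. Note that it never spins between messages, so its only costs are one receive and one send per message.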

After the time measurements have been completed, the benchmark calls
drainNetwork() to receive all messages sent by the destination machine
to the machine under test that were not received in the main loop of
the measurement. This action is not timed.

In all of the following questions, assume that the only things that
consume system resources are sending and receiving messages and
spinning. Assume that control instructions and polling the network
take negligible time.
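
If you would rather experiment than measure real hardware, these assumptions are simple enough to model in a few lines of C. The following is a rough sketch, not part of the original exam. The names Params and simulateBench() are hypothetical. The model further assumes that a message enters the network when its send overhead ends and arrives L microseconds later, that the destination handles messages in arrival order, and that the gap (g) never limits the message rate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model parameters, all in microseconds. */
typedef struct {
  double Os;  /* send overhead */
  double Or;  /* receive overhead */
  double L;   /* one-way network latency */
} Params;

/*
 * Simulated avgTime for bench(nPasses, delta) on the machine under test.
 * Returns -1.0 if memory cannot be allocated.
 */
double simulateBench(Params p, int nPasses, double delta)
{
  double *replyAt;        /* replyAt[k]: time reply k reaches the test machine */
  double t = 0.0;         /* test machine clock; the timer starts at 0 */
  double destFree = 0.0;  /* time the destination finishes its previous reply */
  int next = 0;           /* oldest reply not yet received */
  int i;

  assert(nPasses > 0);
  replyAt = malloc((size_t)nPasses * sizeof *replyAt);
  if (replyAt == NULL) return -1.0;

  for (i = 0; i < nPasses; i++) {
    double arrive, start;

    /* send(): pay Os; the message reaches the destination L later. */
    t += p.Os;
    arrive = t + p.L;

    /* Destination: receive (Or) once it is free, then reply (Os). */
    start = (arrive > destFree) ? arrive : destFree;
    destFree = start + p.Or + p.Os;
    replyAt[i] = destFree + p.L;

    /* pollAndRecieve(): process at most one reply that has already arrived. */
    if (replyAt[next] <= t) {
      t += p.Or;
      next++;
    }

    /* spin(delta) */
    t += delta;
  }

  free(replyAt);
  return t / nPasses;  /* the timer stops before drainNetwork() */
}
```

Calling simulateBench() with values of your own choosing for Os, Or, and L, and comparing the results for small and large nPasses and delta, is one way to check the time-lines you draw below. Because the sketch ignores the gap, it cannot help with the last question.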

a) For the benchmark run bench(nPasses = 1, delta = 0.0), draw a
time-line for the machine under test and the destination machine. On
this time-line, draw boxes to represent the send and receive
overheads, and draw arrows linking sends to the corresponding
receives. Label the send overhead (Os), receive overhead (Or), network
latency (L), and timer start/stop times (start, stop).

    Machine Under Test    ---------------------------------------------

                                                                      t ->

    Destination Machine   ---------------------------------------------



b) Consider the steady state behavior of a bench() run where nPasses
and delta have both been set to relatively large values. Illustrate a
period of time during steady state that includes 3 sends and 3
receives by both the machine under test and the destination
machine. Label the send overhead (Os), receive overhead (Or), network
latency (L), and delta (D).

    Machine Under Test    ---------------------------------------------

                                                                      t ->

    Destination Machine   ---------------------------------------------


The benchmark is usually run by varying delta and nPasses and plotting
avgTime. The following figure shows the results of such a set of
runs. The x axis represents the number of messages sent (nPasses), the
y axis shows the average time per pass around the loop (avgTime), and
each line links points measured with a common delta delay (D).
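
A driver for such a set of runs might look like the sketch below. It is only an illustration: sweep(), the BenchFn type, and fakeBench() are hypothetical names, fakeBench() returns made-up numbers and stands in for the real bench(), and delta is passed as a double rather than a float for convenience.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical signature for a benchmark routine returning avgTime. */
typedef double (*BenchFn)(int nPasses, double delta);

/* Placeholder that returns made-up values; substitute the real bench(). */
static double fakeBench(int nPasses, double delta)
{
  return delta + 1.0 / nPasses;
}

/*
 * Run bench once per (delta, nPasses) pair and print one line per delta,
 * listing (nPasses, avgTime) points ready for plotting.
 * Returns the number of points measured.
 */
static int sweep(BenchFn bench, const int *passes, int nP,
                 const double *deltas, int nD, FILE *out)
{
  int d, p, points = 0;

  assert(bench != NULL && out != NULL);
  for (d = 0; d < nD; d++) {
    fprintf(out, "delta=%g:", deltas[d]);
    for (p = 0; p < nP; p++) {
      fprintf(out, " (%d, %.3f)", passes[p], bench(passes[p], deltas[d]));
      points++;
    }
    fputc('\n', out);
  }
  return points;
}
```

Each printed line corresponds to one line in the figure: a fixed delta with avgTime reported at each value of nPasses.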




Using the data from this figure, answer the following questions.

Grading note: to get credit on this problem, you must explain your
answer. Adding and explaining notations to the figure might be a good
way to explain several of the following answers. In your explanation,
clearly state which lines and points you are looking at.

c) What is the send overhead (Os) for this machine?











d) What is the receive overhead (Or) for this machine?








e) What are the round-trip time (RTT) and network latency (L) for this
machine?







f) What is the gap (g) for this machine (more difficult!)?



Part 2:

(Due Friday April 30, start of class)
  • Paper critique
  • Read the two papers assigned for this class.

    For each of the papers, turn in a short (1/2 page max) critique of the paper answering the following three questions.

    The main goal of this homework question is to encourage you to read the papers in advance of the class discussion so that you come to class prepared to be an active participant in the discussions. Our grading of the above questions will reflect that goal. Note that it is not appropriate to turn in an answer to this question if you have not made an honest effort to read and understand the papers.