CS380L: Advanced Operating Systems

Lab #1

The goal of this assignment is to understand the differences between the native host, a container and a VM by measuring the performance of certain programs in these different environments and trying to understand what influences the end-to-end performance.

Before you start

We are interested in doing experimental computer science. We will follow the scientific method, which Wikipedia tells me has been around for a long time. But knowing about science in the abstract is relatively easy; actually doing good science is difficult both to learn and to execute.

Let's start with reproducibility. You will write a report for this lab, and in your report you will include details about your system. Think about what it would take to recreate your results. I won't spell out exactly what information you should include, but please include everything relevant while not bogging down your reader. You should report things like the kernel version for your host and guest system. If you used CloudLab, include details about the hardware of the machine type you used.

Your report should answer every question in this lab and should do so in a way that is clearly labeled.

I have a major pet peeve with excessive digits of precision. Your measurements are usually counts. If you average three counts, don't give me six decimal places of precision even if six digits is the default output format for floats in the language you are using. Decide how many digits are meaningful and then report that many. Also, make your decimal points line up when that makes sense. For example, if you report a mean and a standard deviation, make the decimal places always align so you can see easily if the standard deviation is less than a tenth of the mean (which is a good sign for reproducibility).

I would use C or C++, but you can use whatever programming tools you want. One thing I want you to do, both for this class and in real life, is always check the return code of every single system call you ever make. I know it sounds a bit pedantic, but start the habit now and you will have a happier programming life. For almost every system call, all that means is checking whether the return code is less than zero and, if so, calling perror. When system calls don't work, you really want to know about it early; trust me on this point.
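For example, here is the pattern in its smallest form (a sketch; the file name is just a placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int fd = open("some-file", O_RDONLY);  /* any system call */
  if (fd < 0) {                          /* check the return code */
    perror("open");                      /* say why it failed */
    exit(1);
  }
  return 0;
}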

Getting a container running using Docker

In Lab #0, you learned how to get a VM running using QEMU/KVM. Now you will run a container using Docker. Once your container is running, try installing a package using aptitude to verify that your container has network access.

Tools for measuring programs

Before heading to the main part of your lab, I would like to introduce some tools that you will use to measure your programs. You are encouraged to use any other tools that you think are valuable.

Experiment setup

Measuring mmap

Your first task will be to write a program that mmaps a 1GB region (either file-backed or anonymous) and writes the first byte of each page (chosen in a random order) exactly once. To access each page of the region exactly once in a random order, you might want to generate a random permutation. Here is an example that shuffles an array using the Fisher-Yates shuffle:

#include <stdint.h>
#include <stdlib.h>

/* Fisher-Yates shuffle: permutes array[0..n-1] in place.
 * Call srand() with a fixed seed first if you want a repeatable order. */
void shuffle(uint64_t *array, size_t n)
{
  if (n > 1) {
    for (size_t i = 0; i < n - 1; i++) {
      /* pick j in [i, n-1]; this form avoids the usual modulo bias */
      size_t j = i + rand() / (RAND_MAX / (n - i) + 1);
      uint64_t t = array[j];
      array[j] = array[i];
      array[i] = t;
    }
  }
}

We want to produce deterministic results. You should bind your program to a specific core. Also, for this experiment, we want you to make sure that the entire file is cached in the system's page cache before each run. You can write a simple program that sequentially reads the entire file several times to load it into the page cache. Before each run, use fincore to confirm that the entire file is cached. The point here is to make sure that the standard deviation of your results is small: your results are not deterministic if they vary dramatically from experiment to experiment.
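To make the task concrete, here is a minimal sketch of the measurement core for one of the four settings (file-backed, MAP_SHARED). It assumes a pre-created 1GB file named file-1g and reuses the shuffle function above; the file name, the pinned core number, and the omitted timing code are placeholders you should adapt to your own setup.

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (1UL << 30)   /* 1GB */
#define PAGE_SIZE   4096UL

void shuffle(uint64_t *array, size_t n);   /* from the example above */

int main(void)
{
  /* Pin the process to one core so runs are comparable. */
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(0, &set);
  if (sched_setaffinity(0, sizeof(set), &set) < 0) {
    perror("sched_setaffinity");
    exit(1);
  }

  /* File-backed, MAP_SHARED variant; switch to MAP_PRIVATE or
   * MAP_ANONYMOUS for the other settings. */
  int fd = open("file-1g", O_RDWR);
  if (fd < 0) {
    perror("open");
    exit(1);
  }
  char *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
  if (region == MAP_FAILED) {
    perror("mmap");
    exit(1);
  }

  /* Build a random permutation of the page indices. */
  size_t npages = REGION_SIZE / PAGE_SIZE;
  uint64_t *order = malloc(npages * sizeof(*order));
  if (order == NULL) {
    perror("malloc");
    exit(1);
  }
  for (size_t i = 0; i < npages; i++)
    order[i] = i;
  shuffle(order, npages);

  /* Write the first byte of each page exactly once, in random order.
   * Time this loop (e.g., with clock_gettime); timing code omitted. */
  for (size_t i = 0; i < npages; i++)
    region[order[i] * PAGE_SIZE] = 1;

  munmap(region, REGION_SIZE);
  close(fd);
  free(order);
  return 0;
}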

First, let's do the experiment on your host machine, covering all four settings in the table below (file-backed vs. anonymous, MAP_PRIVATE vs. MAP_SHARED).

Next, let's do the same thing in a container under two settings: one using the host file system directly and one using overlayfs.

Now, let's do the same thing in a VM under two different settings: with EPT enabled and with EPT disabled.

Summarize your results in the table below. Please also include the standard deviation of your results.

                          | file-backed private | file-backed shared | anonymous private | anonymous shared
Native host               |                     |                    |                   |
Container using host FS   |                     |                    |                   |
Container using overlayfs | first run:          | first run:         |                   |
                          | second run:         | second run:        |                   |
VM with EPT               |                     |                    |                   |
VM without EPT            |                     |                    |                   |

Are there any numbers you find interesting? You may use the tools we introduced above to measure your programs and help you understand the end-to-end performance. Please answer the following questions in your report:
  1. Explain any performance differences between file-backed and anonymous mmap.
  2. Is there any difference between MAP_PRIVATE and MAP_SHARED? Explain the differences.
  3. If you find that the file-backed private case in a VM is slow, can you explain why? How could you improve the performance?
  4. Are there any differences in performance among the native host, the container, and a VM using EPT? Whether there is a difference or not, please explain why.
  5. Explain any performance differences between VM with EPT and VM without EPT.
  6. Is there any performance difference for workloads on a container using the host file system and one using overlayfs? Whether there is a difference or not, please explain why.
  7. In the case of a container using overlayfs with file-backed mmap, is there any difference between your first and second run? Explain the differences.

Measuring direct file I/O

The second part of your lab is to fill out the table below. Please also include the standard deviation of your results.

                          | sequential read | sequential write | random read | random write
Native host               |                 |                  |             |
Container using host FS   |                 |                  |             |
Container using overlayfs | first run:      | first run:       | first run:  | first run:
                          | second run:     | second run:      | second run: | second run:
VM with EPT               |                 |                  |             |

We want you to measure the performance of direct file I/O, including random read/write and sequential read/write. Write a program that opens the same file used in the previous sections using O_DIRECT. Construct an offset_array and pass it to the function below. Here, IO_SIZE is a macro that defines the size of each I/O request; we use 4096 bytes as the I/O size. offset_array stores the offset of each I/O request. For sequential read/write, offset_array should look like {0, 4096, 8192, 12288, ..., FILE_SIZE - 4096}. n is the length of the offset_array. For random read/write, generate a random permutation of the sequential offset_array and pass it to the function below. If the opt_read flag is true, we read from the file; if it is false, we write to the file.

In the case of a container using overlayfs, just like what we did in the last experiment, we want you to report the amount of time your program consumes on its first and second run. Again, before you start measuring your program, make sure that lower/file-1g is cached and that the upper directory is empty. You can use the same instructions as in the last experiment to mount an overlayfs and restore its initial state.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define IO_SIZE 4096

/* Issue n requests of IO_SIZE bytes each at the offsets in offset_array.
 * Reads into buf if opt_read is true, writes from buf otherwise. */
void do_file_io(int fd, char *buf,
      uint64_t *offset_array, size_t n, int opt_read)
{
  for (size_t i = 0; i < n; i++) {
    off_t off = lseek(fd, offset_array[i], SEEK_SET);
    if (off == (off_t)-1) {
      perror("lseek");
      exit(-1);
    }
    ssize_t ret;
    if (opt_read)
      ret = read(fd, buf, IO_SIZE);
    else
      ret = write(fd, buf, IO_SIZE);
    if (ret == -1) {
      perror("read/write");
      exit(-1);
    }
  }
}
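One detail that is easy to trip over: with O_DIRECT, the kernel generally requires the user buffer, the I/O size, and the file offsets to be aligned to the underlying block size, so allocate buf with posix_memalign rather than malloc. Here is a minimal sketch of the setup around do_file_io, again assuming the 1GB file file-1g and reusing shuffle from the first experiment; adapt the names and sizes to your own code.

#define _GNU_SOURCE               /* for O_DIRECT */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define IO_SIZE   4096
#define FILE_SIZE (1UL << 30)     /* 1GB */

void shuffle(uint64_t *array, size_t n);   /* from the first experiment */
void do_file_io(int fd, char *buf, uint64_t *offset_array,
                size_t n, int opt_read);   /* shown above */

int main(void)
{
  int fd = open("file-1g", O_RDWR | O_DIRECT);
  if (fd < 0) {
    perror("open");
    exit(1);
  }

  /* O_DIRECT needs an aligned buffer. */
  char *buf;
  if (posix_memalign((void **)&buf, 4096, IO_SIZE) != 0) {
    fprintf(stderr, "posix_memalign failed\n");
    exit(1);
  }

  /* Sequential offsets: 0, 4096, 8192, ..., FILE_SIZE - 4096. */
  size_t n = FILE_SIZE / IO_SIZE;
  uint64_t *offset_array = malloc(n * sizeof(*offset_array));
  if (offset_array == NULL) {
    perror("malloc");
    exit(1);
  }
  for (size_t i = 0; i < n; i++)
    offset_array[i] = i * IO_SIZE;
  /* For the random cases, shuffle the offsets first:
   *   shuffle(offset_array, n);
   */

  do_file_io(fd, buf, offset_array, n, 1 /* read */);

  free(offset_array);
  free(buf);
  close(fd);
  return 0;
}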

Once again we want you to compare the performance measurements and explain the differences. The tools linked above may help you better understand the measured performance. Please answer the following questions in your report:

  1. What does the flag O_DIRECT do?
  2. Please explain any differences between the performance of sequential I/O and random I/O.
  3. Please explain differences between read and write benchmarks.
  4. Are there any differences in performance among the native host, a container using the host filesystem and a VM with EPT? Explain these differences.
  5. Explain any performance differences for workloads on a container using the host file system and one using overlayfs.
  6. In the case of a container using overlayfs, is there any difference between your first and second run? Explain the differences.

Swap

This is the final part of your lab. In this experiment we want you to understand the functionality of swap. You will use the same program you wrote for the first experiment, "Measuring mmap". More specifically, we want you to consider the anonymous private case of your program. Please answer the following questions:

For containers:

  1. Restrict the memory size of your container to 500MB by specifying --memory="500m" in the Docker command. Then run your program. Does it finish successfully? Explain why.
  2. Add --memory-swap="1.5g" to your Docker command (don't remove the --memory="500m" flag). Also, please make sure that swap is enabled on your host machine. You can use free -m to check whether it is enabled. Run your program again. Does it finish successfully? Explain why.

For VMs:

  1. What happens to your program if you only give your VM 500MB of memory? I am assuming that swap is disabled on your VM.
  2. Enable swap on your VM. Configure a 1GB swap region. Run your program again. Does it finish successfully? Explain why.

Report

Your report should be a PDF file submitted to canvas. Here is a description of its contents.

The first section should include everything the reader needs to reproduce all your results. As always, report your experimental platform. Describe the software you are using, such as the version of your kernel, your VM images, and your Docker images.

The second section should include the results of your first experiment and your answers and explanations to the corresponding questions. Use the table we specified above to report your results. Please specify the units of your measurements. Your report should answer every question and should do so in a way that is clearly labeled. Your explanation should include how you used the tools to help you understand the differences in performance. Don't just include your hypothesis! Use tools to measure your programs and support your hypothesis.

The third section should include the results of your second experiment and your answers and explanations to the corresponding questions. The requirements are the same as for the second section.

The final section should include your answers and explanations to the questions in the third experiment. Please also include the output of your program.

Please report how much time you spent on the lab.

Notes

Configure your VM with at least two virtual CPUs, but first confirm that your host system has at least two CPUs.

Check for perf availability in your host system before checking/installing in the guest.

If you run perf list on the command line, it will tell you what counters are supported by your combination of hardware, OS and perf tools.

I'm not sure if it is necessary, but if you get a lot of variation in your results for the experiments that follow, you might want to disable CPU frequency scaling on your system. I would do this in the BIOS, but you can also try a user-level tool such as cpufreq-selector, which lets you set the frequency directly (or perhaps the "-g performance" option would work, I'm not sure): https://manpages.ubuntu.com/manpages/hardy/man1/cpufreq-selector.1.html

Your code will have to run with many different configurations. Consider using getopt, or maybe you would prefer a configuration file, but I find command line options superior for this sort of task as they are more explicit and more easily scripted.
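For example, here is a sketch of how the mmap program might parse its options with getopt (the option letters and their meanings are made up; pick whatever fits your own design):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int use_file = 0, map_shared = 0;
  const char *path = NULL;
  int opt;

  /* -f <file>: file-backed mapping; -s: MAP_SHARED instead of MAP_PRIVATE */
  while ((opt = getopt(argc, argv, "f:s")) != -1) {
    switch (opt) {
    case 'f': use_file = 1; path = optarg; break;
    case 's': map_shared = 1; break;
    default:
      fprintf(stderr, "usage: %s [-f file] [-s]\n", argv[0]);
      exit(1);
    }
  }
  printf("file-backed=%d shared=%d path=%s\n",
         use_file, map_shared, path ? path : "(anonymous)");
  return 0;
}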

You must give us access to a code repository that contains the history of your lab. This will allow us to see your partial progress.

Your code is subject to visual inspection and should be clean and readable. Points will be deducted if your code is too hard to follow. Your code should follow best practices, for example avoiding arbitrary constants and checking error codes. Seriously, check the return value of every system call you ever make.

Please check in your code as you write it. We will look at your revision history as a way to ensure that you are doing the work. We suggest using github and we request that you share access to your repository with us.