CS380L: Advanced Operating Systems

Assignment #3 Proposal

Drivers

The goal of this assignment is to come up with a plan for Assignment 3.

Assignment 3 is a more open-ended assignment, where you have the flexibility to pursue an OS topic or subsystem that interests you. The goal of the first part of this assignment then, is to identify roughly what you will be doing for the rest of the assignment. Unless you have made other arrangements, you should focus on working within the field of Linux device drivers.

You must submit a proposal (~1 page long), which covers:

Whether you will create a new driver (based upon a skeleton), or modify an existing driver

The functionality you plan to include in the driver

Information about how the driver's functionality will be accessed, and

A brief plan on how you will be evaluating your driver's performance/functionality

Provide a detailed timeline of how you plan to build the system. It is really important to have intermediate milestones where some subset of functionality is completely working by date X rather than working on all functionality in parallel and finding out what works on the deadline. Give a list of 4 key milestones.

What infrastructure will you have to build to run the experiments you want to run? What hardware will you need and where will you get it? (Talk to me early if you have an experiment that needs hardware support but you don't know where to get the hardware from.)

There are many resources concerning device drivers (in addition to the actual code/documentation that came with the kernel source). For example, there is a book "Linux Device Drivers" by Rubini & Corbet. (A slightly older 2nd edition is available online at this site.

I will review your proposal, and I might request a revision.

Recreate a result

If you want to recreate and extend a result from a paper, you can do that. Please indicate what paper and exactly how you plan to recreate the experiments and what you hope to learn.

Ideas (related to TxOS)

1) Two system call tables in TxOS

A critical source of overheads for non-transactional code in TxOS are dynamic checks in the kernel whether the current thread is in a transaction or not. In the non-transactional case, these expensive dynamic checks could be eliminated if instead we could compile two versions of the code in question.

However, we do not wish to copy-and-paste large swaths of Linux source code. Instead, we would like to use a compile-time tool to build the two versions and create two complete implementations of each system call (at least the relevant portions).

We hypothesize that a tool like CIL (C Intermediate Language) would be sufficient for this task. Step one would be to familiarize yourself with CIL, and then try to perform a compile time transformation to a single, simple system call, like getpid. From there, you could incrementally replace more complex system calls. 2) Implement speculation using the TxOS infrastructure.
Transactions and speculations (see Nightingale's SOSP 05 and OSDI 06 papers) are similar in that they both provide a degree of isolation to a set of work, but the guarantees are different. Moreover, a substantial portion of the implementation of this isolation in each system functionally overlaps. It would be interesting to see the amount of additional effort required to leverage the transaction implementation to provide speculation.
I would recommend first reproducing the results for hiding the latency of synchronous writes to a local file system, and perhaps next looking at NFS latency. It would be interesting to see if speculation could be used to hide the latency of committing durable transactions. For extra credit, find additional uses for speculation, with or without transactions.

3) Parallelize apt with transactions (probably easier).
TxOS currently provides transactional installation of single packages. Aside from the internal bookkeeping, apt installs rarely touch the same files and could likely be executed concurrently in the common case. In the uncommon case, two packages may touch the same configuration files or have a dependence that makes parallelization unsafe. The problem is that this is difficult to reason about. The safety guarantees of transactions could make this a more realistic option.
The main focus of the project would be to modify apt to use system transactions in TxOS to parallelize a large software upgrade (like a distribution upgrade). Ideally, one would understand apt well enough to leverage its internal knowledge of package dependences to make smart choices, such as scheduling dependent packages in the same process or transaction.
A likely problem may be package database structure introducing false conflicts, and this may require some straightforward modification to the dpkg bookkeeping.
For the particularly ambitious group, TxOS currently only supports flat nesting, but it might be interesting to experiment with closed nesting to commit the install as one unit but use nested transactions internally for safety. Implementing closed nesting support could be a sub-project or may become available if a group shows significant progress.
4) Large system transactions
Currently, system transactions are constrained to the size of memory. A trivial solution for large data writes would be to spill them to swap, but this will likely perform poorly. A more appealing solution might allocate free disk blocks for data and later update the file system bookkeeping. The downside to this solution is the risk of complicating the VFS interface. The goal of the project might be to explore the design tradeoffs between performance of large transactions and introducing complexity into the API.
One might begin by implementing the simple swap-based solution and then the most straightforward interface to acquiring free blocks. After measuring this, find ways to refine the performance or reduce the complexity of the interface. The project should also attempt to compare performance with database systems and BDB, which will likely do better. An interesting nugget might be the crossover point in terms of data set size where a database outperforms OS transactions.

Security idea

Linux security modules (LSM)
Linux security modules (LSM) provide a flexible means to implement different security models in Linux. Use LSM to implement the mandatory access control model. You should be able to enforce the Biba and La-Padula policies. As a concrete example, your implementation should be able to run a process in a sandbox with access restricted to a few files. Look at Ubuntu AppArmor and Flume (SOSP'07).