**Software Fault Isolation**
*Project 4 for [CS 361S](.)*
*Due: Friday, April 5th, 11:59 PM*

# Goal

The goal of this project is to gain hands-on experience with SFI implementation, optimization, and sandbox escape. We have built for you a simple SFI toolchain, verifier, and runtime for RISC-V. You will write SFI modules that take advantage of bugs in the verifier to escape the SFI sandbox. You will then modify the verifier and toolchain to fix the bugs.

# Background

You have already read Gang Tan's [SFI survey article](https://www.cse.psu.edu/~gxt29/papers/sfi-final.pdf). You will also want to read Wahbe et al.'s [1993 paper introducing SFI](https://www.cs.cmu.edu/~srini/15-829/readings/sfi.pdf); it describes SFI techniques for processors that are more similar to RISC-V than the x86 and ARM processors targeted by most followup work.

To gain inspiration, you may also want to look at presentations on Native Client security by [Ben Hawkes and Mark Dowd](https://web.archive.org/web/20130208091645/https://www.lateralsecurity.com/downloads/hawkes_HAR_2009_exploiting_native_client.pdf) and by [Chris Rohlf](https://www.blackhat.com/html/bh-us-12/bh-us-12-archives.html#Rohlf).

# The Environment

We will use the class VM you installed in [exercise 1](ex1.html). As with [project 1](proj1.html) and [exercise 3](ex3.html), you will need a working libelf; install the `libelf-dev` package in your VM if you haven't already.

In [proj4.tar.gz](proj4.tar.gz) we have supplied a custom SFI toolchain for RISC-V. Run `make` in the main project directory to build the `target` program that you will exploit as well as a `demo-runner` program that you can use (along with the `demo.sfi` module described below) to experiment with how our RISC-V SFI environment works.

The SFI system is implemented in `sfi.c`; pay particular attention to the `module_check` function and the subroutines it calls.
RISC-V instruction decoding is handled by a modified version of Michael Clark's [RISC-V disassembler](https://github.com/michaeljclark/riscv-disassembler), just as it was for [Exercise 3](ex3.html).

The Makefile includes pattern rules for building SFI modules suitable for loading in host programs like `target`. These modules live inside the `modules` subdirectory. A module starts as a `.c` file (make sure to `#include "trampoline.h"` so your module has the required trampoline). It is compiled to a `.s0` assembly file, which `util/rewrite.pl` rewrites into a `.s` assembly file that (if all goes well) follows the SFI rules. The `.s` file is then assembled and linked into a `.sfi` ELF file that can be loaded with the SFI API.

You will want to run `make modules/demo.sfi` and then examine the `modules/demo.s0`, `modules/demo.s`, and `modules/demo.sfi` that are generated. The last of these is a binary file, but you can see its ELF structure with `readelf -a modules/demo.sfi` and disassemble it with `objdump -d modules/demo.sfi` or using GDB. You can run the demo from the main directory using the command `./demo-runner modules/demo.sfi`. It is definitely worth spending some time with the source and GDB to understand how the SFI runtime is put together.

The `starter.c` file in the `modules` subdirectory contains a simple module with a `doit` entrypoint that `target` looks for and runs. You can copy this starter to `parta.c`, `partb.c`, and `partd.c` and edit those.

## A note about debugging

The GDB installed in our RISC-V VM gets very confused if the global pointer and thread pointer registers (`gp` and `tp`) are zeroed out, something the SFI system does by default when calling into an SFI module. To make it possible to debug calls into an SFI module, we have implemented an `sfi_call_debuggable` entrypoint in `sfi.c` that behaves like `sfi_call` but leaves `gp` and `tp` unchanged during module execution.
You can get `target` to make debuggable SFI calls by passing it the `-d` flag. You will absolutely want to use the `-d` flag to `target` when developing your exploits, but your exploits should not depend on the contents of `gp` and `tp`. We will grade your exploits by running `target` *without* the `-d` flag, so make sure you test that your exploit also works this way before submitting your solution.

## A note about the rewriter

The compiler, rewriter, assembler, and linker are considered outside the security boundary in an SFI system; only the verifier and loader are trusted. That means that you could write a solution to Parts A, B, and D of this project that includes instructions that `rewrite.pl` would have rewritten, by directly editing a `.s` file (or, I suppose, by directly editing a `.sfi` file in a hex editor), and still get credit. However, solutions are possible for Parts A and B that start with a `.c` file with inline assembly and pass the rewriter, so start there.

## A note about ASLR

The `target` executable in our project is compiled as a position independent executable, so in your exploits you will need to account for ASLR. We will test your Part A, Part B, and Part D solutions using a target program that is the same as the `target.c` file in the assignment tarball, just with a different value for the flag in `print_flag`. You should not hardcode absolute addresses for functions or data in the `target` executable (except for SFI module memory, whose layout is guaranteed identical by our SFI loader), but you _may_ hardcode offsets between functions in the `target` text segment and variables in the `target` data segment.

# Part A

The SFI validator is supposed to make sure that all loads and stores use the data pointer register `x27`, which always points to the SFI data region.
But whoever wrote the validator forgot that the RISC-V floating point extensions include load and store instructions for moving data into and out of the floating-point registers, so those instructions aren't listed in `load_store_uses_data_ptr`.

Use this mistake to break out of the SFI sandbox. You will submit a `parta.sfi` module, along with the `parta.s` source (and the `parta.c` and `parta.s0` precursors, if you built `parta.sfi` using our complete toolchain, rather than hand-editing a `.s` file) that, when loaded in the target program, causes it to print out its flag and exit:

```
$ ./target modules/parta.sfi
FLAG: decafbad abad1dea
$
```

The easiest way to make that happen is to modify the saved `ra` in the process control block `pcb` to point to the otherwise uncalled `print_flag` function.

## Fixing the bug

We won't ask you to patch this bug, but it wouldn't be too bad: Just add the floating point memory instructions to `load_store_uses_data_ptr`. That wouldn't keep floating point state from leaking between the host application and the SFI module, so a more comprehensive fix would either forbid floating-point instructions entirely inside SFI modules or save and restore the FPU state on context switch along with the integer registers.

# Part B

As we discussed in class, compressed instructions make SFI properties harder to enforce. Whoever wrote our validator tried to deal with this by forbidding compressed instructions (see the `not_compressed` function), but that turns out not to be enough ...

Use this mistake to break out of the SFI sandbox. You will submit a `partb.sfi` module, along with the `partb.s` source (and the `partb.c` and `partb.s0` precursors, if you built `partb.sfi` using our complete toolchain, rather than hand-editing a `.s` file) that, when loaded in the target program, causes it to print out its flag and exit:

```
$ ./target modules/partb.sfi
FLAG: decafbad abad1dea
$
```

It is probably easiest, again, to overwrite the saved `ra` in `pcb`.
It's tempting to synthesize a jump to `print_flag`, but libc will get unhappy about not having the right value in the global pointer register `gp`, and `sfi_return` will restore that register for you if you let it.

# Part C

In Part C, you will fix the SFI validator bugs that allow SFI modules to execute compressed instructions.

You can reject bad direct jumps (ones whose destination is known at validation time) by adding checks to `direct_branch_allowed`.

Preventing bad _indirect_ jumps is a bit harder. You'll want a stronger invariant to hold for `x26`, the code pointer. This will require a different masking instruction sequence. You'll want to change `rewrite.pl` and `rewrite-fastsp.pl` to produce the new sequence, and change the checks in `mask_after_code_ptr_set` to make sure that the new sequence is present whenever `x26` is changed.

Hint: Remember that RISC-V immediates are always sign extended, so an `andi` with `-4` will clear the two least significant bits of the anded value.

You will submit the `sfi.c` file with your fixes and `rewrite.pl` and `rewrite-fastsp.pl` scripts that produce the masking sequence expected by your modified verifier. (The changes to the rewriting scripts shouldn't interact with the fast SP functionality; you should be able to copy your changes from `rewrite.pl` into the corresponding part of `rewrite-fastsp.pl` and have everything work out.) Make sure to test that your toolchain still runs the demo SFI application.

(If it's been a while since you last wrote Perl, you may want to refer to the [Perl online documentation](https://perldoc.perl.org/).)

# Part D

If you compare `modules/demo.s0` (before SFI rewriting) and `modules/demo.s` (after SFI rewriting), you will see that most of the overhead is due to stack accesses, where a single load or store instruction is replaced by a four-instruction sequence. In class, we discussed an optimization that reduces this overhead, which for this project we will call "fast SP."
To apply the fast SP optimization, we need to guarantee that the stack pointer `sp` always points to the data region. Because of the limited range of immediate offsets to RISC-V memory operations, this guarantee means that any load or store at an address `imm(sp)` (where `imm` is any immediate value) must either be to the data region or to the guard pages above and below the data region. In either case, the access is safe: It will either access memory the module is allowed to access, or crash.

To guarantee that `sp` always points to the data region, we could mask it after any update, using an `and` instruction followed by an `or` instruction, just as we mask the data pointer.

We can also optimize a little further: We can allow instructions of the form `addi sp, sp, imm` (where `imm` is an immediate value) without masking. If `sp` was inside the data region before the `addi` instruction then it can't be too far outside the data region after the instruction (again because of the limited range of RISC-V immediates), and a memory access to an offset from `sp` will access the guard region and crash the module.

The code in `sfi.c` implements the fast SP approach described above and enables it when `check_module` is passed a nonzero `fastsp` argument, as happens when `demo-runner` and `target` are given the `-f` command-line flag.

There's a bit of a problem, though. While the argument above is correct when a single `addi` to `sp` is followed by a load or store through `sp`, it is *not* correct if many `addi` instructions are executed without an intervening load or store through `sp`.

Use this mistake to break out of the SFI sandbox.
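
The flaw can be made concrete with a little arithmetic. The sketch below is a host-side model in C, not code from `sfi.c`: the region base, region size, and guard size are made-up values chosen for illustration, and all helper names are hypothetical. It relies only on the fact that the immediates of `addi` and of load/store offsets are 12-bit signed values, so each one moves an address by at most 2048 bytes down or 2047 bytes up. With these assumed numbers, one `addi` followed by an access stays inside the guard zones, while a long run of `addi` instructions walks `sp` arbitrarily far, at a cost of roughly one instruction per 2 KiB of distance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: illustrative layout only; the real values live in sfi.c. */
#define REGION_BASE UINT64_C(0x10000000) /* hypothetical data region start */
#define REGION_SIZE UINT64_C(0x00100000) /* hypothetical 1 MiB data region */
#define GUARD_SIZE  UINT64_C(0x00002000) /* hypothetical 8 KiB guard zones */

/* Immediates of addi and of load/store offsets are 12-bit signed values. */
#define IMM_MIN (-2048)
#define IMM_MAX 2047

/* The and/or trick below only works for a power-of-two, aligned region. */
static_assert((REGION_SIZE & (REGION_SIZE - 1)) == 0, "size must be 2^k");
static_assert((REGION_BASE & (REGION_SIZE - 1)) == 0, "base must be aligned");

/* and-then-or masking: keep the offset bits, force the region bits. */
uint64_t mask_into_region(uint64_t addr) {
    return (addr & (REGION_SIZE - 1)) | REGION_BASE;
}

/* True if an access at addr hits the data region or an adjacent guard
 * zone, i.e. it either stays inside the sandbox or faults. */
bool in_region_or_guard(uint64_t addr) {
    return addr >= REGION_BASE - GUARD_SIZE &&
           addr < REGION_BASE + REGION_SIZE + GUARD_SIZE;
}

/* Model n back-to-back `addi sp, sp, imm` with no access in between. */
uint64_t addi_repeated(uint64_t sp, int32_t imm, uint64_t n) {
    return sp + (uint64_t)((int64_t)imm * (int64_t)n);
}

/* Fewest maximal positive addi steps needed to raise sp by distance. */
uint64_t addi_steps_needed(uint64_t distance) {
    return (distance + IMM_MAX - 1) / IMM_MAX;
}
```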
You will submit a `partd.sfi` module, along with the `partd.s` source (and the `partd.c` and `partd.s0` precursors, if you built `partd.sfi` using our complete toolchain, rather than hand-editing a `.s` file) that, when loaded in the target program in fast SP mode, causes it to print out its flag and exit:

```
$ ./target -f modules-fastsp/partd.sfi
FLAG: decafbad abad1dea
$
```

It is probably easiest, yet again, to overwrite the saved `ra` in `pcb`.

Note: On our RISC-V VM, programs' text and data segments are loaded quite high in the virtual address space, whereas by default our SFI system loads modules quite low in the virtual address space. This distance means that an exploit for Part D may take several minutes to run. If you find this annoying, you can uncomment the variable assignment `CONFIG_LOADHIGH=y` in the Makefile and rebuild everything (including relinking any SFI modules). With this Makefile change, SFI modules will be loaded much closer to the segments you're trying to overwrite, and Part D exploits should execute nearly instantaneously. Aside from the runtime difference, your exploit for Part D (and other parts) should work with either setting. Indeed, you shouldn't hardcode the SFI module location anywhere in your solution for this project.

# Part E

In Part E, you will fix the SFI validator bugs that allow SFI modules to execute repeated `addi sp, sp, imm` instructions and access memory outside the sandbox.

The fix is simple: We need to make sure that each `addi sp, sp, imm` instruction is immediately followed by a load or store from (an offset to) `sp`. Many `addi` instructions already are. After any that aren't, we can add the instruction `ld zero, 0(sp)`, which loads from `sp` and discards the result; the RISC-V specification guarantees that this instruction will [check that the address in `sp` is on a readable page](https://commaok.xyz/post/riscv_isa_blog_post/) but otherwise have no effect.
Uncomment the line `$check_sp = 1;` in `util/rewrite-fastsp.pl` and it will add the `ld zero, 0(sp)` instruction where needed.

Your job is to modify the `sp_incremented_or_set_and_masked` function in `sfi.c` to require an `addi` to be immediately followed by a load or store from (an offset to) `sp`. The part you want to change is

```c
/* immediate increment of sp okay by itself (?) */
if (dec.op == rv_op_addi && dec.rd == rv_ireg_sp && dec.rs1 == rv_ireg_sp)
    goto accept;
```

You will submit the `sfi.c` file with your fix. Make sure to test that your toolchain still runs the demo SFI application in fast SP mode.

# Logistics

You will submit using Gradescope. You should submit a zip file of your solution, without directory structure. Your solution should include at least the following files:

* `parta.sfi`, `parta.s`, and (if they exist) `parta.s0` and `parta.c`: This is your solution to Part A. Running `make modules/parta.sfi` should reproduce each file from its precursor.
* `partb.sfi`, `partb.s`, and (if they exist) `partb.s0` and `partb.c`: This is your solution to Part B. Running `make modules/partb.sfi` should reproduce each file from its precursor.
* `partd.sfi`, `partd.s`, and (if they exist) `partd.s0` and `partd.c`: This is your solution to Part D. Running `make` in the `modules` directory should reproduce each file from its precursor.
* `sfi.c`, `rewrite.pl`, and `rewrite-fastsp.pl`: This is your solution to Part C and Part E. The `sfi.c` file should include your changes for both Part C (in `direct_branch_allowed` and `mask_after_code_ptr_set`) and Part E (in `sp_incremented_or_set_and_masked`). The `rewrite.pl` and `rewrite-fastsp.pl` files should include your changes from Part C; the `rewrite-fastsp.pl` file should additionally have the `$check_sp = 1;` line uncommented.

# Grading

TBD. Scoring is based on functionality in our testing, though solutions that violate the spirit of the assignment (e.g., use an unrelated bug) may be docked points.