CS 380C: Programming Assignment 3

      Assignment: SSA and optimizations
      Instructor: Kathryn S. McKinley
      Assigned: September 28, 2009
      Due: Friday, October 23, 2009, 5PM

    Objective

    The goal of this assignment is to build Static Single Assignment (SSA) form, and perform optimizations on SSA. This assignment has the following components:

    1. Build Static Single Assignment (SSA) form. (25%)
    2. Perform SSA based constant propagation. (30%)
    3. Perform SSA based loop invariant code motion. (30%)
    4. Translate SSA back to non-SSA 3-address code. (15%)
    5. Extra Credits: Perform global common subexpression elimination. (15%)

    Project Description

    You will add this functionality to the optimizer and the translator you have implemented for Lab 1 and 2. Your compiler will call the csc compiler to compile the subset (SC) files. Your new optimizer will be able to take the 3-address code as input, build 3-address SSA code, output 3-address SSA code, perform optimizations on it, translate SSA back to non-SSA 3-address code, generating non-SSA optimized 3-address code.

    Getting Started On Lab3

    1. Download the lab files at http://www.cs.utexas.edu/users/mckinley/380C/labs/cs380c_lab3.tgz
    2. Decompress the file on the cs unix machines using
      	"tar xvfz cs380c_lab3.tgz".
      
      This command will produce the following directory structure
      	cs380c_lab3
      		lab3
      
      There are only two script files and a README file in the lab3 directory. You should reuse all the examples files from lab2.
    3. Implement your own new optimizer in the lab3 directory. As usual, your compiler should accept 3-address code as input from stdin, and write output to stdout. The backend of your compiler needs to generate the following output:
      1. CFG
      2. 3-address code - The backend writes out the optimized or unoptimized 3-address format, as you did previously.
      3. 3-address SSA code - The backend writes out 3-address SSA code.
      4. C - Same Lab 1
      5. Optimization report

      Your compiler will be invoked by the script 'run.sh', and should accept the following command line arguments.
      1. -opt, a comma separated list of transformations. The transformations you will support are:
        • scp (simple constant propagation)
        • cse (common subexpression elimination)
        • licm (loop invariant code motion)
        • ssa (static single assignment)
        For example:
        • -opt=ssa,scp means convert to SSA form and perform constant propagation.
        • -opt=ssa,licm means convert to SSA form and then perform constant propagation.
      2. -backend determines which backend to use: c, cfg, 3addr, rep, and ssa. If ssa is not specified, generate 3-address code without SSA instructions. For example,
        -backend=ssa,3addr means to generate 3-address format code in SSA form.

    Optimizer Implementation Specification

    1. Build SSA form. Implement the SSA algorithm discussed in class. Use the opcode phi for phi functions. Name subscripted variables as variable$subscript. For example, variable i will become i$0, i$1, etc. Take a look at a simple example of a loop, and its representation in 3-address and SSA formats.
      	i = 0;
      	while (i < 10) {
      		i = i + 1;
      	}
      
      3-address format:
      	instr 4: move 0 i#-8
      	instr 5: cmplt i#-8 10
      	instr 6: blbc (5) [10]
      	instr 7: add i#-8 1
      	instr 8: move (7) i#-8
      	instr 9: br [5]
      	instr 10: ret 0
      
      Thw 3-address format with SSA: (Note that SSA does not have offsets. You have to keep track of offsets in your symbol table for memory allocation.)
      	instr 4: move 0 i$0
      	instr 23: phi i$0 i$2
      	instr 24: move (23) i$1
      	instr 5: cmplt i$1 10
      	instr 6: blbc (5) [10]
      	instr 7: add i$1 1
      	instr 8: move (7) i$2
      	instr 9: br [23]
      	instr 10: ret 0
      
      Notice the new instructions instr 23 and 24. Only variable i has a phi instruction. Since virtual registers generated by our frontend csc are all block-local (i.e., written to and read from within the same basic-block), they do not require phi functions. In other words, all of them would be dead. Therefore, do not insert phi functions for block-local virtual variables or other variables with this property. Perform an analysis (using live variables from Lab 2) that detects if the phi function is dead. For this assignment, you do not have to add support to your frontend to read in 3-address code in SSA form.
    2. Perform SSA based Constant Propagation. Perform constant propagation on the CFG in SSA form. You must report the number of constants you propagate. Produce an optimization report in the following format:
       
      	Function: 2
      	Number of constants propagated: 3
      	Function: 18
      	Number of constants propagated: 6
      
    3. Perform SSA based loop invariant code motion. For each statement in the program, if none of the operands points to a phi node or a definition inside a loop, the result of this statement is loop invariant. You should hoist that statement to the loop header. Note that you can and should apply these criteria recursively. Produce an optimization report in the following format:
      	Function: 2
      	Number of statement hoisted: 3
      	Function: 9
      	Number of statement hoisted: 2
      
    4. Translate from SSA to non-SSA 3-address code. After performing the SSA based optimizations, you need to translate from SSA format back to regular 3-address code, and then, if specified, to C code.
    5. Extra Credit: Perform global common subexpression elimination on the CFG in SSA form.

    Turning In Your Assignment

    Your assignment should contain the following.

    1. A single tgz file named lab3.tgz, which, when extracted, creates the lab3 directory.
    2. The lab3 directory can contain sub-directories.
    3. The lab3 directory should contain the following files:
      1. README - Please include your name(s) and UTEID(s) here.
      2. compile.sh - a script to compile your source code.
      3. run.sh - a script that runs your compiler. This script should read 3-address code as input from stdin and write output to stdout. The output is specified by the command line arguments described in the previous section.

    Turn in your assignment by running the following commands on a UTCS Linux machine.

    	$ # Go the parent directory of the lab1 directory.
    	$ tgz lab3 lab3/
    	$ turnin --submit hadi cs380c-lab3 lab3.tgz
    	$ turnin --list hadi cs380c-lab3
     
    Thanks & Acknowledgements

    These assignments are derived from similar ones developed by Keshav Pingali. Martin Burtscher developed the csc compiler that the assignments use. Thank you Keshav and Martin.