CS 380C: Programming Assignment 2

      Assignment: Dataflow and optimizations
      Instructor: Kathryn S. McKinley
      Assigned: Wednesday, September 9, 2009
      Due: Friday, October 2, 2009, 5PM

    Objective

    The goal of this assignment is to develop an optimizer that performs dataflow analysis and scalar optimizations. This assignment has the following components:

    1. Construct the Control Flow Graph (CFG). (5%)
    2. Perform Strongly Connected Region (SCR) analysis. (5%)
    3. Perform reaching definitions (20%) and use it to perform simple constant propagation. (25%)
    4. Perform live variable analysis (20%) and use it to perform dead statement elimination. (25%)
    5. Extra credit: perform constant propagation with the Wegman & Zadeck algorithm. (+15%)

    Project Description

    You will add this functionality to the translator you implemented for the Lab 1. Your compiler will call the csc compiler to compile a C subset file named as <filename.c>. The optimizer will take 3-address format code as input, perform the analysis and optimizations on it, and generate optimized 3-address format code as output. The last step of your optimizer will convert this 3-address form to C, using your translator from Lab 1.

    Getting Started On Lab2

    1. Download the example C files and the CFG output at http://www.cs.utexas.edu/users/mckinley/380C/labs/cs380c_lab2.tgz
    2. Decompress the file on the cs unix machines using
      	"tar xvfz cs380c_lab2.tgz".
      
      This command will produce the following directory structure
      	cs380c_lab2
      		examples
      		lab2
      
      The examples directory contains 10 example c files named as <filename.c>, and their CFG output named <filename.ta.cfg>, and two bash scripts (check.sh, check-one-cfg.sh). Many of these are the same as in Lab 1. You can use the provided files to verify your implementation.
    3. Implement your own optimizer in the lab2 directory. Your compiler should accept 3-address code as input from stdin and write output to stdout. Your compiler backend should generate the following output:
      1. CFG
      2. 3-address code - a backend that prints out your optimized (or unoptimized) 3-address format
      3. C - functionality from Lab 1
      4. An optimization report

      Your compiler will be invoked by the script run.sh' and should accept the following command line arguments.
      1. -opt, a comma separated list of optimizations: scp (Simple Constant Propagation), dse (Dead Statement Elimination) and wcp (Wegman/Zadeck Constant Propagation)
      2. -backend, specifies the particular code generator: c, cfg, 3addr, and rep (the optimization report)

      Here are some usage examples.
      1. ./run.sh -backend=c # This is lab 1
      2. ./run.sh -backend=cfg # Part 1 of lab 2
      3. ./run.sh -opt=scp -backend=3addr # Perform simple constant propagation and generate output in 3-address format.
      4. ./run.sh -opt=wcp -backend=c # Perform Wegman constant propagation and produce C code as output.
      5. ./run.sh -opt=scp -backend=cfg # Perform simple constant propagation and write out the cfg (after these optimizations)
      6. ./run.sh -opt=scp -backend=rep # Perform simple constant propagation and write out the optimization report (after these optimizations)
    4. Verify your implementation.
      Modify the scripts in example directory to check your own implementation
    5. Turn in your implementation. (See "Turning In Your Assignment" section for more details.)

    Optimizer Implementation Specification

    1. Construct the Control Flow Graph (CFG). Before performing iterative data-flow analysis, you will need to generate the CFG. The nodes of the CFG are the basic blocks. The basic blocks in turn consist of a sequence of instructions in 3-address format.

      Your implementation should print out the lead instruction's number as the basic block number. It also needs to print out the edges between the basic blocks.The following is the expected output for the gcd.c example:

       
      	Function: 2
      	Basic blocks: 2 3 5 14
      	CFG:
      	2 -> 3
      	3 -> 5 14
      	5 -> 3
      	14 ->
      	Function: 18
      	Basic blocks: 18 30 46
      	CFG:
      	18 -> 30
      	30 -> 46
      	46 ->
      
      The output should be self-explanatory. The numbers are the instruction numbers that start the functions and basic blocks (leaders from the class lecture). The list of basic blocks and CFG successors should be sorted numerically. No optimization report is required for CFG generation.

      For all programs in the examples directory, the CFGs expressed in the above format are given along with the source program. Before turning in your assignment, you should check whether your output matches these files. We will perform additional tests when grading, and recommend you use your test files from Lab 1 and others to check your code.

    2. Perform SCR analysis on the CFG. Produce an optimization report in the following format:
      	Function: 2
      	Basic blocks: 2 6 11 22
      	SCR: 2 6 11
      
    3. Perform simple constant propagation based on reaching definitions. You need to perform reaching definitions analysis and run a simple constant propagation analysis that determines if the reaching definitions to a variable are all the same constant. The execution time of the algorithm is not a criterion. Your analysis should converge and compute the correct results. You must report the number of constants which have been propagated. Produce an optimization report in the following format:
      	Function: 2
      	Number of constants propagated: 3
      	Function: 18
      	Number of constants propagated: 6
      
    4. Perform dead statement elimination based on liveness analysis. First perform liveness analysis and then use the result of the analysis to eliminate statements that produce a result that is not live. You must report the number of statements which have been eliminated. The statements in strongly connected regions (SCR) and those that are not in a SCR need to be counted separately. Produce optimization report in the following format:
      	Function: 2
      	Number of statements eliminated in SCR: 1
      	Number of statements eliminated not in SCR: 2
      	Function: 18
      	Number of statements eliminated in SCR: 2
      	Number of statements eliminated not in SCR: 3
      
    5. Extra Credit. Perform a more sophisticated constant propagation algorithm based on your own insights and/or Wegman & Zadeck, Constant Propagation with Conditional Branches, ACM Transactions on Programming Languages and Systems, 13(2):181-210, April 1991. Report number of constants your algorithm propagates. Produce an optimization report with the same format as simple constant propagation.

    Turning In Your Assignment

    Your assignment should contain the following.

    1. A single tar.gz file named lab2.tgz, which, when extracted, creates the lab2 directory.
    2. The lab2 directory can contain sub-directories.
    3. The lab2 directory should contain the following files:
      1. README - Please include your name(s) and UTEID.
      2. example-scp.c - a synthetic program that tests constant propagation.
      3. example-dse.c - a synthetic program that tests statement elimination.
      4. compile.sh - a script to compile your source code.
      5. run.sh - a script that runs your compiler. This script should read 3-address code as input from stdin and write the output to stdout. The output is specified by the command line arguments described above.

    Turn in your assignment by running the following commands on a UTCS Linux machine.

    	$ # Go the parent directory of the lab1 directory.
    	$ tgz lab2 lab2/
    	$ turnin --submit hadi cs380c-lab2 lab2.tgz
    	$ turnin --list hadi cs380c-lab2
    
    Thanks & Acknowledgements

    These assignments are derived from similar ones developed by Keshav Pingali. Martin Burtscher developed the csc compiler that the assignments use. Thank you Keshav and Martin.