Homework 2: Scalar Optimizations using Data-flow Analysis

Course: CS 380C: Advanced Compiler Techniques (Fall 2007)
Instructor: Keshav Pingali
Assigned: Thursday, September 13, 2007
Due: Thursday, September 27, 2007, 9:30 AM
Updated: Sat Sep 15 14:23:30 CDT 2007

1. Update
---------

The examples directory with the CFG output is available here.
http://www.cs.utexas.edu/users/pingali/CS380C/2007fa/assignments/assignment2/assignment2.tar.gz

Please re-read the assignment carefully again. There was some confusion as to how to generate the output required for various parts of this assignment. This information has been updated. Please take a look at section 3 below.

2. Objective
------------

The goal of this assignment is to perform several scalar optimizations. You will build this on top of the work you did for homework 1. This assignment has the following components:

  1. Construct the Control Flow Graph (CFG).
  2. Perform liveness analysis.
  3. Use the liveness information to perform dead-code elimination.
  4. Perform Constant propagation.

2.1. Construct the Control Flow Graph (CFG)
-------------------------------------------

Before performing iterative data-flow analysis, you will need to generate the CFG. The nodes of the CFG are the basic blocks. The basic blocks in turn consist of a sequence of instructions in 3-address format.

To make evaluating your code easier, you have to write a script 'cfg.sh' that will print the information about the CFG. It should write out the details of each basic block as well as the edges between the basic blocks. This is the output generated for gcd.c (without the BEGIN and END lines):

    BEGIN
    Function: 2
    Basic blocks: 2 8 10 19
    CFG:
    2 -> 8
    8 -> 10 19
    10 -> 8
    19 ->
    Function: 24
    Basic blocks: 24
    CFG:
    24 ->
    END

The output should be self-explanatory. The numbers are the instruction numbers that start the functions and basic blocks. The list of basic blocks and CFG successors should be sorted numerically.
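
One common way to obtain such a listing is to find the block "leaders" first and then derive the edges from the last instruction of each block. The sketch below is only an illustration under stated assumptions: instructions are modelled as hypothetical (number, opcode, target) tuples, and the opcode names 'br', 'cbr' and 'ret' are placeholders rather than the actual 3-address opcodes from homework 1. The toy instruction stream is shaped so that the result has the same blocks and edges as the first function in the gcd.c listing; it is not the real 3-address code for gcd.c.

```python
def build_cfg(instrs):
    """Split ONE function into basic blocks and compute successor edges.

    instrs: ordered list of (number, opcode, target) tuples.
    Assumed (hypothetical) opcodes: 'br' jumps unconditionally, 'cbr' jumps
    conditionally, 'ret' leaves the function; anything else falls through.
    """
    numbers = [n for n, _, _ in instrs]
    leaders = {numbers[0]}                      # first instruction starts a block
    for i, (_, op, target) in enumerate(instrs):
        if op in ("br", "cbr"):
            leaders.add(target)                 # a branch target starts a block
        if op in ("br", "cbr", "ret") and i + 1 < len(instrs):
            leaders.add(numbers[i + 1])         # so does the instruction after a branch
    blocks = sorted(leaders)

    succs = {}
    for idx, leader in enumerate(blocks):
        nxt = blocks[idx + 1] if idx + 1 < len(blocks) else None
        body = [ins for ins in instrs
                if ins[0] >= leader and (nxt is None or ins[0] < nxt)]
        _, op, target = body[-1]                # the last instruction decides the edges
        out = set()
        if op in ("br", "cbr"):
            out.add(target)
        if op not in ("br", "ret") and nxt is not None:
            out.add(nxt)                        # fall-through edge
        succs[leader] = sorted(out)
    return blocks, succs


def format_cfg(func_start, blocks, succs):
    """Render one function in the listing format shown for gcd.c."""
    lines = ["Function: %d" % func_start,
             "Basic blocks: " + " ".join(str(b) for b in blocks),
             "CFG:"]
    for b in blocks:
        lines.append(" ".join(["%d ->" % b] + [str(s) for s in succs[b]]))
    return "\n".join(lines)


def toy_gcd_shape():
    """Hypothetical instruction stream 2..23 with a loop test at 9 and a back edge at 18."""
    special = {9: ("cbr", 19), 18: ("br", 8), 23: ("ret", None)}
    return [(n,) + special.get(n, ("op", None)) for n in range(2, 24)]


if __name__ == "__main__":
    blocks, succs = build_cfg(toy_gcd_shape())
    print(format_cfg(2, blocks, succs))
```

Exact whitespace (for example after "19 ->") is a guess here, so it should be checked against the files in the examples directory.
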

For all programs in the examples directory, the CFGs expressed in the above format are given along with the source program. Before turning in your assignment, you should check whether your output matches these files.

2.2. Perform liveness analysis
------------------------------

For this part of the assignment, you need to perform liveness analysis and identify dead variables (variables whose values will not be used later in the program). You can use a simple iterative algorithm, or a more efficient work-list based algorithm, for your dataflow analysis. The execution time of the algorithm is not a criterion. What matters is whether your analysis converges and computes the right information.

2.3. Perform dead code elimination
----------------------------------

Now that you have identified dead variables and dead instructions, remove these instructions from the CFG. Removing dead instructions will lead to more dead variables, and your code should remove them as well.

2.4. Perform constant propagation
---------------------------------

Perform Simple Constant propagation (SC) as stated in Kildall: "A unified approach to global program optimization", First annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages.
http://portal.acm.org/citation.cfm?id=512945

3. Output
---------

Your compiler should accept 3-address code as input from stdin, and write output to stdout.

Your compiler should implement the following backends:

  1. C (this was homework 1)
  2. CFG (this is section 2.1 of this homework)
  3. 3-address code (a backend that writes output back in 3-address format)

Your compiler should implement the following optimizations:

  1. Dead code elimination
  2. Simple Constant propagation

Your compiler, invoked by the script 'run.sh', should accept the following command line arguments.

  1. -opt, a comma separated list of optimizations. The optimizations to support are dce and scp.
  2. -backend, the backend to be used to write output to.
     The backends to support are c, cfg and 3addr.

Here are some usage scenarios.

  1. ./run.sh -backend=c
     # This was assignment 1
  2. ./run.sh -backend=cfg
     # This is section 2.1 of this homework
  3. ./run.sh -opt=scp -backend=3addr
     # Perform simple constant propagation and generate output in 3-address format.
  4. ./run.sh -opt=scp,dce -backend=c
     # Perform simple constant propagation and dead-code elimination and produce C code as output.
  5. ./run.sh -opt=dce,scp -backend=cfg
     # Perform dead-code elimination and simple constant propagation and write out the CFG (after these optimizations).

4. Turning in your assignment
-----------------------------

Download this tarball.
http://www.cs.utexas.edu/users/pingali/CS380C/2007fa/assignments/assignment2/assignment2.tar.gz

This is organized similarly to homework 1. The examples directory has some additional output files and scripts to check section 2.1 of this assignment.

Your assignment should contain the following:

  1. A single tar.gz file named hw2.tar.gz, which, when extracted, creates directory hw2.
  2. The hw2 directory can contain sub-directories.
  3. The hw2 directory should contain the following files:
     a. README - Please include your name and UTEID here.
     b. Synthetic test programs:
        i.  example-deadcode.c - A synthetic program to test the dead code elimination phase.
        ii. example-scp.c - A synthetic program to test the Constant Propagation phase.
     c. compile.sh - a script to compile your source code.
     d. run.sh - a script that runs your compiler. This script should read 3-address code as input from stdin and write output to stdout. The output is specified by the command line arguments described in section 3.

The hw2 directory already exists with these files in the tarball you downloaded.

Turn in your assignment by running the following commands on a UTCS Linux machine.

    $ # Go to the parent directory of the hw2 directory.
    $ tar -zcvf hw2.tar.gz hw2
    $ turnin --submit suriya cs380c-hw2 hw2.tar.gz
    $ turnin --list suriya cs380c-hw2

Please get a UTCS Linux account as soon as possible. Please use turnin to submit your assignment. Only homeworks that are turned in using the procedure described above will be accepted.

5. Hints
--------

  0. Start early :)
  1. You may find it easier to implement a generic dataflow analysis framework, and plug dead-code elimination and constant propagation into this framework.
  2. The output files were generated by your TA. Please start early and compare your output with the TA's output. If you suspect that these files are incorrect, please send your TA a note. Watch the errata page for corrections.
     http://www.cs.utexas.edu/users/pingali/CS380C/2007fa/errata.html
  3. Watch the clarifications page.
     http://www.cs.utexas.edu/users/pingali/CS380C/2007fa/clarifications.html
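
As an illustration of hint 1, the sketch below shows what the backward, set-based part of such a framework might look like, using block-level liveness as the client. Everything in it is an assumption made for the example: the function name solve_backward, the per-block USE and DEF sets, and the variable names a, b, c and t are all hypothetical, and the CFG is simply the four-block shape from the gcd.c listing. Constant propagation is a forward analysis over a different lattice, so a real framework would also need to parameterize direction, meet operator and transfer function; that part is not shown.

```python
from collections import deque


def solve_backward(blocks, succs, gen, kill):
    """Worklist solver for a backward analysis whose facts are sets.

    Equations (liveness when gen = upward-exposed uses, kill = definitions):
        out[b] = union of in[s] for every successor s of b
        in[b]  = gen[b] | (out[b] - kill[b])
    """
    preds = {b: [] for b in blocks}
    for b in blocks:
        for s in succs[b]:
            preds[s].append(b)

    fact_in = {b: set() for b in blocks}
    fact_out = {b: set() for b in blocks}
    work = deque(reversed(blocks))          # visiting exits first usually converges faster
    while work:
        b = work.popleft()
        out = set()
        for s in succs[b]:
            out |= fact_in[s]
        fact_out[b] = out
        new_in = gen[b] | (out - kill[b])
        if new_in != fact_in[b]:
            fact_in[b] = new_in
            for p in preds[b]:              # only predecessors can be affected
                if p not in work:
                    work.append(p)
    return fact_in, fact_out


# Hypothetical example: the CFG shape from the gcd.c listing with made-up
# use/def sets. Block 2 also defines 't', which nothing ever reads.
BLOCKS = [2, 8, 10, 19]
SUCCS = {2: [8], 8: [10, 19], 10: [8], 19: []}
USE = {2: set(), 8: {"b"}, 10: {"a", "b"}, 19: {"a"}}
DEF = {2: {"a", "b", "t"}, 8: set(), 10: {"a", "b", "c"}, 19: set()}

if __name__ == "__main__":
    live_in, live_out = solve_backward(BLOCKS, SUCCS, USE, DEF)
    for b in BLOCKS:
        print(b, "in:", sorted(live_in[b]), "out:", sorted(live_out[b]))
    # 't' is defined in block 2 but is not live on exit from it, so an
    # instruction-level pass could flag its defining instruction as dead.
    print("dead at exit of block 2:", sorted(DEF[2] - live_out[2]))
```

This works at block granularity only. Dead-code elimination needs liveness at each instruction, which can be recovered by walking each block backward from its live-out set, and the analysis has to be re-run (or updated) after instructions are removed, since removing one dead instruction can make others dead.
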