CS375: Assignment 2
Expression Compiler

Assigned: Tuesday, February 2nd, 2010
Due: Tuesday, February 9th, 2010, 11:59pm
Updated: Monday, February 4th, 2010

======= Updates ========================

1) There is a modified version of SaM's libraries available at
   http://www.cs.utexas.edu/users/pingali/CS375/2010Sp/SaM/SamR.jar
   This jar file provides a method checkInt() in the SamTokenizer class.
   It returns true if the current token is an integer and false otherwise.
   After the call, the cursor of the tokenizer points to the
   character/symbol immediately following the integer. Feel free to use
   (or not use) this when writing your parser.

2) I have added some more functionality to the SaM tokenizer for those of
   you who would like to use it. The newer version adds methods to peek at
   specific tokens (although you have to do the error checking yourself,
   since a peek method throws an exception if the incorrect token is
   "peeked"). The jar file is available at
   http://www.cs.utexas.edu/users/pingali/CS375/2010Sp/SaM/SamR2.jar
   This file includes peekInteger, peekFloat, peekWord, peekString, and
   peekOp, which should make writing the parser a lot easier.
   The source file for SamTokenizer is also available at
   http://www.cs.utexas.edu/users/pingali/CS375/2010Sp/SaM/SamTokenizer.java

Let me know if you have trouble with the code and I will get back to you
asap. Email me at rashid.kaleem@gmail.com

======== Problem 1 [50 points] ==========

Write an expression compiler that takes fully parenthesized expressions as
shown below and generates SaM programs that evaluate the expressions.
Specifically, the grammar of expressions is:

    EXP -> Integer
    EXP -> ( EXP + EXP )
    EXP -> ( EXP - EXP )
    EXP -> ( EXP ? EXP : EXP )

The last rule, ( EXP0 ? EXP1 : EXP2 ), is conditional evaluation.
Its semantics are:

    (a) Evaluate EXP0
    (b) If EXP0 is zero, the value of this expression is EXP2
    (c) Otherwise, the value of this expression is EXP1

The compiler should be written in Java. It should read in an expression
from standard input and then write the SaM program to standard output. If
the stream read from standard input does not contain a valid expression
(e.g., an empty file or two expressions), your compiler should print an
error message and exit with some non-zero exit value.

In this problem, you are free to write the guts of your compiler in any way
you choose, but you must follow the rules given below so we can run your
program: the main entry point for your compiler should be a class called
assignment2.Problem1, which naturally should contain a static main method.

There are two common ways to get data to standard input. One, you can run
your compiler from the command line and then type your expression in the
terminal. When you are done, type Control-D (Control-Z on Windows) to send
the end-of-file character to your program. Two, you can write your
expression to a file and then use unix pipes, as in the following command:

    cat expression.txt | java -cp ... assignment2.Problem1

The following is a sample template for the compiler.
Feel free to use or not use it as you see fit:

    package assignment2;

    import java.io.InputStreamReader;
    import edu.cornell.cs.sam.io.SamTokenizer;

    public class Problem1 {
        public static void main(String[] args) {
            try {
                SamTokenizer f = new SamTokenizer(new InputStreamReader(System.in));
                String program = getExp(f);
                System.out.println(program);
                System.out.println("STOP");
            } catch (Exception e) {
                // Invalid input: report the error and exit with a non-zero value.
                System.err.println("Error: " + e.getMessage());
                System.exit(1);
            }
        }

        static String getExp(SamTokenizer f) {
            switch (f.peekAtKind()) {
                case CHARACTER:
                case COMMENT:
                case FLOAT:
                case INTEGER:
                case OPERATOR:
                case STRING:
                case WORD:
                default:
                    // Fill in code generation here
                    throw new Error("Operation not supported");
            }
        }
    }

Here are some useful methods of SamTokenizer (full documentation is in the
SaM API javadoc available from the course homepage):

    int getInt()
        Returns an integer from the token stream; fails if the token is not
        an integer. Moves the current position to the next token.

    char getOp()
        Returns an operator from the token stream; fails if the token is not
        an operator. Moves the current position to the next token.

    void match(char c)
        Consumes an operator from the token stream; fails if the token does
        not match the given character. Moves the current position to the
        next token.

======== Problem 2 [50 points] ==========

This problem requires you to implement a recursive-descent parser in a
SYSTEMATIC WAY, using the template for recursive-descent parsers shown in
class. No credit will be given for ad hoc code.

Consider the following grammar:

    S -> E$
    E -> E+T | T
    T -> T*F | F
    F -> (E) | int

This grammar is not LL(1), but it can be shown that the following LL(1)
grammar generates the same language.

    S  -> E$
    E  -> TE'
    E' -> +TE' | EPSILON
    T  -> FT'
    T' -> *FT' | EPSILON
    F  -> (E) | int

EPSILON is the empty string.
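
To see how a left-recursive rule turns into an LL(1) rule with an EPSILON
alternative, here is a minimal sketch on a smaller, hypothetical grammar
(it is not the grammar of this problem and is not the template from class).
The left-recursive rule L -> L;x | x is rewritten as L -> xL' and
L' -> ;xL' | EPSILON. The class name ListRecognizer and all method names
are made up for illustration, the input is a plain string rather than a
SamTokenizer stream, and whitespace is not handled.

```java
public class ListRecognizer {
    // Toy grammar (hypothetical, for illustration only):
    //   S  -> L$
    //   L  -> xL'
    //   L' -> ;xL' | EPSILON
    private final String input; // the text with the end marker '$' appended
    private int pos = 0;        // index of the current lookahead character

    private ListRecognizer(String text) {
        this.input = text + "$";
    }

    /** Returns true if text can be derived from S in the toy grammar. */
    public static boolean accepts(String text) {
        if (text.indexOf('$') >= 0) {
            return false; // '$' is reserved as the end marker
        }
        ListRecognizer r = new ListRecognizer(text);
        return r.parseS() && r.pos == r.input.length();
    }

    private char peek() {
        return input.charAt(pos);
    }

    // Consumes the lookahead if it equals the expected character.
    private boolean match(char expected) {
        if (pos < input.length() && peek() == expected) {
            pos++;
            return true;
        }
        return false;
    }

    // S -> L$
    private boolean parseS() {
        return parseL() && match('$');
    }

    // L -> xL'
    private boolean parseL() {
        return match('x') && parseLPrime();
    }

    // L' -> ;xL' | EPSILON, chosen by looking at one character
    private boolean parseLPrime() {
        switch (peek()) {
            case ';':
                return match(';') && match('x') && parseLPrime();
            case '$':
                return true; // EPSILON: '$' is in FOLLOW(L'), consume nothing
            default:
                return false; // no table entry for this lookahead
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("x;x;x")); // prints true
        System.out.println(accepts("x;"));    // prints false
    }
}
```

Each nonterminal gets one method, and the method for L' picks its
production from the single lookahead character, which is exactly the
decision a row of an LL(1) parsing table encodes.
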
The parsing table for the above LL(1) grammar is available at
http://www.cs.utexas.edu/~pingali/CS375/2010Sp/assignments/parseTable.pdf

Using this parsing table and the table-driven strategy shown in class,
write down a recursive descent parser, using the code skeleton shown in
Problem 1. Your program should return true if the input program can be
produced by the grammars given above, and false otherwise.

== Turn-in Instructions ==

Assignment submission will be done electronically using the turnin program.
First create the following directory structure in your current directory:

    assignment2/
    assignment2/README          - Contains the students who worked on this assignment
    assignment2/assignment2.jar - Compiled code for Problem 1 and Problem 2
    assignment2/src/*.java      - The source for all the code in assignment2.jar

Please include your CS-Id, UTID and name in the README file.

The class files of your compiler should be bundled in a jar file,
assignment2.jar. To create the file, assuming you have a directory called
src which contains all your source code, you may run the following commands
from your shell:

    [[ -d bin ]] || mkdir bin
    find src -name '*.java' | xargs javac -cp SaM-2.6.2.jar -d bin
    jar cf assignment2.jar -C bin .

Make sure that your compiler runs properly with the following command lines:

    java -cp assignment2.jar:SaM-2.6.2.jar assignment2.Problem1
    java -cp assignment2.jar:SaM-2.6.2.jar assignment2.Problem2

You can submit your assignment by executing the following command:

    turnin --submit rashid assignment2 assignment2

And you can verify the submission by executing the following command:

    turnin --list rashid assignment2

If you worked on this assignment with a partner, only one person needs to
submit the assignment (but please remember to include your partner's name
in the README file).