CS375: Assignment 5
JFlex and CUP

Assigned: Tuesday, February 23rd, 2010
Due: Thursday, March 4th at 11:59pm
Updated: Thursday, February 25th

Note: This is exactly the same as assignment 4, except that here you have to use JFlex and CUP.

== 0. Updates ==

1. There were some typos in the grammar. All the expressions are parenthesized, so ( is actually '('. This has now been fixed.
2. You do not need to worry about redeclaration of variables. Since all local variables are defined at the beginning of a method, you do not need to worry about scoping rules for variables.
3. There will be a main method in the input program. Your parser should not depend on this; it is simply how I will be testing your parsers.
4. There is no overloading of methods.
5. You will need a symbol table for each method, as pointed out in class. One approach is to have a separate class for the symbol table (using hash tables or any other approach). A symbol table object would be created inside your getMethod method and initialized by the getDeclarations method call. Once initialized, it would be passed to (almost) all other method invocations inside getMethod, so that each rule has the appropriate information. Each method has its own symbol table.

== 1. Bali Compiler [100 points] ==

Create a handwritten recursive-descent parser and SaM code generator for the Bali language, using the SaMTokenizer for a lexical analyzer. Your compiler should take a Bali2 program file as input and produce a SaM program that executes the Bali2 program.

=== 1.1 Grammar ===

The following is the grammar specification of the Bali language. In the grammar specification, all lower-case symbols denote a literal value. Additionally, these literals are reserved words (keywords) and cannot be used as identifiers for variables or methods. Non-alphanumeric characters surrounded by single quotes denote the literal consisting of only those non-alphanumeric characters. Upper-case symbols are non-terminals.
'*' means zero or more occurrences. '?' means zero or one occurrence. '[ ]' is the character class construction operator. Parentheses are used to group sequences of symbols together.

A Bali program is a sequence of zero or more method declarations. The only type in this language is int. Each method declaration has a return type, zero or more formals, and a body. The body consists of zero or more variable declarations followed by a sequence of statements. Variables can be initialized when they are declared. Each statement is an assignment statement, a conditional statement, a while loop, a return statement, a break statement, a block, or a null statement. These statements have the usual meaning; a break statement must be lexically nested within one or more loops, and when it is executed, it terminates the execution of the innermost loop in which it is nested.

Expressions are fully parenthesized to avoid problems with associativity and precedence. The literal 'true' is the value 1. The literal 'false' is the value 0. For the purposes of expressions used in conditions, any non-zero value is true and the value zero is false.

Characters between and including '//' and the end of the line are interpreted as a comment and should be discarded.

***************************************************************
PROGRAM   -> METH_DECL*
METH_DECL -> TYPE ID '(' FORMALS? ')' BODY
FORMALS   -> TYPE ID (',' TYPE ID)*
TYPE      -> int
BODY      -> '{' VAR_DECL* STMT* '}'
VAR_DECL  -> TYPE ID ('=' EXP)? (',' ID ('=' EXP)?)* ';'
STMT      -> ASSIGN ';'
           | return EXP ';'
           | if '(' EXP ')' STMT else STMT
           | while '(' EXP ')' STMT
           | break ';'
           | BLOCK
           | ';'
BLOCK     -> '{' STMT* '}'
ASSIGN    -> LOCATION '=' EXP
LOCATION  -> ID
METHOD    -> ID
EXP       -> LOCATION
           | LITERAL
           | METHOD '(' ACTUALS? ')'
           | '(' EXP '+' EXP ')'
           | '(' EXP '-' EXP ')'
           | '(' EXP '*' EXP ')'
           | '(' EXP '/' EXP ')'
           | '(' EXP '&' EXP ')'
           | '(' EXP '|' EXP ')'
           | '(' EXP '<' EXP ')'
           | '(' EXP '>' EXP ')'
           | '(' EXP '=' EXP ')'
           | '(' '-' EXP ')'
           | '(' '!' EXP ')'
           | '(' EXP ')'
ACTUALS   -> EXP (',' EXP)*
LITERAL   -> INT | true | false
INT       -> '-'? [1-9] [0-9]*
ID        -> [a-zA-Z] ( [a-zA-Z] | [0-9] | '_' )*

If a program does not satisfy the grammar above or does not satisfy the textual description of the language, your compiler should print a short, informative error message and/or exit with a non-zero exit status.

=== 1.4 Logistics ===

Make sure that your compiler is in the Java class assignment5.BaliCompiler. Your compiler should take two command-line arguments. The first argument is an input file containing a Bali program. The second argument is an output file that will contain your generated SaM code.

== 2. Turn-in Instructions ==

Assignment submission will be done electronically using the turnin program. First create the following directory structure in your current directory:

assignment5/
assignment5/README           - Lists the students who worked on this assignment
assignment5/src/.../*.java   - The source for all the code in assignment5.jar

Please verify that the assignment5 directory contains the required files (in particular a README file). You can submit your assignment by executing the following command:

    turnin --submit rashid assignment5 assignment5

If you worked on this assignment with a partner, only one person needs to submit the assignment (but please remember to include your partner's name in the README file).
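
To make update 5 more concrete, here is a minimal sketch of a per-method symbol table backed by a hash map. The class name MethodSymbolTable, its method names, and the offset convention (formals at negative offsets, locals counted up from 0) are assumptions made only for illustration; they are not part of the assignment, and the offsets your generated SaM code needs may well differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-method symbol table: maps each variable name to a stack offset.
// One instance is created per Bali method, as described in update 5.
class MethodSymbolTable {
    private final String methodName;
    private final Map<String, Integer> offsets = new LinkedHashMap<>();
    private int nextLocal = 0;

    MethodSymbolTable(String methodName) {
        this.methodName = methodName;
    }

    // Assumed convention: with n formals, the i-th formal (0-based)
    // is recorded at offset i - n.
    void declareFormals(List<String> names) {
        int n = names.size();
        for (int i = 0; i < n; i++) {
            offsets.put(names.get(i), i - n);
        }
    }

    // Assumed convention: locals get offsets 0, 1, 2, ... in declaration order.
    // Redeclaration is not checked (update 2 says it need not be).
    int declareLocal(String name) {
        int offset = nextLocal++;
        offsets.put(name, offset);
        return offset;
    }

    boolean isDeclared(String name) {
        return offsets.containsKey(name);
    }

    // Looks up a variable; an unknown name is reported as an error.
    int offsetOf(String name) {
        Integer offset = offsets.get(name);
        if (offset == null) {
            throw new IllegalArgumentException(
                "undeclared variable '" + name + "' in method " + methodName);
        }
        return offset;
    }

    int localCount() {
        return nextLocal;
    }

    public static void main(String[] args) {
        // Models: int add(int a, int b) { int sum = 0, tmp; ... }
        MethodSymbolTable add = new MethodSymbolTable("add");
        add.declareFormals(Arrays.asList("a", "b"));
        add.declareLocal("sum");
        add.declareLocal("tmp");
        // prints "-2 -1 0 1"
        System.out.println(add.offsetOf("a") + " " + add.offsetOf("b") + " "
            + add.offsetOf("sum") + " " + add.offsetOf("tmp"));
    }
}
```

Because every method gets a fresh table, a name declared in one method is never visible in another, which is all the scoping this language requires.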
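
The lexical rules in section 1.1 (ID, INT, the reserved words, and '//' comments) can be checked with a small sketch like the one below. It uses plain java.util.regex only so that it runs by itself; in a JFlex specification the same character classes would be written as lexer rules instead. The class name BaliLexicalRules and its helper methods are hypothetical. Note that, as the INT rule is written, the string "0" does not match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the lexical rules of the Bali grammar.
class BaliLexicalRules {
    // All lower-case symbols in the grammar are reserved words.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "int", "return", "if", "else", "while", "break", "true", "false"));

    // ID -> [a-zA-Z] ( [a-zA-Z] | [0-9] | '_' )*
    static final Pattern ID = Pattern.compile("[a-zA-Z][a-zA-Z0-9_]*");

    // INT -> '-'? [1-9] [0-9]*
    static final Pattern INT = Pattern.compile("-?[1-9][0-9]*");

    // An identifier must match ID and must not be a reserved word.
    static boolean isIdentifier(String s) {
        return ID.matcher(s).matches() && !KEYWORDS.contains(s);
    }

    static boolean isIntLiteral(String s) {
        return INT.matcher(s).matches();
    }

    // Characters from "//" through the end of the line are discarded.
    static String stripComment(String line) {
        int at = line.indexOf("//");
        return at < 0 ? line : line.substring(0, at);
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("count_1"));   // prints "true"
        System.out.println(isIdentifier("while"));     // prints "false"
        System.out.println(isIntLiteral("-42"));       // prints "true"
        System.out.println(stripComment("x = 1; // set x"));
    }
}
```

Keeping the keyword check separate from the ID pattern matches the grammar's wording: reserved words have the shape of identifiers but cannot be used as variable or method names.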