CS375: Assignment 6
AST for C-subset

Assigned: Tuesday, March 23rd.
Due: Tuesday, April 6th at 11:59pm
Updated: Tuesday, March 31st.

== 0. Updates ==

* Go over the grammar and let me know if you find anything that needs to be cleared up. I will add answers to general queries here.
* Added updates on the identifier and type in section 3.

=== 0.1 Grading Details ===

I had some queries about how I would grade the assignment, so here are the details.

To help you ensure that you are getting things right, you should implement a "void visitor" for all your classes. What this visitor does is simple: it prints out the tokens used to make the AST node. So if you have an AST node for an expression, the void visitor should print out that expression (semantically equivalent) as it appeared in the source.

The way I will check your program is by having it read an example C file and generate the AST for it. From the AST I can choose (randomly) some AST node and ask it to generate the original code (or call the abstract void visitor on it). Then I will use gcc to build and run both versions and see if the outputs match. I can use the whole program (root node) or any arbitrary node in the AST.

One change that you must implement in the VoidVisitor is to have the return type of each visit method be String instead of void, so that it returns the appropriate String. Each visit method would process the node's children (getting their String representations) and combine them appropriately to produce the String for the current node.

Also, some additional points to add to the grammar:

1) Do not consider anything that is not in the grammar (break, continue, etc.).
2) You should be able to handle arrays as parameters, since the grammar allows for that.
3) Only consider long and structs (structs can contain other structs, but eventually everything is a long).
4) You do not have to make use of all of the classes in the compiler folder; some of them are for later use, and some you will not use at all.
5) You should have an AST node for each non-terminal.
6) Use the example test cases to resolve ambiguity or, better, send me an email.
7) Please include your CSID, UTEID, and name in your README file, followed by any notes/issues you have.
8) Your code (Main method) should be structured as follows:
   i) Read in the program and build the AST in the first part.
   ii) Call the VoidVisitor to generate the String representing the program and print it to the console.
   iii) Separate the two clearly so I can change/check your code as necessary.

== 1. Bali2 Compiler [100 points] ==

Generate an AST for C-Subset, a language that contains a subset of C. Section 3 provides the grammar for the language. You will be provided with grammar files for JFlex and CUP, and with the AST classes. You can add to them as you like. You will also get some example files that you can use to test your code on. Feel free to share examples and to submit interesting ones with your submission.

=== 1.1 JFlex ===

The Lexer.flex file contains the basic symbols you need to get the lexer up and running. The %debug option is enabled by default; it lets you see which token was last accepted if the parser fails. You should remove this option (by deleting the line) once you are done debugging.

=== 1.2 CUP ===

The Parser.cup file contains the grammar definition for nearly all the syntactic constructs covered in the C-Subset definition. Here is what is definitely not covered:

* Structure definitions.
* Procedure calls with more than one parameter.

You should go over the definition and add to it to make sure it conforms to the grammar. As the code stands, it accepts some of the example files (loop.c, prime.c, regslarge.c, sieve.c, sort.c, collatz.c) and not the rest.
You should see what rules are missing and add those to enable the parser to accept all the syntactically correct programs.

Right now, the CUP semantic actions are there just to help you see what rules are being applied by the parser. You should insert appropriate code to generate the AST nodes as the parser applies the rules. Some of the rules have already been updated with the right set of actions for building appropriate data structures. For instance, the variable declaration rules build up a LinkedList of VarDecl nodes so that each declaration gets the type defined by the declaration. This example also shows how you can use token information from the lexer (in this case, the identifier name) in your parser.

=== 1.3 AST Classes ===

The AST classes are laid out as follows. There is an abstract base class Node for all the nodes. The provided set includes most of the classes you need to build an AST. Just to test your code, you can use the visit method. The root of the program AST should be the ProgramNode, which contains the variable and constant declarations as well as the procedure declarations. Remember, the class you use should be specified in the non-terminal section of the CUP file. You can use superclasses to make the AST nodes more flexible.

=== 1.4 Logistics ===

The directory structure is laid out as follows:

  ./compiler/ast/*.java   AST classes (you would add yours here)
  ./examples/*.c          The example C files that you can test your code on
  ./lib/*.jar             The CUP and JFlex jar files
  ./Tester.java           A simple Java program that calls your parser
  ./Lexer.flex            The JFlex specification file
  ./Parser.cup            The CUP specification file
  ./Parser.cup.noAST      Acceptor CUP file (no semantic actions, just System.out.println calls); useful to see if your acceptor is working
  ./test.sh               A Linux test script to check your code on all examples
  ./test.bat              Similar script for DOS/Windows

== 2. Turn-in Instructions ==

Assignment submission will be done electronically using the turnin program. You should use the same directory structure as provided to you. Include all your AST files in the compiler/ast folder. Update your Parser.cup and Lexer.flex files. Please verify that the assignment6 directory contains the required files (in particular a README file). You can submit your assignment by executing the following command:

  turnin --submit rashid assignment6 assignment6

If you worked on this assignment with a partner, only one person needs to submit the assignment (but please remember to include your partner's name in the README file).

== 3. Grammar ==

- Program is the start symbol.
- Use C syntax for identifier names.

Factor = Designator | Number | "(" Expression ")".
Term = Factor {("*" | "/" | "%") Factor}.
SimpleExpr = ["+" | "-"] Term {("+" | "-") Term}.
EqualityExpr = SimpleExpr [("<" | "<=" | ">" | ">=") SimpleExpr].
Expression = EqualityExpr [("==" | "!=") EqualityExpr].
ConstExpression = Expression.
FieldList = VariableDeclaration {VariableDeclaration}.
StructType = "struct" Ident ["{" FieldList "}"].
Type = Ident | StructType.
IdentArray = Ident {"[" ConstExpression "]"}.
IdentList = IdentArray {"," IdentArray}.
VariableDeclaration = Type IdentList ";".
ConstantDeclaration = "const" Type Ident "=" ConstExpression ";".
Designator = Ident {("." Ident) | ("[" Expression "]")}.
Assignment = Designator "=" Expression ";".
ExpList = Expression {"," Expression}.
ProcedureCall = Ident "(" [ExpList] ")" ";".
IfStatement = "if" "(" Expression ")" "{" StatementSequence "}" ["else" "{" StatementSequence "}"].
WhileStatement = "while" "(" Expression ")" "{" StatementSequence "}".
Statement = [Assignment | ProcedureCall | IfStatement | WhileStatement].
StatementSequence = {Statement}.
FPSection = Type IdentArray.
FormalParameters = FPSection {"," FPSection}.
ProcedureHeading = Ident "(" [FormalParameters] ")".
ProcedureBody = {ConstantDeclaration | VariableDeclaration} StatementSequence.
ProcedureDeclaration = "void" ProcedureHeading "{" ProcedureBody "}".
Program = {ConstantDeclaration | VariableDeclaration} ProcedureDeclaration {ProcedureDeclaration}.

You should only consider the long data type, so parameters, variables and structure members can only be long. Structures can also contain other structures as members, but eventually they all contain long members. Ident and Number are terminal symbols.
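
As an illustration of the String-returning visitor described in section 0.1, here is a minimal sketch applied to a few expression nodes from the grammar. All class and method names here (StringVisitor, NumberNode, IdentNode, BinaryNode, PrintVisitor) are hypothetical and chosen only for the example; the classes provided in compiler/ast may be named and structured differently. The sketch also assumes that wrapping every binary expression in parentheses is acceptable, since the printed code only has to be semantically equivalent to the source.

```java
public class PrintVisitorSketch {
    /** Hypothetical visitor interface: each visit method returns source text. */
    interface StringVisitor {
        String visit(NumberNode n);
        String visit(IdentNode n);
        String visit(BinaryNode n);
    }

    /** Stand-in for the abstract base class of all AST nodes. */
    static abstract class Node {
        abstract String accept(StringVisitor v);
    }

    /** Number terminal (every value is a long in this language). */
    static final class NumberNode extends Node {
        final long value;
        NumberNode(long value) { this.value = value; }
        @Override String accept(StringVisitor v) { return v.visit(this); }
    }

    /** Ident terminal used as a simple designator. */
    static final class IdentNode extends Node {
        final String name;
        IdentNode(String name) { this.name = name; }
        @Override String accept(StringVisitor v) { return v.visit(this); }
    }

    /** Binary operation such as "+", "*", "<" or "==". */
    static final class BinaryNode extends Node {
        final String op;
        final Node left;
        final Node right;
        BinaryNode(String op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override String accept(StringVisitor v) { return v.visit(this); }
    }

    /** Builds each node's String from the Strings of its children. */
    static final class PrintVisitor implements StringVisitor {
        @Override public String visit(NumberNode n) { return Long.toString(n.value); }
        @Override public String visit(IdentNode n) { return n.name; }
        @Override public String visit(BinaryNode n) {
            // Always parenthesize so the original grouping survives printing.
            return "(" + n.left.accept(this) + " " + n.op + " "
                    + n.right.accept(this) + ")";
        }
    }

    /** Builds the AST for (1 + 2) * x by hand and prints it back. */
    static String demo() {
        Node sum = new BinaryNode("+", new NumberNode(1), new NumberNode(2));
        Node product = new BinaryNode("*", sum, new IdentNode("x"));
        return product.accept(new PrintVisitor());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ((1 + 2) * x)
    }
}
```

In the actual assignment the nodes would be created by the CUP semantic actions instead of by hand, and the same pattern would extend to statements, declarations and procedures, with each visit method concatenating the Strings returned by its children.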