CS 307: 10. Expression Unparsing

Due: Wednesday, November 14, 2001.

Our final two assignments will implement a program to translate Scheme into other languages: Pascal, C++, and Java. A compiler parses or translates source code into intermediate code and then translates the intermediate code into machine code. A common form of intermediate code is abstract syntax trees, which are similar to Lisp code. Translation from intermediate code back to source code is sometimes called unparsing.

Most of our functions have a parameter language that specifies the desired target language (pascal, c++, or java).

  1. Write a function (atom->lang atm language) that will convert an atom (non-pair) atm into a string that, when printed using display (which prints a string without the string quotes), will be in the correct form for language. Symbols should be converted using symbol->string. Numbers should be converted using number->string. The following special cases need to be handled:

    SchemePascalC++Java
    ()nilNULLnull
    #ttruetruetrue
    #ffalsefalsefalse
    "string"'string'"string""string"

    The conversion for strings must include the quotation characters (single or double quotes) as part of the string since the string will be printed by display, without string quotes. For example, if the original string is "foo" the output for Java should be the string "\"foo\"".

  2. Write a function (op->lang op language) to convert an operator into the proper form for another language. Many operators are identical in the target language and do not need to be changed; if no conversion is defined, the original op can be returned. assoc is a useful function for looking up the operators for the different languages in a quoted table. op->lang should perform the following conversions, returning strings:

    SchemePascalC++Java
    eq?, eqv?=====
    set!:===
    displaywritecout << System.out.print
    newlinewriteln cout << "\n" System.out.println
    sinsinsinMath.sin
    expexpexpMath.exp
    roundroundroundMath.round
    andand&&&&
    oror||||
    notnot!!

  3. Write a function (op-prec op) that will return the precedence of an operator. Precedence determines which operators are evaluated first in a flat expression: if two operators are adjacent, the one with higher precedence is evaluated first. Use precedence values of 5 for + and -, 6 for * and /, 3 for all comparison operators such as <, 1 for set!, 7 for or, 8 for and, and 9 for not. Return 0 if a precedence is not defined. Write your function using assoc.
  4. Write a function (expr->lang x prec language) that will print an expression x in the correct form for another language. Use display to print strings without printing the string-quotes. Call expr->lang recursively for subexpressions.
    1. If the expression x is not a pair, convert it using atom->lang and print using display.
    2. If the expression is a pair and the precedence of the function (operator) is defined, then print the transformed form of the operator (using op->lang) between its operands. Print parentheses around the expression only if the precedence of the operator is less than or equal to the precedence prec of the surrounding expression. Always print parentheses for unary - (- with only one operand). In calling expr->lang for subexpressions, use the precedence of the current operator as the value for prec.
    3. If the operator is vector-ref, print the first operand, display "[", call expr->lang on the second operand, and display "]".
    4. An easy way to handle vector-set! is to change the code from the given form,
          (vector-set! array  index  value)
      
      into Scheme code that expr->lang would print in the correct way:.
          (set! (vector-ref array index) value)
      
    5. For function calls other than the above, treat the expression as a function or procedure call. Print the name of the function, (, the arguments (separated by commas if there are more than one), and ). A procedure call with no arguments should be printed with () in C++ and Java, but with no parentheses in Pascal. You should treat the calls (newline) and (display) specially for C++, eliminating the parentheses.
    6. Test expr->lang, using an initial precedence of 0, on the following expressions:
      3.0
      myvar
      (+ lim 1)
      (set! flag #t)
      (set! x (* d i))
      (set! x (+ a (* b c)))
      (set! x (* a (+ b c)))
      (set! y (* (exp (- x)) (cos (* c x))))
      (set! x (vector-ref v i))
      (vector-set! v i x)
      
  5. Write a function (crtab n) that calls newline and then prints n spaces. This function will be used for indentation.
  6. Write a function (type x) that returns the type of x. The types should include pair, symbol, integer, real, boolean, string, and vector. Note that both integer? and real? return #t for a number such as 3.0; however, by using exact? or inexact? it can be determined that it should be real.
  7. Write a function (type->lang x language) that converts a type for the target language, as follows. #f is not a Scheme type, but we will use it to indicate ``no type''. You can return void or error for the input #f for Pascal.

    SchemePascalC++Java
    integerintegerintlong
    realrealfloatdouble
    #ferrorvoidvoid