Homework Assignment 6
CS 350c
Unique Number: 52140
Spring, 2017

Given: February 23, 2017
Due: March 9, 2017

This homework assignment concerns writing an assembler for the y86-64 processor. Note that this assignment is a fair amount of work, and it will count as two homework grades. It will require you to write eight to ten pages of code. However, we will give you a complete specification of the assembler in a formal logic.

We recommend that you write your assembler in C -- both because this is the language used for many tools like this and because we can help you. If you use some less-well-known language, you will be more on your own.

The input to our assembler will be a single file that contains a y86 program. The output of our assembler, when successful, will be a file of pairs, each consisting of a 64-bit address and an 8-bit value. The assembler you produce should be capable of generating any program and placing it in any part of the memory. Our y86 simulator, which you will also write, will use the result of your assembler to initialize the simulator memory.

The assembler you write should use two passes. The first pass should produce a map of labels to addresses. This pairing of labels to addresses will be used in the second pass to resolve references to labels.

Input to your y86 assembler should be given as a list of instructions, symbols (labels), and assembler directives. Note that lists are delimited by the characters "(" and ")". To make things simple and easier to read, we ask that when you write y86 assembly code, each instruction or directive be placed on its own line. But this suggestion isn't a requirement -- your assembler should work in any event.

Our assembly language includes labels. Labels are names associated with memory addresses. y86 instructions, such as a jump instruction, will expect a label as an argument. Before we can convert a y86 assembly-language program into a list (or file) of address-byte pairs, we need to know the address for all of the labels.
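To make the first pass concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions, not the posted specification: the names Item, LabelMap, MAX_LABELS, first_pass, and lookup_label are hypothetical, the input is assumed to be already parsed into items whose byte sizes are known, the starting address is assumed to be 0, and treating a duplicate label as an error is our own choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_LABELS 64   /* hypothetical fixed capacity for this sketch */

/* One already-parsed input item: a bare label, or a command of known size. */
typedef struct {
    const char *label;  /* non-NULL for a label, NULL for a command */
    uint64_t    size;   /* bytes the command occupies; ignored for a label */
} Item;

typedef struct {
    const char *name;
    uint64_t    addr;
} LabelEntry;

typedef struct {
    LabelEntry entries[MAX_LABELS];
    size_t     count;
} LabelMap;

/* Pass 1: walk the items in order, recording the address of each label.
   Returns 0 on success, -1 on a duplicate label or a full table. */
int first_pass(const Item *items, size_t n, LabelMap *map) {
    uint64_t addr = 0;  /* assumed starting address */
    assert(items != NULL && map != NULL);
    map->count = 0;
    for (size_t i = 0; i < n; i++) {
        if (items[i].label == NULL) {
            addr += items[i].size;      /* a command advances the address */
            continue;
        }
        for (size_t j = 0; j < map->count; j++) {
            if (strcmp(map->entries[j].name, items[i].label) == 0)
                return -1;              /* duplicate label (our assumption) */
        }
        if (map->count == MAX_LABELS)
            return -1;                  /* table full */
        map->entries[map->count].name = items[i].label;
        map->entries[map->count].addr = addr;
        map->count++;
    }
    return 0;
}

/* Pass 2 helper: resolve a label reference. Returns 0 if found, -1 if not. */
int lookup_label(const LabelMap *map, const char *name, uint64_t *addr_out) {
    assert(map != NULL && name != NULL && addr_out != NULL);
    for (size_t i = 0; i < map->count; i++) {
        if (strcmp(map->entries[i].name, name) == 0) {
            *addr_out = map->entries[i].addr;
            return 0;
        }
    }
    return -1;
}
```

For instance, given a 1-byte nop, the label loop, a 9-byte jmp, the label done, and a 1-byte halt, this sketch would map loop to address 1 and done to address 10. Whatever data structure you choose, every label's address has to be known before any instruction that references it can be encoded.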
This requirement demands that we first read the entire input file, associating each label with an address. Then, we will read the input file again, creating a table pairing addresses with bytes -- this is a specification of the memory.

The assembler must maintain a current memory address, where the next assembler directive or instruction will take effect; this current address is then altered as necessary. For example, if a NOP instruction is found, the byte #x10 will be inserted into an evolving memory image at the current address, and then the current address will be increased by 1 (byte).

Blank space is used to separate fields; blank space is one or more spaces, newlines, and/or tabs.

A well-formed assembler file contains a list of assembler commands. It starts with a "(" (an open parenthesis) and ends with a ")" (a close parenthesis). Between these two characters is a list of assembler commands (listed below). Each line is either a label -- just a name -- or a command. Every command begins with a "(" and ends with a ")". The commands are shown below.

The assembler includes directives that associate a label with the current address, that change the current address, and that create specific data to be placed in the memory.

;                            ; Name for specific
; (pos )                     ; Set the current address to the operand
; (align )                   ; set to (+
;                            ;   (MOD (expt 2 )))
; (space )                   ; Advance the current address by the operand

; Write sequence of bytes, words starting at the current address, and
; advance the current address by the number of bytes required for the data.

; (byte ...)                 ; Write byte(s) ...
; (qword ...)                ; Write 64-bit word(s) ...
; (char #\a #\b #\c ...)     ; Write character(s) #\a #\b #\c ...
; (string "abc...def")       ; Write string as sequence of characters

; Y86-64 control, NOP instructions

; (halt)
; (nop)
; (noop)                     ; Second NOP instruction

; Y86-64 move instructions

; (rrmovq )                  ; Reg-to-reg move Quad
; (cmovle )                  ; Conditional move Less Equal
; (cmovl )                   ; Conditional move Less than
; (cmove )                   ; Conditional move Equal
; (cmovne )                  ; Conditional move Not Equal
; (cmovge )                  ; Conditional move Greater Equal
; (cmovg )                   ; Conditional move Greater

; (irmovq )
; (rmmovq ())
; (mrmovq () )

; Y86-64 logical/arithmetic instructions

; (addq )
; (subq )
; (andq )
; (xorq )
; (iaddq )                   ; Immediate Add

; Y86-64 flow of control instructions

; (jmp )
; (jle )
; (jl )
; (je )
; (jne )
; (jge )
; (jg )
; (call )
; (ret)
; (leave)

; Y86-64 stack instructions

; (pushq )
; (popq )

The result of the assembler should be a file whose first line is only a "(" (an open parenthesis) and whose last line is only a ")" (a close parenthesis). In between these two (first and last) lines should be lines of address-byte pairs, each written as a dotted pair with the address first. Thus, an entire file should look like:

(
(3 . 5)        Associate value 5 with address 3
(7 . 22)       Associate value 22 with address 7
)

To aid your effort, we provide a Lisp-based specification of the assembler. It is posted on the class homework webpage; look there for a link. Thus, you may transliterate the Lisp code into your C code.

To further aid your effort, we provide two example programs, which can also be found on the class homework webpage.
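The output format above can be sketched in C as follows. This is a hedged illustration, not the posted specification: the function names format_pair, qword_to_bytes, and emit_qword are hypothetical, and the sketch assumes that 64-bit words are stored least-significant byte first (little-endian, the usual Y86-64 convention) and that addresses and bytes are printed in decimal, as in the example file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format one address-byte pair in decimal, e.g. "(3 . 5)".
   Returns the number of characters written (snprintf's result). */
int format_pair(char *buf, size_t len, uint64_t addr, uint8_t byte) {
    assert(buf != NULL);
    return snprintf(buf, len, "(%" PRIu64 " . %u)", addr, (unsigned)byte);
}

/* Split a 64-bit value into 8 bytes, least significant byte first.
   Little-endian order is an assumption of this sketch. */
void qword_to_bytes(uint64_t value, uint8_t out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Write one qword as eight address-byte pairs, one pair per line.
   Returns the next free address, i.e. addr + 8. */
uint64_t emit_qword(FILE *fp, uint64_t addr, uint64_t value) {
    uint8_t bytes[8];
    char line[64];
    assert(fp != NULL);
    qword_to_bytes(value, bytes);
    for (int i = 0; i < 8; i++) {
        format_pair(line, sizeof line, addr + (uint64_t)i, bytes[i]);
        fprintf(fp, "%s\n", line);
    }
    return addr + 8;
}
```

A complete writer would print the opening "(" line first, then one pair per line for every byte in the memory image, and finally the closing ")" line. Returning the next free address from emit_qword mirrors the way the assembler advances its current address after each directive or instruction.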