Topics of this Slideset

- Intro to Assembly language
- Programmer visible state
- Y86 Rudiments
- RISC vs. CISC architectures

Instruction Set Architecture

Assembly Language View
- Processor state: registers, memory, etc.
- Instructions and how instructions are encoded

Layer of Abstraction
- Above: how to program machine, processor executes instructions sequentially
- Below: What needs to be built
  - Use variety of tricks to make it run faster
  - E.g., execute multiple instructions simultaneously

Why Y86?

The Y86 is a “toy” machine that is similar to the x86 but much simpler. It is a gentler introduction to assembly level programming than the x86.

- just a few instructions as opposed to hundreds for the x86;
- fewer addressing modes;
- simpler system state;
- absolute addressing.

Everything you learn about the Y86 will apply to the x86 with very little modification. But the main reason we’re bothering with the Y86 is that we’ll be explaining pipelining in that context.

There are various means of giving a semantics or meaning to a programming system.

Probably the most sensible for an assembly (or machine) language is an operational semantics, also known as an interpreter semantics.

That is, we explain the semantics of each possible operation in the language by explaining the effect that execution of the operation has on the machine state.

The most fundamental abstraction for the machine semantics for the x86/Y86 or similar machines is the fetch-decode-execute cycle. This is also called the von Neumann architecture.

The machine repeats the following steps forever:

1. fetch the next instruction from memory (the PC tells you which is next);
2. decode the instruction (in the control unit);
3. execute the instruction, updating the state appropriately;
4. go to step 1.
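
The cycle above can be sketched as a small interpreter loop. The following Python is an illustration only, not the course simulator: the function name `run`, the dictionary-based state, and the restriction to `halt` (byte 0x00) and `nop` (byte 0x10) are assumptions made to keep the sketch short.

```python
def run(memory, max_steps=1000):
    """Fetch-decode-execute loop for a toy machine with only nop and halt."""
    state = {"pc": 0, "status": "AOK", "steps": 0}
    while state["status"] == "AOK" and state["steps"] < max_steps:
        # Fetch: the PC says where the next instruction lives.
        if state["pc"] >= len(memory):
            state["status"] = "ADR"          # ran off the end of memory
            break
        byte = memory[state["pc"]]
        # Decode: split the byte into instruction code and function code.
        icode, ifun = byte >> 4, byte & 0xF
        # Execute: update the state.
        state["steps"] += 1
        if icode == 0x0 and ifun == 0:       # halt
            state["status"] = "HLT"
        elif icode == 0x1 and ifun == 0:     # nop
            state["pc"] += 1
        else:
            state["status"] = "INS"          # not supported in this sketch
    return state


final = run(bytes([0x10, 0x10, 0x00]))       # nop, nop, halt
print(final)
```

Every other byte value is reported as an invalid instruction here; a real Y86 simulator decodes the full instruction set.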

Program registers: almost the same as x86-64, each 64 bits

Condition flags: 1-bit flags set by arithmetic and logical operations. OF: Overflow, ZF: Zero, SF: Negative

Program counter: indicates address of the next instruction

Memory

- Byte-addressable storage array
- Words stored in little-endian byte order

Status code: (status can be AOK, HLT, INS, ADR) to indicate state of program execution.

We’re actually describing two languages: the assembly language and the machine language. There is nearly a 1-1 correspondence between them.

Machine Language Instructions

- 1-10 bytes of information read from memory
  - Can determine instruction length from first byte
  - Not as many instruction types and simpler encoding than x86-64
- Each instruction accesses and modifies some part(s) of the program state.
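
To illustrate how the length can be determined from the first byte, here is a minimal sketch. The table name `LENGTHS` and the function `instr_length` are hypothetical; the lengths follow from the Y86-64 layout (one opcode byte, an optional register byte, and an optional 8-byte constant).

```python
# Instruction length in bytes, keyed by the high nibble of the first byte.
LENGTHS = {
    0x0: 1,    # halt
    0x1: 1,    # nop
    0x2: 2,    # rrmovq / cmovXX: opcode + register byte
    0x3: 10,   # irmovq: opcode + register byte + 8-byte constant
    0x4: 10,   # rmmovq: opcode + register byte + 8-byte displacement
    0x5: 10,   # mrmovq: opcode + register byte + 8-byte displacement
    0x6: 2,    # OPq: opcode + register byte
    0x7: 9,    # jXX: opcode + 8-byte destination
    0x8: 9,    # call: opcode + 8-byte destination
    0x9: 1,    # ret
    0xA: 2,    # pushq: opcode + register byte
    0xB: 2,    # popq: opcode + register byte
}


def instr_length(first_byte):
    """Return the instruction length implied by its first byte."""
    return LENGTHS[first_byte >> 4]


print(instr_length(0x63), instr_length(0x50), instr_length(0x74))
```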

Y86 Instruction Set

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Encoding (byte positions 0 to 9)</th>
</tr>
</thead>
<tbody>
<tr>
<td>halt</td>
<td>0 0</td>
</tr>
<tr>
<td>nop</td>
<td>1 0</td>
</tr>
<tr>
<td>cmovXX rA,rB</td>
<td>2 fn rA rB</td>
</tr>
<tr>
<td>irmovq V,rB</td>
<td>3 0 F rB V</td>
</tr>
<tr>
<td>rmmovq rA,D(rB)</td>
<td>4 0 rA rB D</td>
</tr>
<tr>
<td>mrmovq D(rB),rA</td>
<td>5 0 rA rB D</td>
</tr>
<tr>
<td>OPq rA,rB</td>
<td>6 fn rA rB</td>
</tr>
<tr>
<td>jXX Dest</td>
<td>7 fn Dest</td>
</tr>
<tr>
<td>call Dest</td>
<td>8 0 Dest</td>
</tr>
<tr>
<td>ret</td>
<td>9 0</td>
</tr>
<tr>
<td>pushq rA</td>
<td>A 0 rA F</td>
</tr>
<tr>
<td>popq rA</td>
<td>B 0 rA F</td>
</tr>
</tbody>
</table>

Example from C to Assembly

Suppose we have the following simple C program in file code.c.

```c
long int sumInts(long int n)
{
    /* Add the integers from 1..n. */
    long int i;
    long int sum = 0;
    for (i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
```

(We used `long int` to force usage of the 64-bit registers.) You can compile it using the following commands:

```bash
> gcc -O -S code.c
```

Y86 Assembly Example

This is a hand translation into Y86 assembler:

```assembly
sumInts:
    andq %rdi, %rdi  # test %rdi = n
    jle .L4         # if <= 0, done
    irmovq $1, %rcx  # constant 1
    irmovq $0, %rax  # sum = 0
    irmovq $1, %rdx  # i = 1

.L3:
    rrmovq %rdi, %rsi  # temp = n
    addq %rdx, %rax  # sum += i
    addq %rcx, %rdx  # i += 1
    subq %rdx, %rsi  # temp -= i
    jge .L3         # if >= 0, goto L3
    ret             # else return sum

.L4:
    irmovq $0, %rax  # done
    ret
```

How does it get the argument? How does it return the value?
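
As a way to check the hand translation, the loop can be mirrored register by register in Python. This is a sketch under the assumption that the argument arrives in `%rdi` and the result is left in `%rax`; the function name `sum_ints_y86` is hypothetical, and Python's unbounded integers stand in for 64-bit registers.

```python
def sum_ints_y86(n):
    """Mirror the Y86 sumInts loop one instruction at a time."""
    rdi = n                      # argument register
    if rdi <= 0:                 # andq %rdi, %rdi ; jle .L4
        return 0                 # .L4: irmovq $0, %rax ; ret
    rcx = 1                      # irmovq $1, %rcx   (constant 1)
    rax = 0                      # irmovq $0, %rax   (sum = 0)
    rdx = 1                      # irmovq $1, %rdx   (i = 1)
    while True:                  # .L3:
        rsi = rdi                # rrmovq %rdi, %rsi (temp = n)
        rax += rdx               # addq %rdx, %rax   (sum += i)
        rdx += rcx               # addq %rcx, %rdx   (i += 1)
        rsi -= rdx               # subq %rdx, %rsi   (temp -= i)
        if rsi < 0:              # jge .L3 falls through when temp < 0
            break
    return rax                   # ret: result is in %rax


print(sum_ints_y86(5))
```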

Encoding Registers

Each register has an associated 4-bit id:

<table>
<thead>
<tr>
<th>Register</th>
<th>ID</th>
<th>Register</th>
<th>ID</th>
</tr>
</thead>
<tbody>
<tr>
<td>%rax</td>
<td>0</td>
<td>%r8</td>
<td>8</td>
</tr>
<tr>
<td>%rcx</td>
<td>1</td>
<td>%r9</td>
<td>9</td>
</tr>
<tr>
<td>%rdx</td>
<td>2</td>
<td>%r10</td>
<td>A</td>
</tr>
<tr>
<td>%rbx</td>
<td>3</td>
<td>%r11</td>
<td>B</td>
</tr>
<tr>
<td>%rsp</td>
<td>4</td>
<td>%r12</td>
<td>C</td>
</tr>
<tr>
<td>%rbp</td>
<td>5</td>
<td>%r13</td>
<td>D</td>
</tr>
<tr>
<td>%rsi</td>
<td>6</td>
<td>%r14</td>
<td>E</td>
</tr>
<tr>
<td>%rdi</td>
<td>7</td>
<td>No register</td>
<td>F</td>
</tr>
</tbody>
</table>

Almost the same encoding as in x86-64.

Most of these registers are general purpose; %rsp has special functionality.
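
Two 4-bit ids are packed into a single register-specifier byte, rA in the high nibble and rB in the low nibble. The sketch below assumes the Y86-64 id assignment (0 through E for `%rax` through `%r14`, F for "no register"); the names `REG_ID`, `NO_REG`, and `reg_byte` are illustrative.

```python
REG_ID = {
    "%rax": 0x0, "%rcx": 0x1, "%rdx": 0x2, "%rbx": 0x3,
    "%rsp": 0x4, "%rbp": 0x5, "%rsi": 0x6, "%rdi": 0x7,
    "%r8": 0x8, "%r9": 0x9, "%r10": 0xA, "%r11": 0xB,
    "%r12": 0xC, "%r13": 0xD, "%r14": 0xE,
}
NO_REG = 0xF  # used when an instruction has only one register operand


def reg_byte(ra, rb):
    """Pack rA (high nibble) and rB (low nibble); None means no register."""
    high = NO_REG if ra is None else REG_ID[ra]
    low = NO_REG if rb is None else REG_ID[rb]
    return (high << 4) | low


print(f"{reg_byte('%rax', '%rsi'):02x}")   # register byte of addq %rax, %rsi
print(f"{reg_byte(None, '%rdx'):02x}")     # register byte of irmovq V, %rdx
```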

Y86 Instruction Set (2)

\[
\text{cmovXX} \ rA, rB \quad 2 \ fn \ rA \ rB
\]

Encompasses:

\[
\text{rrmovq} \ rA, rB \quad 2 \ 0 \quad \text{move from register to register}
\]

\[
\text{cmovle} \ rA, rB \quad 2 \ 1 \quad \text{move if less or equal}
\]

\[
\text{cmovl} \ rA, rB \quad 2 \ 2 \quad \text{move if less}
\]

\[
\text{cmove} \ rA, rB \quad 2 \ 3 \quad \text{move if equal}
\]

\[
\text{cmovne} \ rA, rB \quad 2 \ 4 \quad \text{move if not equal}
\]

\[
\text{cmovge} \ rA, rB \quad 2 \ 5 \quad \text{move if greater or equal}
\]

\[
\text{cmovg} \ rA, rB \quad 2 \ 6 \quad \text{move if greater}
\]

Y86 Instruction Set (3)

\[
\text{OPq} \ rA, rB \quad 6 \ fn \ rA \ rB
\]

Encompasses:

\[
\text{addq} \ rA, rB \quad 6 \ 0 \quad \text{add}
\]

\[
\text{subq} \ rA, rB \quad 6 \ 1 \quad \text{subtract}
\]

\[
\text{andq} \ rA, rB \quad 6 \ 2 \quad \text{and}
\]

\[
\text{xorq} \ rA, rB \quad 6 \ 3 \quad \text{exclusive or}
\]

Y86 Instruction Set (4)

\[
\text{jXX} \ Dest \quad 7 \ fn \ Dest
\]

Encompasses:

\[
\text{jmp} \ Dest \quad 7 \ 0 \quad \text{unconditional jump}
\]

\[
\text{jle} \ Dest \quad 7 \ 1 \quad \text{jump if less or equal}
\]

\[
\text{jl} \ Dest \quad 7 \ 2 \quad \text{jump if less}
\]

\[
\text{je} \ Dest \quad 7 \ 3 \quad \text{jump if equal}
\]

\[
\text{jne} \ Dest \quad 7 \ 4 \quad \text{jump if not equal}
\]

\[
\text{jge} \ Dest \quad 7 \ 5 \quad \text{jump if greater or equal}
\]

\[
\text{jg} \ Dest \quad 7 \ 6 \quad \text{jump if greater}
\]
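
One way to read the seven function codes is as predicates over the condition flags. The sketch below assumes the flags were set by a preceding `subq rA, rB` (which computes rB - rA) or a similar operation, so "less" means the signed result was negative once overflow is accounted for. The function name `cond` is illustrative.

```python
def cond(fn, zf, sf, of):
    """Return True if the branch with function code fn is taken."""
    less = sf != of                      # signed "less than"
    table = {
        0: True,                         # jmp: always
        1: less or zf,                   # jle
        2: less,                         # jl
        3: zf,                           # je
        4: not zf,                       # jne
        5: not less,                     # jge
        6: not less and not zf,          # jg
    }
    return table[fn]


# Flags after computing 3 - 5: the result is negative, with no overflow.
print(cond(2, zf=False, sf=True, of=False))   # jl
print(cond(5, zf=False, sf=True, of=False))   # jge
```

The same predicates would apply to the `cmovXX` function codes, which use the same numbering.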

Simple Addressing Modes

- **Immediate**: value
  
  \[ \text{irmovq } \$0xab, \%rbx \]

- **Register**: \( \text{Reg}[R] \)
  
  \[ \text{rrmovq~} \%rcx, \%rbx \]

- **Normal (R)**: \( \text{Mem}[\text{Reg}[R]] \)
  
  - Register \( R \) specifies memory address.
  - This is often called *indirect* addressing.
  
  \[ \text{mrmovq } (\%rcx), \%rax \]

- **Displacement D(R)**: \( \text{Mem}[\text{Reg}[R] + D] \)
  
  - Register \( R \) specifies start of memory region.
  - Constant displacement \( D \) specifies offset.

  \[ \text{mrmovq } 8(\%rcx), \%rdx \]
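
The displacement mode can be made concrete with a short sketch. The helpers `effective_address` and `read_quad`, the memory size, and the stored value 0x1234 are all assumptions chosen for illustration.

```python
MASK64 = (1 << 64) - 1


def effective_address(regs, base, disp=0):
    """Address used by D(R): Reg[R] + D, wrapped to 64 bits."""
    return (regs[base] + disp) & MASK64


def read_quad(memory, address):
    """Read 8 bytes starting at address as a little-endian integer."""
    return int.from_bytes(memory[address:address + 8], "little")


memory = bytearray(0x200)
memory[0x108:0x110] = (0x1234).to_bytes(8, "little")
regs = {"%rcx": 0x100, "%rdx": 0}

# mrmovq 8(%rcx), %rdx
regs["%rdx"] = read_quad(memory, effective_address(regs, "%rcx", 8))
print(hex(regs["%rdx"]))
```

With a displacement of 0 this reduces to the plain indirect mode `(R)`.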

---

Conventions

It’s important to understand how individual operations update the system state. *But that’s not enough!*

Much of the way the Y86/x86 operates is based on a set of *programming conventions*. Without them, you won’t understand how programs work, what the compiler generates, or how your code can interact with code written by others.

---

**Sample Program**

Let’s write a fragment of Y86 assembly code. Our program swaps the 8-byte values starting in memory locations \(0x0100\) (value A) and \(0x0200\) (value B).

```
start:
  xorq  %rax, %rax
  mrmovq 0x100(%rax), %rbx
  mrmovq 0x200(%rax), %rcx
  rmmovq %rcx, 0x100(%rax)
  rmmovq %rbx, 0x200(%rax)
  halt
```

<table>
<thead>
<tr>
<th>Reg.</th>
<th>Use</th>
</tr>
</thead>
<tbody>
<tr>
<td>%rax</td>
<td>0</td>
</tr>
<tr>
<td>%rbx</td>
<td>A</td>
</tr>
<tr>
<td>%rcx</td>
<td>B</td>
</tr>
</tbody>
</table>

It’s usually a good idea to have a table like this to keep track of the use of registers.

Sample Program: Machine Code

Now, we generate the machine code for our sample program. Assume that it is stored in memory starting at location 0x030. I did this by hand, so check for errors!

```
0x030: 6300  # xorq %rax, %rax
0x032: 50300001000000000000  # mrmovq 0x100(%rax), %rbx
0x03c: 50100002000000000000  # mrmovq 0x200(%rax), %rcx
0x046: 40100001000000000000  # rmmovq %rcx, 0x100(%rax)
0x050: 40300002000000000000  # rmmovq %rbx, 0x200(%rax)
0x05a: 00  # halt
```
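
A hand encoding like this can be cross-checked mechanically. The following sketch assembles the six instructions with small helper functions and prints the same address and byte listing. The helper names (`quad`, `opq`, `mrmovq`, `rmmovq`) are hypothetical and cover only what this program needs.

```python
REG = {"%rax": 0, "%rcx": 1, "%rdx": 2, "%rbx": 3}


def quad(value):
    """Encode a 64-bit value as 8 little-endian bytes."""
    return (value & (2**64 - 1)).to_bytes(8, "little")


def opq(fn, ra, rb):
    return bytes([0x60 | fn, REG[ra] << 4 | REG[rb]])


def mrmovq(disp, rb, ra):            # mrmovq D(rB), rA
    return bytes([0x50, REG[ra] << 4 | REG[rb]]) + quad(disp)


def rmmovq(ra, disp, rb):            # rmmovq rA, D(rB)
    return bytes([0x40, REG[ra] << 4 | REG[rb]]) + quad(disp)


program = [
    opq(3, "%rax", "%rax"),          # xorq %rax, %rax
    mrmovq(0x100, "%rax", "%rbx"),
    mrmovq(0x200, "%rax", "%rcx"),
    rmmovq("%rcx", 0x100, "%rax"),
    rmmovq("%rbx", 0x200, "%rax"),
    bytes([0x00]),                   # halt
]

address = 0x030
listing = []
for code in program:
    listing.append(f"0x{address:03x}: {code.hex()}")
    address += len(code)
print("\n".join(listing))
```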

<table>
<thead>
<tr>
<th>Reg.</th>
<th>Use</th>
</tr>
</thead>
<tbody>
<tr>
<td>%rax</td>
<td>0</td>
</tr>
<tr>
<td>%rbx</td>
<td>A</td>
</tr>
<tr>
<td>%rcx</td>
<td>B</td>
</tr>
</tbody>
</table>

A Peek Ahead: Argument Passing

Registers: First 6 arguments, in the order %rdi, %rsi, %rdx, %rcx, %r8, %r9

Stack: arguments 7+

Mnemonic to remember the order: “Diane’s silk dress cost $89.”

Return value

%rax

Only allocate stack space when needed.

Instruction Example

Addition Instruction

Generic form: `addq rA, rB`
Encoded representation: 6 0 rA rB

- Add value in register rA to that in register rB.
  - Store result in register rB
  - Note that Y86 only allows addition to be applied to register data.
- E.g., `addq %rax, %rsi` is encoded as: 60 06. Why?
- Set condition codes based on the result.
- Two byte encoding:
  - First indicates instruction type.
  - Second gives source and destination registers.

What effects does `addq` have on the state?

Effects on the State

You completely characterize an operation by saying how it changes the state.

What effects does `addq %rsi, %rdi` have on the state?

- Set contents of %rdi to the sum of the current contents of %rsi and %rdi.
- Set condition codes based on the result of the sum.
  - OF: set (i.e., is 1) iff the result causes an overflow
  - ZF: set iff the result is zero
  - SF: set iff the result is negative
- Increment the program counter by 2. Why 2?

There is no effect on the memory or the status code.
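
The state changes listed above can be written down as a small function. This is a sketch, not the simulator's code: the dictionary layout of `state` and the function name `addq` are assumptions, and registers hold unsigned 64-bit patterns.

```python
MASK64 = (1 << 64) - 1
SIGN = 1 << 63


def addq(state, ra, rb):
    """Model addq rA, rB: rB <- rB + rA, set flags, advance the PC."""
    a, b = state["regs"][ra], state["regs"][rb]
    result = (a + b) & MASK64
    state["regs"][rb] = result
    state["ZF"] = result == 0
    state["SF"] = bool(result & SIGN)
    # Signed overflow: operands share a sign that the result does not.
    state["OF"] = bool(~(a ^ b) & (a ^ result) & SIGN)
    state["pc"] += 2                 # addq is a two-byte instruction


state = {
    "regs": {"%rsi": 1, "%rdi": (1 << 63) - 1},   # largest positive value
    "pc": 0, "ZF": False, "SF": False, "OF": False,
}
addq(state, "%rsi", "%rdi")
print(hex(state["regs"]["%rdi"]), state["ZF"], state["SF"], state["OF"])
```

Adding 1 to the largest positive value wraps to the most negative one, so SF and OF are both set in this example.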

Arithmetic and Logical Operations

Add

| addq rA, rB | 6 | 0 | rA | rB |

Subtract (rA from rB)

| subq rA, rB | 6 | 1 | rA | rB |

And

| andq rA, rB | 6 | 2 | rA | rB |

Exclusive Or

| xorq rA, rB | 6 | 3 | rA | rB |

Refer to generically as “OPq”
Encodings differ only by “function code”: low-order 4 bits in first instruction byte.
Set condition codes as side effect.

Move Operations

Register to Register

| rrmovq rA, rB | 2 | 0 | rA | rB |

Immediate to Register

| irmovq V, rB | 3 | 0 | F | rB | V |

Register to Memory

| rmmovq rA, D(rB) | 4 | 0 | rA | rB | D |

Memory to Register

| mrmovq D(rB), rA | 5 | 0 | rA | rB | D |

Similar to the x86-64 movq instruction.
Similar format for memory addresses.
Slightly different names to distinguish them.

Move Instruction Examples

<table>
<thead>
<tr>
<th>x86-64</th>
<th>Y86</th>
<th>Y86 Encoding</th>
</tr>
</thead>
<tbody>
<tr>
<td>movq $0xabcd, %rdx</td>
<td>irmovq $0xabcd, %rdx</td>
<td>30 F2 cd ab 00 00 00 00 00 00</td>
</tr>
<tr>
<td>movq %rsi, %rbx</td>
<td>rrmovq %rsi, %rbx</td>
<td>20 63</td>
</tr>
<tr>
<td>movq -12(%rbp), %rcx</td>
<td>mrmovq -12(%rbp), %rcx</td>
<td>50 15 f4 ff ff ff ff ff ff ff</td>
</tr>
<tr>
<td>movq %rsi, 0x41c(%rsp)</td>
<td>rmmovq %rsi, 0x41c(%rsp)</td>
<td>40 64 1c 04 00 00 00 00 00 00</td>
</tr>
<tr>
<td>movq %rax, 12(%rax, %rdx)</td>
<td>none</td>
<td></td>
</tr>
<tr>
<td>movq %rdx, %rsi, 0x41c(%rsp)</td>
<td>none</td>
<td></td>
</tr>
</tbody>
</table>
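
The constant and displacement fields in these encodings are 8 bytes in little-endian order, with negative values in two's complement. A minimal sketch, with hypothetical helper names `quad_le` and `show` and the opcode and register bytes written out by hand:

```python
def quad_le(value):
    """Encode a signed or unsigned 64-bit value as 8 little-endian bytes."""
    return (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")


def show(data):
    """Format bytes as space-separated two-digit hex."""
    return " ".join(f"{byte:02x}" for byte in data)


irmovq = bytes([0x30, 0xF2]) + quad_le(0xABCD)   # irmovq $0xabcd, %rdx
mrmovq = bytes([0x50, 0x15]) + quad_le(-12)      # mrmovq -12(%rbp), %rcx
rmmovq = bytes([0x40, 0x64]) + quad_le(0x41C)    # rmmovq %rsi, 0x41c(%rsp)

for code in (irmovq, mrmovq, rmmovq):
    print(show(code))
```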

The Y86 adds special move instructions to compensate for the lack of certain addressing modes.

Conditional Move Instructions

Move (conditionally)

- Refer to generically as "cmovXX"
- Encodings differ only by function code \( fn \)
- Based on values of condition codes
- Conditionally copy value from source to destination register
- `rrmovq` is the special case with \( fn = 0 \): the move always happens

Example of CMOV

Suppose you want to compile the following C code:

```c
long min (long x, long y) {
    if (x <= y)
        return x;
    else
        return y;
}
```

The following is one potential implementation of this. Notice that there are no jumps.

```assembly
min:
  rrmovq %rdi, %rax # ans <- x
  rrmovq %rdi, %r8  # temp <- x
  subq %rsi, %r8    # if (temp - y) > 0
  cmovg %rsi, %rax  # ans <- y
  ret
```
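
To see why the jump-free version works, the four instructions can be modelled in Python with explicit flags. This is a sketch: the function names `y86_min` and `to_signed` are hypothetical, and the flag formulas assume `subq` sets ZF, SF, and OF as for a signed 64-bit subtraction.

```python
MASK64 = (1 << 64) - 1
SIGN = 1 << 63


def to_signed(value):
    """Interpret a 64-bit pattern as a signed integer."""
    return value - (1 << 64) if value & SIGN else value


def y86_min(x, y):
    rdi, rsi = x & MASK64, y & MASK64
    rax = rdi                                    # rrmovq %rdi, %rax
    r8 = rdi                                     # rrmovq %rdi, %r8
    diff = (r8 - rsi) & MASK64                   # subq %rsi, %r8
    zf = diff == 0
    sf = bool(diff & SIGN)
    of = bool((r8 ^ rsi) & (r8 ^ diff) & SIGN)   # signed overflow of x - y
    if sf == of and not zf:                      # cmovg: taken when x > y
        rax = rsi
    return to_signed(rax)                        # ret


print(y86_min(3, 5), y86_min(5, 3), y86_min(-4, 2))
```

Because the condition uses SF and OF together, the result stays correct even when `x - y` overflows.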

Jump Instructions

Jump (conditionally)

- Refer to generically as "jXX"
- Encodings differ only by function code \( fn \)
- Based on values of condition codes
- Same as x86-64 counterparts
- Encode full destination address (unlike PC-relative addressing in x86-64)

**Jump Instructions**

<table>
<thead>
<tr>
<th>Description</th>
<th>Instruction</th>
<th>Encoding</th>
</tr>
</thead>
<tbody>
<tr>
<td>Jump unconditionally</td>
<td>jmp Dest</td>
<td>7 0 Dest</td>
</tr>
<tr>
<td>Jump when less or equal</td>
<td>jle Dest</td>
<td>7 1 Dest</td>
</tr>
<tr>
<td>Jump when less</td>
<td>jl Dest</td>
<td>7 2 Dest</td>
</tr>
<tr>
<td>Jump when equal</td>
<td>je Dest</td>
<td>7 3 Dest</td>
</tr>
<tr>
<td>Jump when not equal</td>
<td>jne Dest</td>
<td>7 4 Dest</td>
</tr>
<tr>
<td>Jump when greater or equal</td>
<td>jge Dest</td>
<td>7 5 Dest</td>
</tr>
<tr>
<td>Jump when greater</td>
<td>jg Dest</td>
<td>7 6 Dest</td>
</tr>
</tbody>
</table>

**Jump Example**

Suppose you want to count the number of elements in a null-terminated list A whose starting address is in %rdi.

```assembly
len:
    irmovq $0, %rax  # result = 0
    mrmovq (%rdi), %rdx  # val = *A
    andq %rdx, %rdx  # Test val
    je Done  # If 0, goto Done

Loop:
    ...
Done:
    ret
```

**Y86 Program Stack**

- Region of memory holding program data.
- Used in Y86 (and x86-64) for supporting procedure calls.
- Stack top is indicated by %rsp, address of top stack element.
- Stack grows toward lower addresses.
  - Top element is at lowest address in the stack.
  - When pushing, must first decrement stack pointer.
  - When popping, increment stack pointer.

**Stack Operations**

**Push**

- Decrement %rsp by 8.
- Store quad word from rA to memory at %rsp.
- Similar to x86-64 pushq operation.

```
pushq rA
```

**Pop**

- Read quad word from memory at %rsp.
- Save in rA.
- Increment %rsp by 8.
- Similar to x86-64 popq operation.

```
popq rA
```
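
The two operations can be sketched against a byte-addressed memory. The `state` dictionary, the 0x100-byte memory, and the pushed values are assumptions for illustration; the order of the steps follows the bullets above.

```python
def pushq(state, value):
    """Decrement %rsp by 8, then store the quad word at the new top."""
    state["rsp"] -= 8
    rsp = state["rsp"]
    state["mem"][rsp:rsp + 8] = value.to_bytes(8, "little")


def popq(state):
    """Read the quad word at %rsp, then increment %rsp by 8."""
    rsp = state["rsp"]
    value = int.from_bytes(state["mem"][rsp:rsp + 8], "little")
    state["rsp"] += 8
    return value


state = {"rsp": 0x100, "mem": bytearray(0x100)}
pushq(state, 0xAB)
pushq(state, 0xCD)
top = popq(state)
print(hex(top), hex(state["rsp"]))
```

The value pushed last comes off first, and the stack pointer ends 8 bytes below where it started because one element remains.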
**Subroutine Call and Return**

**Subroutine call**

```
call Dest
```

- Push address of next instruction onto stack.
- Start executing instructions at Dest.
- Similar to x86-64 call instruction.

**Subroutine return**

```
ret
```

- Pop value from stack.
- Use as address for next instruction.
- Similar to x86-64 ret instruction.

*Note that call and ret don’t implement parameter/return passing. You have to do that in your code.*
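
A sketch of how `call` and `ret` use the stack, under the assumption that `call` occupies 9 bytes (one opcode byte plus an 8-byte destination). The `state` layout and the addresses 0x0A and 0x40 are chosen only for illustration.

```python
CALL_LENGTH = 9  # 1 opcode byte + 8-byte destination


def call(state, dest):
    """Push the address of the next instruction, then jump to dest."""
    return_address = state["pc"] + CALL_LENGTH
    state["rsp"] -= 8
    rsp = state["rsp"]
    state["mem"][rsp:rsp + 8] = return_address.to_bytes(8, "little")
    state["pc"] = dest


def ret(state):
    """Pop the return address and continue execution there."""
    rsp = state["rsp"]
    state["pc"] = int.from_bytes(state["mem"][rsp:rsp + 8], "little")
    state["rsp"] += 8


state = {"pc": 0x0A, "rsp": 0x100, "mem": bytearray(0x100)}
call(state, 0x40)
after_call = {"pc": state["pc"], "rsp": state["rsp"]}
ret(state)
print(after_call, hex(state["pc"]), hex(state["rsp"]))
```

Nothing here touches the argument or result registers, which is why passing parameters is left to the calling convention.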

---

**Miscellaneous Instructions**

**No operation**

```
nop
```

- Don’t do anything but advance PC.

**Halt execution**

```
halt
```

- Stop executing instructions; set status to HLT.
- x86-64 has a comparable instruction, but you can’t execute it in user mode.
- We will use it to stop the simulator.
- Encoding ensures that program hitting memory initialized to zero will halt.

---

**Status Conditions**

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Code</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td>AOK</td>
<td>1</td>
<td>Normal operation</td>
</tr>
<tr>
<td>HLT</td>
<td>2</td>
<td>Halt inst. encountered</td>
</tr>
<tr>
<td>ADR</td>
<td>3</td>
<td>Bad address (instr. or data)</td>
</tr>
<tr>
<td>INS</td>
<td>4</td>
<td>Invalid instruction</td>
</tr>
</tbody>
</table>

**Desired behavior:**

- If AOK, keep executing
- Otherwise, stop program execution

---

**Writing Y86 Code**

**Try to use the C compiler as much as possible.**

- Write code in C.
- Compile for x86-64 with `gcc -Og -S`.
- Transliterate into Y86 code.
- Modern compilers make this more difficult, because their optimizations can rearrange the code substantially; `-Og` keeps the output closer to the source.

To understand Y86 (or x86) code, you have to know not only the meaning of each instruction but also certain *programming conventions*, especially the *stack discipline*.

- How do you pass arguments to a procedure?
- Where are local variables created?
- How does a procedure return a value?
- How do procedures save and restore the state of the caller?

Coding example: Find number of elements in a null-terminated list.

```c
long len1(long a[]);
```

The answer in this case should be 3.

First try writing typical array code:

```c
/* Count elements in null-terminated list */
long len1(long a[])
{
    long len;
    for (len = 0; a[len]; len++);
    return len;
}
```

Compile with `gcc -Og -S`

Problem: Hard to do array indexing on Y86, since we don’t have scaled addressing modes.

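x86 code generated by the compiler, followed by the hand-written Y86 loop that steps a pointer through the array instead:

```
# x86-64: uses scaled indexed addressing
L3:
    addq $1, %rax
    cmpq $0, (%rdi, %rax, 8)
    jne L3

# Y86: no scaled addressing, so advance the pointer a by 8 each iteration
# (assumes %r8 = 1 and %r9 = 8 were loaded earlier)
Loop:
    addq %r8, %rax      # len++
    addq %r9, %rdi      # a++
    mrmovq (%rdi), %rdx # val = *a
    andq %rdx, %rdx     # Test val
    jne Loop            # If !0, goto Loop

Done:
    ret
```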

Result:

- Compiler generates exact same code as before!
- Compiler converts both versions into the same intermediate form.

**Y86 Sample Program Structure**

```
init:       # Initialization
    ...
    call Main
    halt
    .align 8   # Program data
Array:
    ...
Main:       # Main function
    ...
    call len
    ...
len:        # Length function
    ...
    .pos 0x100 # Place stack
Stack:
```

- Program starts at address 0
- Must set up stack
  - Where located
  - Pointer values
  - Mustn't overwrite data
- Must initialize data

**Y86 Program Structure (2)**

```
init:       # Set up stack pointer
    irmovq Stack, %rsp
    # Execute main program
    call Main
    # Terminate
    halt
# Array of 4 elements + final 0
.align 8
Array:
    .quad 0x000d000000000000
    .quad 0x00c0000000000000
    .quad 0x0b00000000000000
    .quad 0x0a00000000000000
    .quad 0
```

- Program starts at address 0
- Must set up stack
- Must initialize data
- Can use symbolic names

**Y86 Program Structure (3)**

```
Main:
    irmovq Array, %rdi
    # call len(Array)
    call len
    ret
```

Set up call to len:
- Follow x86-64 procedure conventions
- Pass array address as argument

**Y86 Assembler**

A program that translates Y86 code into machine language.
- 1-1 mapping of instructions to encodings.
- Resolves symbolic names.
- Translation is linear.
- Assembler directives give additional control.

Some common directives:
- `.pos x`: subsequent lines of code start at address x.
- `.align x`: align the next line to an x-byte boundary (e.g.,
  long ints should be at a quadword address, divisible by 8).
- `.quad x`: put an 8-byte value x at the current address; a way
  to initialize a value.
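
The effect of these directives on address assignment can be sketched as a single linear pass. The item format, the function names `align` and `layout`, and the sample item list are assumptions for illustration; a real assembler also emits the bytes and patches symbol references.

```python
def align(address, boundary):
    """Round address up to the next multiple of boundary."""
    return (address + boundary - 1) // boundary * boundary


def layout(items):
    """Assign addresses in one linear pass and return the symbol table."""
    address, symbols = 0, {}
    for kind, arg in items:
        if kind == "label":
            symbols[arg] = address
        elif kind == ".pos":
            address = arg
        elif kind == ".align":
            address = align(address, arg)
        elif kind == ".quad":
            address += 8
        elif kind == "code":             # arg is the instruction length
            address += arg
    return symbols


items = [
    ("code", 10), ("code", 9), ("code", 1),   # irmovq, call, halt
    (".align", 8),
    ("label", "Array"),
    (".quad", 1), (".quad", 2), (".quad", 0),
    (".pos", 0x100),
    ("label", "Stack"),
]
symbols = layout(items)
print({name: hex(value) for name, value in symbols.items()})
```

The three instructions occupy addresses 0x00 through 0x13, so `.align 8` moves the next item up to 0x18.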

Assembling Y86 Program

Generates “object code” file len.yo

Actually looks like disassembler output

<table>
<thead>
<tr>
<th>Address</th>
<th>Assembly</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x054:</td>
<td>len:</td>
</tr>
<tr>
<td>0x054:</td>
<td>irmovq $1, %r8</td>
</tr>
<tr>
<td>0x05e:</td>
<td>irmovq $8, %r9</td>
</tr>
<tr>
<td>0x068:</td>
<td>irmovq $0, %rax</td>
</tr>
<tr>
<td>0x072:</td>
<td>mrmovq (%rdi), %rdx</td>
</tr>
<tr>
<td>0x07c:</td>
<td>andq %rdx, %rdx</td>
</tr>
<tr>
<td>0x07e:</td>
<td>je Done</td>
</tr>
<tr>
<td>0x087:</td>
<td>Loop:</td>
</tr>
<tr>
<td>0x087:</td>
<td>addq %r8, %rax</td>
</tr>
<tr>
<td>0x089:</td>
<td>addq %r9, %rdi</td>
</tr>
<tr>
<td>0x08b:</td>
<td>mrmovq (%rdi), %rdx</td>
</tr>
<tr>
<td>0x095:</td>
<td>andq %rdx, %rdx</td>
</tr>
<tr>
<td>0x097:</td>
<td>jne Loop</td>
</tr>
<tr>
<td>0x0a0:</td>
<td>Done:</td>
</tr>
<tr>
<td>0x0a0:</td>
<td>ret</td>
</tr>
</tbody>
</table>
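
The address column follows directly from the instruction lengths. As a sketch, assuming `len` starts at 0x054 and using the Y86-64 lengths (10 bytes for `irmovq` and `mrmovq`, 2 for `OPq`, 9 for `jXX`, 1 for `ret`), the addresses can be recomputed like this:

```python
listing = [
    ("irmovq $1, %r8", 10),
    ("irmovq $8, %r9", 10),
    ("irmovq $0, %rax", 10),
    ("mrmovq (%rdi), %rdx", 10),
    ("andq %rdx, %rdx", 2),
    ("je Done", 9),
    ("addq %r8, %rax", 2),      # Loop:
    ("addq %r9, %rdi", 2),
    ("mrmovq (%rdi), %rdx", 10),
    ("andq %rdx, %rdx", 2),
    ("jne Loop", 9),
    ("ret", 1),                 # Done:
]

address = 0x054
addresses = []
for text, length in listing:
    addresses.append(address)
    print(f"0x{address:03x}: {text}")
    address += length
```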

Simulating Y86 Programs

Instruction set simulator

Computes effect of each instruction on process state

Prints changes in state from original

Stopped in 33 steps at PC = 0x13, Status 'HLT', CC Z=1 S=0 O=0

Changes to registers:

%rax: 0x0000000000000000 0x0000000000000004
%rsp: 0x0000000000000000 0x0000000000000100
%rdi: 0x0000000000000000 0x0000000000000038
%r8: 0x0000000000000000 0x0000000000000001
%r9: 0x0000000000000000 0x0000000000000008

Changes to memory:

0x00f0: 0x0000000000000000 0x0000000000000053
0x00f8: 0x0000000000000000 0x0000000000000013
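
Reporting "changes in state from original" amounts to comparing a snapshot taken before the run with one taken after. A minimal sketch, with a hypothetical function name `state_changes` and made-up snapshots limited to three registers:

```python
def state_changes(before, after):
    """List the entries whose value differs, as 'name: old new' in hex."""
    return [
        f"{name}: 0x{before[name]:016x} 0x{after[name]:016x}"
        for name in before
        if before[name] != after[name]
    ]


before = {"%rax": 0, "%rdx": 0, "%rsp": 0}
after = {"%rax": 4, "%rdx": 0, "%rsp": 0x100}
lines = state_changes(before, after)
print("\n".join(lines))
```

Registers that end with their original value, such as `%rdx` here, are omitted from the report.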

CISC Instruction Sets

Complex Instruction Set Computer

- Dominant ISA style through the 80s.
- Lots of instructions, of variable length.
- Stack as mechanism for supporting functions:
  - Explicit push and pop instructions.
- ALU instructions can access memory.
  - E.g., addq %rax, 12(%rbx, %rcx, 8)
  - Requires memory read and write in one instruction execution.
  - Some ISAs had much more complex address calculations.
- Set condition codes as a side effect of other instructions.
- Basic philosophy:
  - Memory is expensive;
  - Instructions to support high-level language constructs.

RISC Instruction Sets

Reduced Instruction Set Computer

- Originated in IBM Research; popularized in Berkeley and Stanford projects.
- Few, simple instructions.
  - Takes more instructions to execute a task, but faster and simpler implementation
  - Fixed length instructions for simpler decoding
- Register-oriented ISA
  - More registers (32 typically)
  - Stack is back-up for registers
- Only load and store instructions can access memory (mrmovq and rmmovq in Y86).
- Explicit test instructions set condition values in register.
- Philosophy: KISS

## Original Debate
- Strong opinions!
- CISC proponents: easy for compiler, fewer code bytes
- RISC proponents: better for optimizing compilers, can make run fast with simple chip design

## Current Status
- For desktop processors, choice of ISA not a technical issue
  - With enough hardware, can make anything run fast
  - Code compatibility more important
- x86-64 adopted many RISC features
  - More registers; use them for argument passing
- For embedded processors, RISC makes sense
  - Smaller, cheaper, less power
  - Most cell phones use ARM processor

## Y86-64 Instruction Set Architecture
- Similar state and instructions to x86-64
- Simpler encodings
- Somewhere between CISC and RISC

## How Important is ISA Design?
- Less now than before: with enough hardware, can make almost anything run fast!