# MACHINE-LEVEL PROGRAMMING I: BASICS

CS 429H: SYSTEMS I

**Instructor:** **Emmett Witchel**

# Today: Machine Programming I: Basics

- History of Intel processors and architectures
- C, assembly, machine code
- Assembly Basics: Registers, operands, move

### Intel x86 Processors

- Totally dominate laptop/desktop/server market
- Evolutionary design
  - Backwards compatible all the way back to the 8086, introduced in 1978
  - Added more features as time goes on
- Complex instruction set computer (CISC)
  - Many different instructions with many different formats
    - But, only small subset encountered with Linux programs
  - Hard to match performance of Reduced Instruction Set Computers (RISC)
  - But, Intel has done just that!
    - In terms of speed. Less so for low power.

### Intel x86 Evolution: Milestones

| Name | Date | Transistors | MHz |
|------------|------|-------------|-----------|
| 8086 | 1978 | 29K | 5-10 |
| 386 | 1985 | 275K | 16-33 |
| Pentium 4F | | 125M | 2800-3800 |
| Core i7 | 2008 | 731M | 2667-3333 |

- 8086
  - First 16-bit processor. Basis for IBM PC & DOS
  - 1MB address space
- 386
  - Added "flat addressing"
  - Capable of running Unix
  - [...]ed in later models
- Pentium 4F
  - Referred to as x86-64

### Intel x86 Processors: Overview

IA: often redefined as latest Intel architecture

### Intel x86 Processors, contd.
### Machine Evolution

| Processor | Year |
|-------------|------|
| 386 | 1985 |
| Pentium | 1993 |
| Pentium/MMX | 1997 |
| PentiumPro | 1995 |
| Pentium III | 1999 |
| Pentium 4 | 2001 |
| Core 2 Duo | 2006 |
| Core i7 | 2008 |

- Added Features
  - Instructions to support multimedia operations
    - Parallel operations on 1, 2, and 4-byte data, both integer & FP
  - Instructions to enable more efficient conditional operations
- Linux/GCC Evolution
  - Two major steps: 1) support 32-bit 386. 2) support 64-bit x86-64

# x86 Clones: Advanced Micro Devices (AMD)

- Historically
  - AMD has followed just behind Intel
  - A little bit slower, a lot cheaper
- Then
  - Recruited top circuit designers from Digital Equipment Corp. and other downward trending companies
  - Built Opteron: tough competitor to Pentium 4
  - Developed x86-64, their own extension to 64 bits

### Intel's 64-Bit

- Intel Attempted Radical Shift from IA32 to IA64
  - Totally different architecture (Itanium)
  - Executes IA32 code only as legacy
  - Performance disappointing
- AMD Stepped in with Evolutionary Solution
  - x86-64 (now called "AMD64")
- Intel Felt Obligated to Focus on IA64
  - Hard to admit mistake or that AMD is better
- 2004: Intel Announces EM64T extension to IA32
  - Extended Memory 64-bit Technology
  - Almost identical to x86-64!
- All but low-end x86 processors support x86-64
  - But, lots of code still runs in 32-bit mode

# Our Coverage

- IA32
  - The traditional x86
- x86-64/EM64T
  - The emerging standard
- Presentation
  - Book presents IA32 in Sections 3.1-3.12
  - Covers x86-64 in 3.13

### Definitions

- Architecture: (also instruction set architecture: ISA) The parts of a processor design that one needs to understand to write assembly code.
  - Examples: instruction set specification, registers.
- Microarchitecture: Implementation of the architecture.
  - Examples: cache sizes and core frequency.

Example ISAs (Intel): x86, IA, IPF

### Assembly Programmer's View

- Register file
  - Heavily used program data
- Condition codes
  - Store status information about most recent arithmetic operation
  - Used for conditional branching
- Memory
  - Byte addressable array
  - Code, user data, (some) OS data
  - Includes stack used to support procedures

# Program to Process

- We write a program in e.g., C.
- A compiler turns that program into an instruction list.
- The CPU interprets the instruction list (which is more a graph of basic blocks).

```
void X (int b) {
  if(b == 1) {
  ...
int main() {
  int a = 2;
  X(a);
}
```

# Process in Memory

- Program to process.
  - What you wrote

```
void X (int b) {
  if(b == 1) {
  ...
int main() {
  int a = 2;
  X(a);
}
```

- What is in memory.

```
Stack:  main; a = 2
        X; b = 2
Heap
Code:   void X (int b) {
          if(b == 1) {
        int main() {
          int a = 2;
          X(a);
```

- What must the OS track for a process?

### A shell forks and execs a calculator

```
// Slide pseudocode: the real close() takes a file descriptor,
// and exec is a family of calls (execv, execl, ...).
int pid = fork();
if(pid == 0) {
  // child: drop the shell's history file, then become the calculator
  close(".history");
  exec("/bin/calc");
} else {
  // parent (the shell): wait for the child to finish
  wait(pid);
}
```

Figure (USER above, OS below): Process Control Blocks (PCBs) kept by the OS.

```
pid = 128
open files = ".history"
last_cpu = 0
```

```
pid = 128
open files =
last_cpu = 0
```

# Anatomy of a Process

Figure labels: Executable File; Process's address space.

### Turning C into Object Code

- Code in files `p1.c` `p2.c`
- Compile with command: `gcc -O1 p1.c p2.c -o p`
  - Use basic optimizations (`-O1`)
  - Put resulting binary in file `p`

### Compiling Into Assembly

### C Code

```
int sum(int x, int y) {
  int t = x+y;
  return t;
}
```

### Generated IA32 Assembly

```
pushl %ebp
movl %esp,%ebp
movl 12(%ebp),%eax
addl 8(%ebp),%eax
popl %ebp
ret
```

Some compilers use instruction "leave"

### Obtain with command

```
/usr/local/bin/gcc -O1 -S code.c
```

Produces
file `code.s`

### Assembly Characteristics: Data Types

- "Integer" data of 1, 2, or 4 bytes
  - Data values
  - Addresses (untyped pointers)
- Floating point data of 4, 8, or 10 bytes
- No aggregate types such as arrays or structures
  - Just contiguously allocated bytes in memory

### Assembly Characteristics: Operations

- Perform arithmetic function on register or memory data
- Transfer data between memory and register
  - Load data from memory into register
  - Store register data into memory
- Transfer control
  - Unconditional jumps to/from procedures
  - Conditional branches

# Object Code

### Code for sum

```
0x401040 <sum>:
  0x55
  0x89
  0xe5
  0x8b
  0x45
  0x0c
  0x03
  0x45
  0x08
  0x5d
  0xc3
```

- Total of 11 bytes
- Each instruction 1, 2, or 3 bytes
- Starts at address 0x401040

### Assembler

- Translates `.s` into `.o`
- Binary encoding of each instruction
- Nearly-complete image of executable code
- Missing linkages between code in different files

### Linker

- Resolves references between files
- Combines with static run-time libraries
  - E.g., code for `malloc`, `printf`
- Some libraries are dynamically linked
  - Linking occurs when program begins execution

### Disassembling Object Code

### Disassembled

```
080483c4 <sum>:
 80483c4:  55        push %ebp
 80483c5:  89 e5     mov  %esp,%ebp
 80483c7:  8b 45 0c  mov  0xc(%ebp),%eax
 80483ca:  03 45 08  add  0x8(%ebp),%eax
 80483cd:  5d        pop  %ebp
 80483ce:  c3        ret
```

- Disassembler
  - `objdump -d p`
  - Useful tool for examining object code
  - Analyzes bit pattern of series of instructions
  - Produces approximate rendition of assembly code
  - Can be run on either `a.out` (complete executable) or `.o` file

### Alternate Disassembly

### Object

`0x401040`: `0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3`

### Disassembled

```
Dump of assembler code for function sum:
0x080483c4 <sum+0>:   push %ebp
0x080483c5 <sum+1>:   mov  %esp,%ebp
0x080483c7 <sum+3>:   mov  0xc(%ebp),%eax
0x080483ca <sum+6>:   add  0x8(%ebp),%eax
0x080483cd <sum+9>:   pop  %ebp
0x080483ce <sum+10>:  ret
```

- Within gdb Debugger
  - `gdb p`
  - `disassemble sum`
    - Disassemble procedure
  - `x/11xb sum`
    - Examine the 11 bytes starting at `sum`

### What Can be Disassembled?

```
% objdump -d WINWORD.EXE

WINWORD.EXE: file format pei-i386

No symbols in "WINWORD.EXE".
Disassembly of section .text:

30001000 <.text>:
30001000:  55              push %ebp
30001001:  8b ec           mov  %esp,%ebp
30001003:  6a ff           push $0xffffffff
30001005:  68 90 10 00 30  push $0x30001090
3000100a:  68 91 dc 4c 30  push $0x304cdc91
```

- Anything that can be interpreted as executable code
- Disassembler examines bytes and reconstructs assembly source

# Integer Registers (IA32)

Figure: IA32 integer register diagram (not recoverable from the source); the register names are annotated "Origin (mostly obsolete)".

# Simple Memory Addressing Modes

- Normal (R) Mem[Reg[R]]
  - Register R specifies memory address

```
movl (%ecx),%eax
```

- Displacement D(R) Mem[Reg[R]+D]
  - Register R specifies start of memory region
  - Constant displacement D specifies offset

# Using Simple Addressing Modes

```
void swap(int *xp, int *yp) {
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}
```

```
swap:
  pushl %ebp             # Set Up
  movl %esp,%ebp
  pushl %ebx

  movl 8(%ebp), %edx     # Body
  movl 12(%ebp), %ecx
  movl (%edx), %ebx
  movl (%ecx), %eax
  movl %eax, (%edx)
  movl %ebx, (%ecx)

  popl %ebx              # Finish
  popl %ebp
  ret
```

# Understanding Swap

| Register | Value |
|----------|-------|
| %edx | xp |
| %ecx | yp |
| %ebx | t0 |
| %eax | t1 |

```
movl 8(%ebp), %edx    # edx = xp
movl 12(%ebp), %ecx   # ecx = yp
movl (%edx), %ebx     # ebx = *xp (t0)
movl (%ecx), %eax     # eax = *yp (t1)
movl %eax, (%edx)     # *xp = t1
movl %ebx, (%ecx)     # *yp = t0
```

Memory before the body runs, with `%ebp` = 0x104:

| Address | Offset from %ebp | Contents |
|---------|------------------|----------|
| 0x124 | | 123 |
| 0x120 | | 456 |
| 0x11c | | |
| 0x118 | | |
| 0x114 | | |
| 0x110 | 12 | yp = 0x120 |
| 0x10c | 8 | xp = 0x124 |
| 0x108 | 4 | Rtn adr |
| 0x104 | 0 (%ebp points here) | |
| 0x100 | -4 | |

Step-by-step effect of each instruction:

| Instruction | Result |
|-----------------------|--------------------|
| `movl 8(%ebp), %edx` | %edx = 0x124 |
| `movl 12(%ebp), %ecx` | %ecx = 0x120 |
| `movl (%edx), %ebx` | %ebx = 123 |
| `movl (%ecx), %eax` | %eax = 456 |
| `movl %eax, (%edx)` | Mem[0x124] = 456 |
| `movl %ebx, (%ecx)` | Mem[0x120] = 123 |

### Complete Memory Addressing Modes

- Most General Form

```
D(Rb,Ri,S)    Mem[Reg[Rb]+S*Reg[Ri]+D]
```

- D: Constant "displacement" 1, 2, or 4 bytes
- Rb: Base register: Any of 8 integer registers
- Ri: Index register: Any, except for %esp
  - Unlikely you'd use %ebp, either
- S: Scale: 1, 2, 4, or 8 (why these numbers?)
- Special Cases

# x86-64 Integer Registers

| 64-bit | Low 32 bits | 64-bit | Low 32 bits |
|--------|-------------|--------|-------------|
| %rax | %eax | %r8 | %r8d |
| %rbx | %ebx | %r9 | %r9d |
| %rcx | %ecx | %r10 | %r10d |
| %rdx | %edx | %r11 | %r11d |
| %rsi | %esi | %r12 | %r12d |
| %rdi | %edi | %r13 | %r13d |
| %rsp | %esp | %r14 | %r14d |
| %rbp | %ebp | %r15 | %r15d |

- Extend existing registers. Add 8 new ones.
- Make %ebp/%rbp general purpose