

# **Register Allocation and Procedure Calls**

#### **Problem**

- Register values may change across procedure calls
- The allocator must be sensitive to this

- Use interprocedural allocation

# **Two approaches**

- Work within a well-defined calling convention } Make local decisions

  - } Make global decisions

#### April 15, 2015

More Register Allocation









3











# **Interprocedural Register Allocation**

## Wouldn't it be nice to...

- Allocate registers across calls to minimize unnecessary saves/restores?
- Allocate global variables to registers over entire program?

## Compile-time interprocedural register allocation?

- + Could have great performance
- Might be expensive
- Might require lots of recompilation after changes (no separate compilation?)

#### Link-time interprocedural re-allocation?

- + Low compile-time cost
- + Little impact on separate compilation
- Link-time cost

#### April 15, 2015

#### More Register Allocation

12

# Wall's Link-time Register Allocator [Wall 86]

#### **Overall strategy**

- Compiler uses 8 registers for local register allocation
- Linker controls allocation of remaining 52 registers

#### **Compiler does local allocation & planning for linker**

- Load all values at beginning of each basic block; store all values at end of each basic block
- Generate call graph information
- Generate variable usage information for each procedure
- Generate register actions

#### Linker does interprocedural allocation & patches compiled code

- Generates "interference graph" among variables
- Picks best variables to allocate to registers
- Executes register actions for allocated variables to patch code

April 15, 2015

More Register Allocation

13

# **Register** Actions Describe code patch if particular variable allocated to a register - **REMOVE**(**var**): Delete instruction if **var** allocated to a register - OPx(var):Replace op x with register that was allocated to **var** - **RESULT**(**var**): Replace result with register allocated to **var** Usage -r := load var: REMOVE(var) -ri := rj op rk: OP1(var) if **var** loaded into **r**j OP2(var) if **var** loaded into **rk RESULT(var)** if **var** stored from **ri** -store var := r: REMOVE(var) April 15, 2015 More Register Allocation

# Calvin Lin, The University of Texas at Austin

| Example<br>w := (x + y) * z |                            |
|-----------------------------|----------------------------|
| rl := load x                | REMOVE(x)                  |
| r2 := load y                | <b>REMOVE</b> ( <b>y</b> ) |
| r3 := r1 + r2               | OP1(x), OP2(y)             |
| r4 := load z                | <b>REMOVE</b> (z)          |
| r5 := r3 * r4               | OP2(z), RESULT(w)          |
| store w := r5               | REMOVE(w)                  |
|                             |                            |
|                             |                            |





| Example Revisited<br>w := x := y++ * z |                           |                   |  |
|----------------------------------------|---------------------------|-------------------|--|
|                                        |                           |                   |  |
| r2 := r1 + 1                           | OP1(y), RESULT(y)         | RESULT(y)         |  |
| store y := r2                          | REMOVE(y)                 | REMOVE(y)         |  |
| r2 := load z                           | REMOVE(z)                 | <b>REMOVE</b> (z) |  |
| r1 := r1 * r2                          | OP1(y), OP2(z), RESULT(w) | OP2(z), RESULT(w) |  |
| store x := r1                          |                           | STORE(x), OP1(w)  |  |
| store w := r1                          | REMOVE(w)                 | REMOVE(w)         |  |
|                                        |                           |                   |  |
|                                        |                           |                   |  |
|                                        |                           |                   |  |
| April 15, 2015                         | More Register Allocation  | n 18              |  |

# **Deciding Which Variables to Promote to Registers**

## **Steps**

- Use bottom-up algorithm to assign pseudo registers
- Allocate pseudo registers to non-simultaneously live variables
- Allocate real registers to most frequently used pseudo registers





| Machine: DEC WI              | RL Titan RISC processor (64 registers) |  |
|------------------------------|----------------------------------------|--|
| Basic experiment             |                                        |  |
|                              | ime allocator uses 8 registers         |  |
| – Link-time alloca           | ator uses 52 registers                 |  |
| – Simple static fre          | equency estimates                      |  |
| – Small benchmar             | rks                                    |  |
| $\Rightarrow$ 10-25% speed-u | p over local allocation alone          |  |
| Improvements                 |                                        |  |
| – 0-6% with profi            | le data                                |  |
| - 0-5% with comp             | pile-time global allocation            |  |
| Benefit decreases v          | vith number of link-time registers     |  |
| Link-time better tl          | an global register allocation          |  |

# Link-Time Register Allocation: The Big Picture

## **Delayed decision making**

- Make decisions when more information is available, eg. link time
- Requires communication among different system components, in this case the compiler and the linker
- Leads to staged compilation
- Intuitively, more information is better, but effectively using this information can require cleverness

More Register Allocation



# Linear Scan Register Allocation

**Performance results** 

- Linear scan is linear in number of variables
- Graph coloring is  $O(n^2)$
- Code quality is within 12% of graph coloring

April 15, 2015

More Register Allocation

24



# The Highlights

## SSA form

 The interference graphs of programs in SSA form are chordal, which are perfect graphs

# **Perfect graphs**

- Perfect graphs are k-colorable, where k is the size of the largest clique

## Spilling

Can determine exactly where spills must occur without using an interference graph

## Coloring

- In general, can color chordal graphs in  $O(|V|^2)$
- With dominance information, can color chordal graphs in  $O(\omega(G)\cdot n)$  time, where n = # of instructions and  $\omega(G) =$  size of largest live set

April 15, 2015

More Register Allocation





