This file describes the microbenchmarks used for testing the EV6
microarchitecture. The microbenchmark code is mostly C, but assembly
instructions were inserted in microbenchmarks where the compiler could
not be made to produce the desired behavior. The microbenchmarks are
classified according to the microarchitecture feature they stress.


Control Microbenchmarks
-----------------------

C-C
---
cond.c, cond_dec_c.s, cond_compaq_c.s
In this microbenchmark, a branch repeatedly toggles between taken and
not taken. The compiled assembly code differed between the DEC C
V5.9-008 compiler on Digital UNIX V4.0 and the Compaq C V6.3-025
compiler on Compaq Tru64 UNIX V5.1. The simulator % error also differed
between the two versions, with the DEC C compiled version reporting a
larger error. Both assembly files are included in the TAR file as
cond_dec_c.s and cond_compaq_c.s. The assembly files are not useful for
architectures other than the 21264.
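
As an illustration only (this is not the contents of cond.c), a
toggling branch could be sketched as below. The function name
toggle_branch and the volatile sink are assumptions made for this
sketch. The volatile store is there to discourage the compiler from
turning the branch into a conditional move, which is the kind of
compiler-dependent behavior that makes inspecting the generated
assembly worthwhile.

```c
#include <stdint.h>

/* Hypothetical C-C style kernel: the branch outcome alternates on
   every iteration (not taken for even i, taken for odd i). */
uint64_t toggle_branch(uint64_t iterations)
{
    uint64_t taken = 0;
    volatile uint64_t sink = 0;   /* assumption: keeps a real branch */

    for (uint64_t i = 0; i < iterations; i++) {
        if (i & 1) {              /* toggles every iteration */
            taken++;
            sink += i;
        }
    }
    return taken;                 /* number of times the body ran */
}
```

The return value is only a sanity check that half of the iterations
entered the branch body.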


C-R
---
recur.c
This is a recursive benchmark that recurses 1000 levels deep.
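
A minimal sketch of such a recursion, assuming a hypothetical function
recur_depth (the actual recur.c may be structured differently). The
addition after the recursive call keeps it from being a plain tail
call, although an optimizing compiler may still rewrite it as a loop,
so the generated code should be checked.

```c
/* Hypothetical C-R style kernel: one call frame per level. */
unsigned recur_depth(unsigned depth)
{
    if (depth == 0)
        return 0;
    return 1 + recur_depth(depth - 1);   /* work after the call */
}
```

Calling recur_depth(1000) descends 1000 levels before unwinding.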

C-Sn
----
switch[n].c
This benchmark contains a switch-case loop, and each case statement is
taken n times consecutively before the next case is taken.
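
The access pattern described above could look roughly like the sketch
below. The function name switch_n, the choice of four cases, and the
per-case work are all assumptions for illustration; a compiler may also
lower a small switch to something other than an indirect jump.

```c
#include <stdint.h>

/* Hypothetical C-Sn style kernel: the selected case stays the same
   for n consecutive iterations, then moves on to the next case. */
uint64_t switch_n(uint64_t iterations, uint64_t n)
{
    uint64_t acc = 0;

    if (n == 0)
        return 0;                    /* avoid division by zero */

    for (uint64_t i = 0; i < iterations; i++) {
        switch ((i / n) % 4) {       /* changes once every n iterations */
        case 0:  acc += 1; break;
        case 1:  acc += 2; break;
        case 2:  acc += 3; break;
        default: acc += 4; break;
        }
    }
    return acc;
}
```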

C-CO
----
complexctrl.c
This is a combination of C-C and C-S.

Execution Core
--------------
E-I
---
straight.c
This is a chain of independent integer arithmetic operations. IPC
should be close to 4.
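
A rough sketch of what independent integer operations mean here,
assuming a hypothetical function independent_int with four
accumulators that never read each other. This is not the code in
straight.c. At higher optimization levels a compiler may fold the whole
loop into a closed form, so the generated assembly should be checked
before trusting any IPC measurement.

```c
#include <stdint.h>

/* Hypothetical E-I style kernel: four additions per iteration with no
   dependences between them, only on their own previous values. */
uint64_t independent_int(uint64_t iterations)
{
    uint64_t a = 0, b = 0, c = 0, d = 0;

    for (uint64_t i = 0; i < iterations; i++) {
        a += 1;   /* none of these four statements */
        b += 2;   /* reads a value written by another */
        c += 3;
        d += 4;
    }
    return a + b + c + d;
}
```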

E-F
---
straightfloat.c
Chain of independent floating point arithmetic operations. IPC should
be close to 1.

E-Dn
----
depchain[n].c
Chain of dependent instructions of length n. 
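
One possible reading of a dependent chain of length n is sketched
below: every operation consumes the result of the previous one, so the
operations cannot issue in parallel. The function name dependent_chain
and the multiply-add step are assumptions, not taken from depchain[n].c.

```c
#include <stdint.h>

/* Hypothetical E-Dn style kernel: n back-to-back dependent updates
   per outer iteration. Arithmetic wraps modulo 2^64. */
uint64_t dependent_chain(uint64_t iterations, unsigned n)
{
    uint64_t x = 1;

    for (uint64_t i = 0; i < iterations; i++) {
        for (unsigned k = 0; k < n; k++)
            x = x * 3 + 1;    /* needs the previous value of x */
    }
    return x;
}
```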


The memory microbenchmarks assume the following:

L1 cache: 64KB, 2-way set associative, 64-byte lines, both I and D

L2 cache: 2 MB, direct mapped, 64-byte lines, unified

If used on a simulator with a different configuration, please change the
benchmarks accordingly.
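
When adapting the benchmarks to another configuration, it can help to
work out the set counts from the parameters above. The sketch below
does that arithmetic; the struct cache_geom and the names L1_GEOM,
L2_GEOM, and cache_num_sets are hypothetical and do not appear in the
benchmark sources.

```c
#include <stddef.h>

/* Hypothetical description of one cache level. */
struct cache_geom {
    size_t size_bytes;
    size_t ways;         /* 1 means direct mapped */
    size_t line_bytes;
};

/* Values taken from the assumed configuration listed above. */
static const struct cache_geom L1_GEOM = { 64 * 1024, 2, 64 };
static const struct cache_geom L2_GEOM = { 2 * 1024 * 1024, 1, 64 };

/* sets = size / (ways * line size) */
size_t cache_num_sets(const struct cache_geom *g)
{
    return g->size_bytes / (g->ways * g->line_bytes);
}
```

With the assumed configuration this gives 512 sets for L1 and 32768
sets for L2.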

Memory Subsystem
----------------
M-I and M-IF
------------
memindep.c, memindepfp.c
Independent loads resident in L1 cache (integer and fp, respectively)

M-D and M-DF
------------
memdep.c
Chain of dependent loads resident in L1 cache (integer and fp)

M-L2
----
l2_lat.c
Chain of dependent loads resident in L2 cache

M-M
---
mem_lat.c
Chain of dependent loads resident in Main memory
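
A chain of dependent loads is commonly built as a pointer chase, where
each load produces the index of the next one. The sketch below assumes
that the L1, L2, and main memory variants differ mainly in the
footprint of the chain; that is an assumption, not a description of
memdep.c, l2_lat.c, or mem_lat.c. The names build_chain and chase are
hypothetical. The chain visits lines in sequential order, which a
hardware prefetcher could exploit, so a real latency test may want a
shuffled order.

```c
#include <stddef.h>
#include <stdlib.h>

#define LINE_BYTES 64   /* line size from the assumed configuration */

/* Build a cyclic chain with one link per cache line. The first slot
   of each line holds the index of the first slot of the next line.
   Returns NULL on failure; the caller frees the buffer. */
size_t *build_chain(size_t footprint_bytes)
{
    size_t slots_per_line = LINE_BYTES / sizeof(size_t);
    size_t lines = footprint_bytes / LINE_BYTES;
    size_t *buf;

    if (lines == 0)
        return NULL;
    buf = malloc(lines * LINE_BYTES);
    if (buf == NULL)
        return NULL;

    for (size_t i = 0; i < lines; i++)
        buf[i * slots_per_line] = ((i + 1) % lines) * slots_per_line;
    return buf;
}

/* Follow the chain for the given number of hops. Each load address
   depends on the value returned by the previous load. */
size_t chase(const size_t *buf, size_t hops)
{
    size_t idx = 0;

    while (hops--)
        idx = buf[idx];
    return idx;
}
```

Under these assumptions, a footprint well below 64KB would stay in L1,
one between 64KB and 2 MB would spill to L2, and one well above 2 MB
would go to main memory.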

I-P
---
iprefetch.c, iprefetch.s
This is a very large chain of independent arithmetic operations which
is L2 resident. This can be used for testing Icache prefetches. This
was coded in assembly as the compiler ran out of virtual memory when
compiling the C code.


