Performance runs
----------------
Be as comprehensive as possible in your performance runs.  Measure every
configuration you can (just interchange, just permutation, both,
different tile sizes, etc.), and try more than one permutation.
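
One way to avoid missing a configuration is to enumerate them up front.
The sketch below is only an illustration in Python: the loop names
(i, j, k), the tile sizes, and the function name configurations() are
assumptions, not part of the assignment, and it does not run any
measurement itself.

```python
import itertools

# Hypothetical loop names and tile sizes; substitute the ones your code uses.
LOOPS = ("i", "j", "k")
TILE_SIZES = (8, 16, 32, 64)


def configurations(loops=LOOPS, tile_sizes=TILE_SIZES):
    """Return every (loop order, tile size) pair to measure.

    A tile size of None stands for a run of that loop order with no tiling.
    """
    configs = []
    for order in itertools.permutations(loops):
        configs.append((order, None))      # this loop order, untiled
        for tile in tile_sizes:
            configs.append((order, tile))  # same loop order, tiled
    return configs


for order, tile in configurations():
    label = "untiled" if tile is None else f"tile={tile}"
    print("".join(order), label)
```

Each printed line is one run to perform and record, so a gap in your
results table is easy to spot.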

Check that your code performs the optimization correctly by using small
matrix sizes and running the print() functions. Comment out the print()
calls when measuring execution time, so that output does not distort the
timings.
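
The check-then-measure workflow described above might look like the
following minimal sketch. It is written in Python purely for
illustration and assumes a matrix multiply as the kernel; the names
matmul_ijk, matmul_ikj, print_matrix, timed, and the VERBOSE switch are
hypothetical and stand in for your own code. A switch is used here in
place of commenting the print calls out by hand.

```python
import time

# Set to True only for small correctness runs; leave False when timing.
VERBOSE = False


def matmul_ijk(a, b, n):
    """Reference loop order (i, j, k)."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_ikj(a, b, n):
    """Same computation with the j and k loops interchanged."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def print_matrix(m):
    for row in m:
        print(" ".join(str(x) for x in row))


def timed(fn, a, b, n):
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(a, b, n)
    return result, time.perf_counter() - start


# Small integer matrices so the two results can be compared exactly.
n = 4
a = [[i + j for j in range(n)] for i in range(n)]
b = [[i * j + 1 for j in range(n)] for i in range(n)]

reference, _ = timed(matmul_ijk, a, b, n)
candidate, elapsed = timed(matmul_ikj, a, b, n)
assert candidate == reference, "interchanged loop nest changed the result"

if VERBOSE:
    print_matrix(candidate)
print(f"ikj, n={n}: {elapsed:.6f} s")
```

Once the small case agrees with the reference, raise n and keep VERBOSE
off for the timed runs.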


Co-ordinating performance runs
------------------------------
There is a program /opt/reserve_hw on ham.csres that you can use to
reserve machine time for your performance runs. Be considerate of others:
reserve short intervals, use machine time judiciously, and do not disturb
other students' runs.
