Note: For those who have accounts on the CS machines at UT-Austin,
we suggest you link to the libraries in
/u/rvdg/PFLAME
rather than building your own.
We make the entire software package available for download as PFLAME.tar.gz.
To set this up on a Linux machine:
gunzip PFLAME.tar.gz
tar -xf PFLAME.tar
cd PFLAME
Edit the Make.machine files and fix some of the Make.include files in the subdirectories.
make FLAME
make PLAPACK
Note: You probably want to just use the FLAME Website itself at http://www.cs.utexas.edu/users/flame/
We make it convenient to download the entire FLAME Website.
To set this up on a Linux machine:
gunzip flame.tar.gz
tar -xf flame.tar
   | name | type   | traversal | in/out |
1: | A    | scalar | none      | both   |
2: | X    | vector | T->B      | input  |
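For concreteness, here is a minimal sketch, in plain Mscript, of the kind of computation such a worksheet describes, assuming the operation adds the entries of the vector X to the scalar A while traversing X from top to bottom. It is only an illustration; the worksheet itself presents the algorithm in terms of FLAME partitionings rather than explicit indexing, and the function name below is made up.
function alpha = sum_unb_var1_sketch( alpha, x )
% Illustrative sketch only: add the entries of the column vector x to the
% scalar alpha, exposing one entry per iteration as the traversal moves
% from top to bottom.
  for i = 1 : size( x, 1 )
    chi1  = x( i );         % entry exposed in this iteration
    alpha = alpha + chi1;   % update
  end
end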
latex sum_unb_var1.tex
latex sum_unb_var1.tex
xdvi sum_unb_var1.dvi
dvips -f sum_unb_var1.dvi -t letter > sum_unb_var1.ps
dvipdf sum_unb_var1.dvi
(This creates sum_unb_var1.pdf.)
Edit sum_unb_var1.tex to create the worksheet as desired.
Depending on your setup, you may have to adjust the Makefile, Make.machine, or Make.include files.
Create a directory, e.g. named MYFLAME, and a subdirectory, e.g. named MYFLAME/FLAME@lab.
Save Trsm_blk_var1.m into a file with that name.
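If you want a rough idea of what this routine computes before running it, the following plain Mscript sketch solves L X = B for X, with L lower triangular and nonsingular, working through nb rows at a time. It is only a sketch of a blocked forward substitution under those assumptions, not the actual FLAME@lab code in Trsm_blk_var1.m, which is written in terms of the FLAME partitioning routines.
function X = Trsm_blk_sketch( L, B, nb )
% Sketch of a blocked solve of L * X = B with L lower triangular;
% nb is the algorithmic block size.
  m = size( L, 1 );
  X = B;                                   % overwritten with the solution
  for k = 1 : nb : m
    b     = min( nb, m - k + 1 );          % size of the current block
    rows  = k : k + b - 1;                 % rows solved in this iteration
    below = k + b : m;                     % rows still to be updated
    % Solve the diagonal block against the current rows of the right-hand side
    X( rows, : ) = L( rows, rows ) \ X( rows, : );
    % Update the remaining rows of the right-hand side
    if ~isempty( below )
      X( below, : ) = X( below, : ) - L( below, rows ) * X( rows, : );
    end
  end
end
It would be called just like the FLAME@lab routine in the sessions below, e.g. X = Trsm_blk_sketch( L, B, 2 ).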
Start Octave:
octave --traditional
-- stuff is printed --
Set the path to where the basic FLAME@lab functions exist:
path( path, "/u/rvdg/PFLAME/FLAME@lab/Octave" )
Naturally, if you downloaded your own copy of PFLAME, you would want to adjust the path to where you installed PFLAME.
Execute the following commands:
% Create a 5x4 random matrix B
B = rand( 5, 4 )
% Create a 5x5 random matrix L, make it lower triangular and
% add something big to the diagonal to ensure it is not almost singular
L = tril( rand( 5, 5 ) ) + 5 * eye( 5 )
% Use the FLAME@lab code in Trsm_blk_var1.m to solve L X = B
% Here the "2" is the block size chosen for the blocked algorithm
X = Trsm_blk_var1( L, B, 2 )
% Compute the same, but using Mscript intrinsic functions
Xref = L \ B
% Compare X and Xref
X - Xref
Notice that the result is something like
1.0e-17 * -- some matrix printed here --
which is very small, due to the fact that 1.0e-17 multiplies all entries of the matrix.
Start MATLAB:
matlab
-- stuff is printed -- and/or a new window pops up
Set the path to where the basic FLAME@lab functions exist:
path( path, '/u/rvdg/PFLAME/FLAME@lab/MATLAB' )
Naturally, if you downloaded your own copy of PFLAME, you would want to adjust the path to where you installed PFLAME.
Execute the following commands:
% Create a 5x4 random matrix B
B = rand( 5, 4 )
% Create a 5x5 random matrix L, make it lower triangular and
% add something big to the diagonal to ensure it is not almost singular
L = tril( rand( 5, 5 ) ) + 5 * eye( 5 )
% Use the FLAME@lab code in Trsm_blk_var1.m to solve L X = B
% Here the "2" is the block size chosen for the blocked algorithm
X = Trsm_blk_var1( L, B, 2 )
% Compute the same, but using MATLAB intrinsic functions
Xref = L \ B
% Compare X and Xref
X - Xref
Notice that the result is something like
1.0e-17 * -- some matrix printed here --
which is very small, due to the fact that 1.0e-17 multiplies all entries of the matrix.
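If you prefer a single number over inspecting the printed matrix, one simple check (this applies equally to the Octave session above) is the largest entry of the difference in absolute value:
% Largest entry, in absolute value, of the difference between the
% FLAME@lab result and the result of the built-in solve
max( max( abs( X - Xref ) ) )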
Create a directory, e.g. named MYFLAME, and a subdirectory, e.g. named MYFLAME/FLAMEC.
Save Trsm_example.tar.gz into a file with that name.
> gunzip Trsm_example.tar.gz
> tar -xf Trsm_example.tar
> cd Trsm_example
Edit the Makefile, changing the path in the first line to the path where you installed PFLAME.
> make
---- Lots of output ----
> ./test_llnn.x
% number of repeats:
3
% Enter blocking size:
104
% enter nfirst, nlast, ninc:
100 400 100
% enter m: (-1 means m=n)
-1
data_REF( 1, 1:2 ) = [ 100 1181.22 ];
data_var1( 1, 1:7 ) = [ 100 861.24 3.47e-18 830.54 3.47e-18 1130.66 0.00e+00 ];
data_var2( 1, 1:7 ) = [ 100 425.85 1.21e-17 435.38 1.21e-17 1129.89 0.00e+00 ];
---------- Lots of output ---------
data_var4( 4, 1:7 ) = [ 400 409.15 8.67e-19 1572.46 0.00e+00 1576.98 0.00e+00 ];
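The layout of the data_* rows is not spelled out in the output itself. Judging from the plotting commands below, column 1 holds the matrix dimension, columns 2, 4, and 6 hold the MFLOPS rates attained by the unblocked, blocked, and optimized versions of each variant, and columns 3, 5, and 7 appear to hold the corresponding differences from the reference result. For example, to pull out the performance of the blocked version of variant 1:
% Column layout inferred from the plotting commands below
n           = data_var1( :, 1 );   % problem sizes
mflops_blk1 = data_var1( :, 4 );   % MFLOPS of the blocked version of variant 1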
close all
plot( data_REF( :,1 ), data_REF( :, 2 ), '-' );
hold on
plot( data_var1( :,1 ), data_var1( :, 2 ), 'b:o' );
plot( data_var1( :,1 ), data_var1( :, 4 ), 'b-.o' );
plot( data_var1( :,1 ), data_var1( :, 6 ), 'b--o' );
plot( data_var2( :,1 ), data_var2( :, 2 ), 'r:+' );
plot( data_var2( :,1 ), data_var2( :, 4 ), 'r-.+' );
plot( data_var2( :,1 ), data_var2( :, 6 ), 'r--+' );
plot( data_var3( :,1 ), data_var3( :, 2 ), 'k:*' );
plot( data_var3( :,1 ), data_var3( :, 4 ), 'k-.*' );
plot( data_var3( :,1 ), data_var3( :, 6 ), 'k--*' );
plot( data_var4( :,1 ), data_var4( :, 2 ), 'g:x' );
plot( data_var4( :,1 ), data_var4( :, 4 ), 'g-.x' );
plot( data_var4( :,1 ), data_var4( :, 6 ), 'g--x' );
legend( 'Reference', 'unb\_var1', 'blk\_var1', 'opt\_var1', ...
'unb\_var2', 'blk\_var2', 'opt\_var2', ...
'unb\_var3', 'blk\_var3', 'opt\_var3', ...
'unb\_var4', 'blk\_var4', 'opt\_var4', 2 );
xlabel( 'matrix dimension n' );
ylabel( 'MFLOPS/sec.' );
axis( [ 0 400 0 3600 ] );
print -depsc graphs.eps
hold off
echo "4 104 100 1000 100 -1 " | ./test_llnn.x > output
matlab < output
gv graph.eps
A note about the output: in the line
axis( [ 0 400 0 3600 ] );
change the 3600 to the peak (in MFLOPS/sec) of the machine on which you ran the experiment.
You can compute this from the cpu MHz entry in the file /proc/cpuinfo by multiplying that number by the number of floating point operations your architecture can perform per clock cycle (e.g., 2 for a Pentium 4; multiply by 4 if you have a two-CPU Pentium 4 based SMP and you are linking to a multithreaded BLAS library, etc.).
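As a hypothetical worked example (the numbers below are made up; substitute the values for your own machine):
% Hypothetical numbers; replace them with the values for your machine.
cpu_MHz         = 2400;   % "cpu MHz" entry from /proc/cpuinfo
flops_per_cycle = 2;      % e.g., 2 for a Pentium 4
peak_MFLOPS     = cpu_MHz * flops_per_cycle   % use this value in the axis command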
Please send comments to rvdg@cs.utexas.edu.