Unit 3.5.3 Alternatives to Goto's algorithm
In Unit 2.5.1, theoretical lower bounds on the data movement incurred by matrix-matrix multiplication were discussed. You may want to go back and read the reference mentioned in that unit, which discusses how Goto's algorithm fits into the picture.
In a more recent paper,
 Tyler M. Smith and Robert A. van de Geijn, The MOMMS Family of Matrix Multiplication Algorithms, arXiv, 2019.
a family of practical matrix-matrix multiplication algorithms is discussed, of which Goto's algorithm is but one member. Analytical and empirical results suggest that on future CPUs, where moving data between memory layers is expected to become even more expensive relative to performing a flop, other members of the family may become superior.
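To get a feel for why that ratio matters, here is a minimal sketch of a cost model for a simple blocked matrix-matrix multiplication. It is not taken from the paper and does not model Goto's algorithm or any MOMMS variant. It assumes square n x n matrices, b x b blocks that fit in fast memory, two memory layers, and no overlap of computation with data movement. The names `gamma` (time per flop), `beta` (time per word moved), and both function names are hypothetical.

```python
def blocked_gemm_cost(n, b, gamma, beta):
    """Return (flop_time, move_time) for C += A B with n x n matrices,
    computed with b x b blocks under a simple two-layer memory model.

    gamma: assumed time per floating point operation
    beta:  assumed time per word moved between slow and fast memory
    """
    if n % b != 0:
        raise ValueError("this sketch assumes b divides n")
    blocks = n // b
    flops = 2 * n**3
    # Each b x b block of C is read and written once (2 b^2 words).
    # While it is in fast memory, n/b pairs of b x b blocks of A and B
    # are read (2 b^2 words per pair).
    words = blocks**2 * (2 * b**2 + blocks * 2 * b**2)
    return flops * gamma, words * beta


def movement_fraction(n, b, gamma, beta):
    """Fraction of the modeled time that is spent moving data."""
    flop_time, move_time = blocked_gemm_cost(n, b, gamma, beta)
    return move_time / (flop_time + move_time)


if __name__ == "__main__":
    # Hold the flop cost fixed and make data movement relatively costlier.
    for beta in (1.0, 10.0, 100.0):
        frac = movement_fraction(n=1024, b=64, gamma=1.0, beta=beta)
        print(f"beta/gamma = {beta:6.1f}: fraction moving data = {frac:.3f}")
```

In this model the number of words moved is 2n^2 + 2n^3/b, so about b flops are performed per word moved. As `beta` grows relative to `gamma`, a larger share of the modeled time goes to data movement, which is the kind of pressure that can favor algorithms that block for the memory layers differently.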