Unit 12.4.1 BLIS and beyond
One of the strengths of the approach to implementing matrix-matrix multiplication described in Unit 12.2.4 is that it can be applied to related operations. The following talk discusses some of these:
Robert van de Geijn and Field Van Zee, "The BLIS Framework: Experiments in Portability," SIAM Conference on Parallel Processing for Scientific Computing (PP20). SIAM Activity Group on Supercomputing Best Paper Prize talk. 2020.