BLIS Retreat 2016

Program (First Draft)

Contributed talks

Monday Sept. 19

Morning POB 2.402
8:30-9:00 Breakfast (POB 2.402)
9:00-9:30 BLIS: Year In Review, 2015-2016 SLIDES Field Van Zee, UT-Austin
9:30-10:10 Implementing Strassen-like Fast Matrix Multiplication Algorithms with BLIS SLIDES Jianyu Huang and Leslie Rice, UT-Austin
10:10-10:40 A New I/O Lower Bound for GEMM with a Tight Constant For SLIDES contact speaker Tyler Smith, UT-Austin
10:30-11:00 Coffee (POB 2.402)
11:00-11:30 Scalable Dense Matrix Multiplication on Multi-Socket Many-Core Systems with Fast Shared Memory SLIDES Natalia Vassilieva, Hewlett Packard Labs
11:30-12:00 An Implementation of GEMM for DMA-enabled Architectures SLIDES Devangi Parikh, TI
12:00-12:30 BLAS for Deep Learning: tuple, mixed-precision, fixed-point, and binary GEMM SLIDES Marat Dukhan, GATech

12:30-2:00 Lunch (GDC 6.302 - Computer Science Faculty Lounge)

Afternoon POB 2.402
2:00-2:30 A Case Study for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization SLIDES Sandra Catalan, Univ. Jaume I
2:30-3:00 libFLAME Optimizations with BLIS SLIDES Kiran Varaganti, AMD
3:00-3:30 Extended BLAS, Integer BLAS, Batched BLAS, etc. Greg Henry, Intel
3:30-4:00 Coffee (POB 2.402)
4:00-4:30 An Algorithmic Specific Code Generator for Matrix-Matrix Multiply-Like Operations SLIDES Richard Veras, CMU
4:30-5:00 A BLIS affair with FPGAs For VIDEO contact speaker SLIDES Tze Meng Low
5:00-5:30 Cl1ck + LGen: FLAME for small scale linear algebra SLIDES Diego Fabregat, RWTH-Aachen University

Tuesday Sept. 20

Morning POB 2.402
8:30-9:00 Breakfast (POB 2.402)
9:00-9:30 Tensor Contraction with BLIS SLIDES Devin Matthews, UT
9:30-10:00 Design of a high-performance GEMM-like Tensor-Tensor Multiplication SLIDES Paul Springer, RWTH-Aachen University
10:00-10:30 Using BLIS for tensor computations in Q-Chem SLIDES Evgeny Epifanovsky, Q-Chem
10:30-11:00 Coffee (POB 2.402)
11:00-11:30 PeachPy.io: a platform for crowdsourcing performance tuning SLIDES Marat Dukhan, GATech
11:30-12:00 Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era SLIDES Ardavan Pedram, Movidius and Stanford
12:00-12:30 A set of high performance kernel matrix operations on CPU, KNL and GPU SLIDES Chenhan Yu, UT-Austin
12:30-12:45 Closing comments Robert van de Geijn, UT-Austin