R440, Astronomy-Mathematics Building, NTU
(台灣大學天文數學館 440室)
Modern Approaches toward High Performance Dense Linear Algebra Subroutines: Theory and Implementation
Chen-Han Yu (University of Texas at Austin)
Dense Linear Algebra (DLA) is one of the first domains to realize, understand, standardize and support portable high performance across different architectures. For about four decades, BLAS (Basic Linear Algebra Subprograms) implementations have been expected to come from domain experts or vendors, because maintaining them and porting them to each new computer architecture is difficult and takes considerable effort. Well-known BLAS libraries such as MKL (Intel), ACML (AMD), ESSL (IBM), GotoBLAS, OpenBLAS and ATLAS typically need a team of engineers and researchers to keep them alive over time. In this presentation, we briefly introduce how BLIS (BLAS-like Library Instantiation Software), a modern DLA framework for instantiating high-performance BLAS-like dense linear algebra libraries, is designed to minimize the architecture-dependent part of BLAS, and how the knowledge of DLA domain experts can be taught easily with this framework. The presentation comes with a hands-on workshop in which the audience optimizes a simple GEMM (General Matrix-Matrix Multiplication) step by step to grasp the basic concepts of BLIS.
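To give a flavor of the kind of exercise the workshop describes, the sketch below shows a plain triple-loop GEMM next to a version in which all arithmetic is confined to a small micro-kernel. It is only an illustration under stated assumptions, not the workshop material or the BLIS implementation: the function names (gemm_naive, gemm_blocked, micro_kernel), the row-major layout, and the 4 x 4 tile size are all hypothetical choices, and the packing and cache-level blocking that BLIS layers around its micro-kernel are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Register-tile sizes for the micro-kernel. These values are illustrative
   assumptions, not tuned for any particular architecture. */
enum { MR = 4, NR = 4 };

/* Reference GEMM: C += A * B, all matrices row-major.
   A is m x k, B is k x n, C is m x n. */
void gemm_naive(size_t m, size_t n, size_t k,
                const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];
            C[i * n + j] += acc;
        }
    }
}

/* Hypothetical micro-kernel: updates one mr x nr tile of C (mr <= MR,
   nr <= NR) from an mr x k panel of A and a k x nr panel of B.
   lda, ldb, ldc are the row strides of the full matrices. In a real
   library this is the small piece that gets rewritten per architecture. */
static void micro_kernel(size_t mr, size_t nr, size_t k,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *C, size_t ldc)
{
    double acc[MR][NR] = {{0.0}};  /* tile accumulator, ideally in registers */

    for (size_t p = 0; p < k; ++p)
        for (size_t i = 0; i < mr; ++i)
            for (size_t j = 0; j < nr; ++j)
                acc[i][j] += A[i * lda + p] * B[p * ldb + j];

    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j)
            C[i * ldc + j] += acc[i][j];
}

/* Blocked GEMM: the portable loops only walk over tiles of C and hand
   each tile to the micro-kernel. Edge tiles are handled by shrinking
   mr and nr. */
void gemm_blocked(size_t m, size_t n, size_t k,
                  const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    for (size_t i = 0; i < m; i += MR) {
        size_t mr = (m - i < MR) ? m - i : MR;
        for (size_t j = 0; j < n; j += NR) {
            size_t nr = (n - j < NR) ? n - j : NR;
            micro_kernel(mr, nr, k,
                         &A[i * k], k,
                         &B[j], n,
                         &C[i * n + j], n);
        }
    }
}
```

The point of the split is that gemm_blocked contains no architecture-specific code: only micro_kernel would need to change (for example, to use vector instructions) when moving to a new machine, which mirrors the idea of minimizing the architecture-dependent part.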