GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark
ACM Transactions on Mathematical Software
Charles Van Loan
Bo Kågström
An updated set of basic linear algebra subprograms (BLAS)