Back to Search
Start Over
sLASs: a fully automatic auto-tuned linear algebra library based on OpenMP extensions implemented in OmpSs (LASs Library)
- Source :
- Recercat. Dipósit de la Recerca de Catalunya, instname, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
- Publication Year :
- 2020
- Publisher :
- Elsevier, 2020.
-
Abstract
- © 2019 Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ In this work we have implemented a novel Linear Algebra Library on top of the task-based runtime OmpSs-2. We have used some of the most advanced OmpSs-2 features; weak dependencies and regions, together with the final clause for the implementation of auto-tunable code for the BLAS-3 trsm routine and the LAPACK routines npgetrf and npgesv. All these implementations are part of the first prototype of sLASs library, a novel library for auto-tunable codes for linear algebra operations based on LASs library. In all these cases, the use of the OmpSs-2 features presents an improvement in terms of execution time against other reference libraries such as, the original LASs library, PLASMA, ATLAS and Intel MKL. These codes are able to reduce the execution time in about 18% on big matrices, by increasing the IPC on gemm and reducing the time of task instantiation. For a few medium matrices, benefits are also seen. For small matrices and a subset of medium matrices, specific optimizations that allow to increase the degree of parallelism in both, gemm and trsm tasks, are applied. This strategy achieves an increment in performance of up to 40%. This project has received funding from the Spanish Ministry of Economy and Competitiveness, Spain under the project Computación de Altas Prestaciones VII (TIN2015- 65316-P), the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya, Spain, under project MPEXPAR: Models de Programació i Entorns d’Execució Parallels (2014-SGR-1051), and the Juan de la Cierva Grant Agreement No IJCI-2017- 33511. We also acknowledge the funding provided by Fujitsu, Japan under the BSC-Fujitsu joint project: Math Libraries Migration and Optimization.
- Subjects :
- Computer Networks and Communications
Computer science
Auto-tuning
Degree of parallelism
Parallel programming (Computer science)
020206 networking & telecommunications
02 engineering and technology
Parallel computing
LASs
Programació en paral·lel (Informàtica)
Execution time
Theoretical Computer Science
Task (computing)
Matrix (mathematics)
OmpSs
Artificial Intelligence
Hardware and Architecture
Fully automatic
Linear algebra
0202 electrical engineering, electronic engineering, information engineering
Code (cryptography)
020201 artificial intelligence & image processing
Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC]
Software
Subjects
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Recercat. Dipósit de la Recerca de Catalunya, instname, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
- Accession number :
- edsair.doi.dedup.....c1131e81fde4cac399cdbe3f79a6ea38