A Framework for the Automatic Vectorization of Parallel Sort on x86-Based Processors.

Authors :: Hou, Kaixi
Wang, Hao
Feng, Wu-Chun
Source :: IEEE Transactions on Parallel & Distributed Systems. May 2018, Vol. 29, Issue 5, pp. 958-972. 15 p.
Publication Year :: 2018
Abstract: The continued growth in the width of vector registers and the evolving library of intrinsics on the modern x86 processors make manual optimizations for data-level parallelism tedious and error-prone. In this paper, we focus on parallel sorting, a building block for many higher-level applications, and propose a framework for the Automatic SIMDization of Parallel Sorting (ASPaS) on x86-based multi- and many-core processors. That is, ASPaS takes any sorting network and a given instruction set architecture (ISA) as inputs and automatically generates vector code for that sorting network. After formalizing the sort function as a sequence of comparators and the transpose and merge functions as sequences of vector-matrix multiplications, ASPaS can map these functions to operations from a selected “pattern pool” that is based on the characteristics of parallel sorting, and then generate the vector code with the real ISA intrinsics. The performance evaluation on the Intel Ivy Bridge and Haswell CPUs, and Knights Corner MIC illustrates that automatically generated sorting codes from ASPaS can outperform the widely used sorting tools, achieving up to 5.2x speedup over the single-threaded implementations from STL and Boost and up to 6.7x speedup over the multi-threaded parallel sort from Intel TBB. [ABSTRACT FROM AUTHOR]
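Note on "a sequence of comparators": the abstract describes the sort function as a fixed sequence of comparators taken from a sorting network. The scalar C++ sketch below only illustrates that idea; it is not code generated by ASPaS and uses no ISA intrinsics. The names `Comparator`, `kNetwork4`, `compare_exchange`, and `sort4` are hypothetical, and the 4-input, 5-comparator network is a standard textbook example chosen for brevity.

```cpp
#include <algorithm>
#include <array>

// Hypothetical illustration: one comparator orders the values held at
// positions `lo` and `hi` so that the smaller one ends up at `lo`.
struct Comparator {
    int lo;
    int hi;
};

// A standard 5-comparator sorting network for 4 inputs (assumed example,
// not taken from the paper).
constexpr Comparator kNetwork4[] = {{0, 1}, {2, 3}, {0, 2}, {1, 3}, {1, 2}};

// Branch-free compare-exchange: min goes to `a`, max goes to `b`.
// A vectorized version would apply min/max to whole registers instead.
inline void compare_exchange(int& a, int& b) {
    const int smaller = std::min(a, b);
    const int larger = std::max(a, b);
    a = smaller;
    b = larger;
}

// Apply the comparators in order; the sequence is data-independent,
// which is the property that makes sorting networks amenable to SIMD.
inline void sort4(std::array<int, 4>& v) {
    for (const Comparator& c : kNetwork4) {
        compare_exchange(v[c.lo], v[c.hi]);
    }
}
```

Calling `sort4` on an array holding 3, 1, 4, 2 leaves it in ascending order. Because the comparator sequence never depends on the data, each `std::min`/`std::max` pair could in principle be replaced by vector min/max operations over several independent columns at once, which is the kind of mapping the abstract attributes to the framework.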

Subjects :: *HIGH performance computing
*PARALLEL processing
*CENTRAL processing units
*COMPUTER architecture
*COMPUTER storage devices
*COMPUTER networks

Details

Language :: English
ISSN :: 10459219
Volume :: 29
Issue :: 5
Database :: Academic Search Index
Journal :: IEEE Transactions on Parallel & Distributed Systems
Publication Type :: Academic Journal
Accession number :: 129088143
Full Text :: https://doi.org/10.1109/TPDS.2018.2789903

