
Exploiting the Parallelism Between Conflicting Critical Sections with Partial Reversion.

Authors :
Zheng, Long
Liao, Xiaofei
Jin, Hai
Liu, Haikun
Source :
IEEE Transactions on Parallel & Distributed Systems. Dec2017, Vol. 28 Issue 12, p3443-3457. 15p.
Publication Year :
2017

Abstract

Lock-protected critical sections greatly limit the concurrency of multi-threaded applications. Prior lock-elision-based techniques exploit the parallelism between critical sections that access disjoint shared data, but they fail to notice and expose the high degree of concurrency between critical sections that contend for the same shared data, i.e., conflicting critical sections (CCS). This paper focuses on exploiting CCS parallelism. The key insight of this work is that, for each running CCS, a large proportion (>73.4%) of the parallelism between CCSs can be exploited simply by allowing their first conflict-free code fragments to execute in parallel at runtime. We therefore present BSOptimizer, a new microarchitecture that performs partial reversion, integrated with a series of sophisticated hardware and software strategies for CCS parallelization. We complement the off-the-shelf cache coherency protocol to perceive the conflict location of a CCS, present a predictive checkpoint mechanism to register and predict the concerned conflict point in a lightweight and accurate fashion, and redefine the traditional mutual exclusion semantics with a binary relationship. With these collaborative techniques, each CCS can be scheduled in parallel. Our experimental results on a wide variety of real programs and PARSEC benchmarks show that, compared to native execution and two state-of-the-art lock elision techniques (SLE and SLR), BSOptimizer can dramatically improve the performance of programs with a slight (<0.8%) energy consumption overhead and (<3.9%) extra runtime overhead. Our evaluation on a micro-benchmark with software-based optimization also verifies that BSOptimizer can accurately exploit the CCS parallelism as promised. [ABSTRACT FROM AUTHOR]
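
The abstract's key insight, that the conflict-free leading fragment of a conflicting critical section need not be serialized, can be illustrated with a rough software analogy. The sketch below is not BSOptimizer's hardware mechanism and does not model partial reversion, conflict-point prediction, or cache coherence; it only contrasts a lock held over a whole section with a lock held from the first conflicting access onward. All names (conflict_free_prefix, conflicting_suffix, coarse_section, split_section, run) are hypothetical and chosen for illustration.

```python
import threading

shared_total = 0          # the shared data that every section contends for
lock = threading.Lock()   # the lock protecting shared_total


def conflict_free_prefix(n):
    # Hypothetical leading fragment: touches only thread-local data,
    # so it cannot conflict with any other section.
    return sum(i * i for i in range(n))


def conflicting_suffix(value):
    # Hypothetical conflicting fragment: the first access to shared data.
    global shared_total
    shared_total += value


def coarse_section(n):
    # Conventional critical section: the lock covers both fragments,
    # so even the conflict-free prefix is serialized.
    with lock:
        value = conflict_free_prefix(n)
        conflicting_suffix(value)


def split_section(n):
    # Analogy to the abstract's idea: the prefix runs without the lock,
    # and mutual exclusion starts only at the conflict point.
    value = conflict_free_prefix(n)
    with lock:
        conflicting_suffix(value)


def run(section, workloads):
    # Run one thread per workload and return the final shared total.
    global shared_total
    shared_total = 0
    threads = [threading.Thread(target=section, args=(n,)) for n in workloads]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_total
```

Calling run(coarse_section, [10, 20, 30]) and run(split_section, [10, 20, 30]) should yield the same total, which shows that shrinking the lock to the conflicting fragment preserves the result in this toy case. Here the conflict point is known statically; the abstract describes discovering and predicting it at runtime in hardware, which this sketch does not attempt.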

Details

Language :
English
ISSN :
10459219
Volume :
28
Issue :
12
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
126238165
Full Text :
https://doi.org/10.1109/TPDS.2017.2727485