Leaky Buffer: A Novel Abstraction for Relieving Memory Pressure from Cluster Data Processing Frameworks.

Authors :
Liu, Zhaolei
Ng, T. S. Eugene
Source :
IEEE Transactions on Parallel & Distributed Systems. Jan. 2017, Vol. 28 Issue 1, p128-140. 13p.
Publication Year :
2017

Abstract

The shift to the in-memory data processing paradigm has had a major influence on the development of cluster data processing frameworks. Numerous frameworks from industry, the open-source community, and academia are adopting the in-memory paradigm to achieve breakthroughs in functionality and performance. However, despite the advantages of these in-memory frameworks, in practice they are susceptible to memory-pressure-related performance collapse and failures. The contributions of this paper are twofold. First, we conduct a detailed diagnosis of the memory pressure problem and identify three preconditions for the performance collapse. These preconditions not only explain the problem but also shed light on possible solution strategies. Second, we propose a novel programming abstraction called the leaky buffer that eliminates one of the preconditions, thereby addressing the underlying problem. We have implemented a leaky-buffer-enabled hashtable in Spark, and we believe it can also replace hashtables that perform similar hash aggregation operations in other programs or data processing frameworks. Experiments on a range of memory-intensive aggregation operations show that the leaky buffer abstraction can drastically reduce the occurrence of memory-related failures, improve performance by up to 507 percent, and reduce memory usage by up to 87.5 percent. [ABSTRACT FROM PUBLISHER]

Details

Language :
English
ISSN :
1045-9219
Volume :
28
Issue :
1
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
120167467
Full Text :
https://doi.org/10.1109/TPDS.2016.2546909