Optimizing the Performance of Consistency-Aware Deduplication Using Persistent Memory

Authors :
Song, Chunlin
Chen, Xianzhang
Liu, Duo
Li, Jiali
Tan, Yujuan
Ren, Ao
Source :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems; 2024, Vol. 43 Issue: 6 p1691-1703, 13p
Publication Year :
2024

Abstract

Block-level data deduplication is a widely used technique for saving storage space by filtering out data blocks that have the same hash value. However, existing block-level deduplication approaches either ignore the data consistency of deduplication or suffer severe performance degradation when providing consistency guarantees. In this article, we propose Consistency-Aware Deduplication (CADedup+) to achieve high-performance block-level data deduplication with data consistency. The main idea of CADedup+ is to build an efficient journaling mechanism for deduplication by taking advantage of the properties of persistent memory (PM), such as byte-addressability and near-DRAM access latency. To balance the tradeoff between performance and consistency requirements in data deduplication, we design three journaling modes for CADedup+: writeback mode, ordered mode, and journal mode. We place the deduplication metadata of CADedup+ across the DRAM-PM hybrid memory architecture according to the characteristics of metadata updates, so as to minimize PM costs. The deduplication metadata on PM is managed by a set of metadata transactions and updated using the efficient hardware atomic operations provided by the CPU. We implement CADedup+ in the generic block layer of Linux kernel 4.9.0 and conduct extensive experiments on Intel Optane PMEM with typical benchmarks. Experimental results show that, compared with Dmdedup, a widely used open-source block-level data deduplication system, CADedup+ reduces write volume by 63%–70% and I/O latency by 50%–60% while ensuring deduplication consistency.

Details

Language :
English
ISSN :
0278-0070
Volume :
43
Issue :
6
Database :
Supplemental Index
Journal :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Type :
Periodical
Accession number :
ejs66457193
Full Text :
https://doi.org/10.1109/TCAD.2023.3347305