PDFS: Partially Dedupped File System for Primary Workloads.

Authors :
Yu, Hongliang
Zhang, Xu
Huang, Wei
Zheng, Weimin
Source :
IEEE Transactions on Parallel & Distributed Systems. Mar2017, Vol. 28 Issue 3, p863-876. 14p.
Publication Year :
2017

Abstract

Primary storage dedup is difficult to accomplish because it must achieve low IO latency and high throughput while eliminating data redundancy effectively in the critical IO path. In this paper, we design and implement PDFS, a partially dedupped file system for primary workloads, built on a generalized framework that uses partial data lookup to search for redundant data efficiently in quickly chosen data subsets instead of the whole data set. PDFS improves IO latency and throughput systematically through techniques including write path optimization, data dedup parallelization, and write order preserving. Such design choices bring dedup to the masses for general primary workloads. Experimental results show that PDFS achieves 74-99 percent of the theoretical maximum dedup ratio with very small or even negative performance degradation compared with mainstream file systems without dedup support. Experiences with varied PDFS configurations are also discussed. [ABSTRACT FROM PUBLISHER]
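
The idea of partial data lookup named in the abstract can be illustrated with a minimal Python sketch. It is not the PDFS implementation: the abstract does not say how PDFS chooses its data subsets, so the bounded least-recently-used fingerprint table here is an assumption, and the class name `PartialDedupStore` and the parameter `subset_size` are hypothetical. The sketch only shows the trade-off: each write consults a small subset of fingerprints, so lookups stay cheap, but duplicates that fall outside the subset are stored again.

```python
import hashlib
from collections import OrderedDict


class PartialDedupStore:
    """Toy block store that dedups only against a bounded fingerprint subset.

    Hypothetical illustration; the subset policy (LRU) is an assumption,
    not something stated in the abstract.
    """

    def __init__(self, subset_size):
        self.subset_size = subset_size
        self.recent = OrderedDict()  # fingerprint -> block id, oldest first
        self.blocks = []             # physically stored blocks

    def write(self, data):
        """Store one block and return the id of the physical block used."""
        fingerprint = hashlib.sha256(data).digest()
        block_id = self.recent.get(fingerprint)
        if block_id is not None:
            # Duplicate found inside the subset: reuse the existing block.
            self.recent.move_to_end(fingerprint)
            return block_id
        # Not in the subset: store a new block, even if an identical block
        # exists elsewhere. This is what makes the dedup "partial".
        self.blocks.append(data)
        block_id = len(self.blocks) - 1
        self.recent[fingerprint] = block_id
        if len(self.recent) > self.subset_size:
            self.recent.popitem(last=False)  # evict the oldest fingerprint
        return block_id
```

With a large `subset_size` the store approaches full dedup; with a small one it misses duplicates whose fingerprints were already evicted, which is the kind of gap between achieved and theoretical maximum dedup ratio that the abstract quantifies.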

Details

Language :
English
ISSN :
1045-9219
Volume :
28
Issue :
3
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
121301727
Full Text :
https://doi.org/10.1109/TPDS.2016.2594070