PDFS: Partially Dedupped File System for Primary Workloads.
- Source :
- IEEE Transactions on Parallel & Distributed Systems. Mar2017, Vol. 28 Issue 3, p863-876. 14p.
- Publication Year :
- 2017
Abstract
- Primary storage dedup is difficult to be accomplished because of challenges to achieve low IO latency and high throughput while eliminating data redundancy effectively in the critical IO Path. In this paper, we design and implement the PDFS, a partially dedupped file system for primary workloads, which is built on a generalized framework using partial data lookup for efficient searching of redundant data in quickly chosen data subsets instead of the whole data. PDFS improves IO latency and throughput systematically by techniques including write path optimization, data dedup parallelization and write order preserving. Such design choices bring dedup to the masses for general primary workloads. Experimental results show that PDFS achieves 74-99 percent of the theoretical maximum dedup ratio with very small or even negative performance degradations compared with main stream file systems without dedup support. Discussions about varied configuring experiences of PDFS are also carried out. [ABSTRACT FROM PUBLISHER]
Details
- Language :
- English
- ISSN :
- 10459219
- Volume :
- 28
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Parallel & Distributed Systems
- Publication Type :
- Academic Journal
- Accession number :
- 121301727
- Full Text :
- https://doi.org/10.1109/TPDS.2016.2594070