Optimizing the restoration performance of deduplication systems through an energy-saving data layout

Fang Yan, Xi Yang, Jiamou Liu, Heng Liang Tang, Yu An Tan, Yuan Zhang Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Data deduplication is an important data compression technique that removes copies of repeated data to improve storage utilization, but it introduces security and privacy risks, since sensitive user data are exposed to both insider and outsider attacks. A distinct negative factor for the performance of the technique is data fragmentation, which not only slows down the restoration process but also leads to massive power consumption. In this paper, we address this problem from the perspective of data layout. The core of our method is a novel RAID-5-based cross grouping data layout (CGDL). We introduce a selective deduplication algorithm (SDD) to perform data replication and restoration. We also propose a new CGDL-based disk scheduling algorithm (LDP) that predicts location dependence and saves energy by eliminating redundant disk read/write operations. We evaluate our method on the Linux MD (multiple device) driver modules. The experiments show that, under a 10-disk, 3-group storage configuration, our method improves restoration efficiency by 20% with only a 7.6% reduction in the deduplication ratio, while reducing power consumption by 23%.
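
The link between deduplication and fragmentation can be illustrated with a minimal sketch. It is not the paper's SDD, CGDL, or LDP: the fixed 4-byte chunk size, the SHA-256 fingerprints, and the helper names (`deduplicate`, `restore`, `regions_touched`) are all assumptions made for illustration. Each unique chunk is stored once, in the region of the backup that first wrote it, so a later backup that reuses old chunks must read from several regions when it is restored.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to trace (assumption)


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint used to detect duplicate chunks."""
    return hashlib.sha256(chunk).hexdigest()


def deduplicate(backups, chunk_size=CHUNK_SIZE):
    """Store each unique chunk once.

    Returns (store, recipes):
      store   maps fingerprint -> (index of the backup that first wrote it, chunk bytes)
      recipes holds, per backup, the ordered fingerprints needed to rebuild it
    """
    store = {}
    recipes = []
    for index, data in enumerate(backups):
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            fp = fingerprint(chunk)
            if fp not in store:  # only new content consumes storage
                store[fp] = (index, chunk)
            recipe.append(fp)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild a backup by concatenating its chunks in recipe order."""
    return b"".join(store[fp][1] for fp in recipe)


def regions_touched(store, recipe):
    """Count the distinct storage regions a restore must read from.

    A higher count is a rough, hypothetical proxy for fragmentation.
    """
    return len({store[fp][0] for fp in recipe})


backups = [b"AAAABBBBCCCC", b"AAAADDDDCCCC", b"EEEEDDDDBBBB"]
store, recipes = deduplicate(backups)
for i, recipe in enumerate(recipes):
    print(i, restore(store, recipe), regions_touched(store, recipe))
```

In this toy input the nine logical chunks collapse to five stored chunks, while the newest backup draws its chunks from three different regions. That is the trade-off the abstract describes: a selective scheme gives up a little deduplication ratio to keep the data needed for a restore closer together.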

Original language: English
Pages (from-to): 461-471
Number of pages: 11
Journal: Annales des Telecommunications/Annals of Telecommunications
Volume: 74
Issue number: 7-8
Publication status: Published - 1 Aug 2019

Keywords

  • Data deduplication
  • Data layout
  • Data restoration
  • Energy saving
