LDPP: A Learned Directory Placement Policy in Distributed File Systems

Yuanzhang Wang, Fengkui Yang, Ji Zhang, Chunhua Li*, Ke Zhou, Chong Liu, Zhuo Cheng, Wei Fang, Jinhu Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Load balance is a critical problem in distributed file systems. Previous works focus on how to distribute data evenly on different nodes or storage devices from the perspective of file level, but neglect to effectively take advantage of the directory's locality and the long duration of the directory's hotness, which may affect the degree of balance and cause performance degradation. To overcome this shortcoming, in this paper, we propose a learning-based directory placement policy, called LDPP, which determines the data layout by predicting the load. We first establish a relationship between directory request characteristics and state information to predict the state information of the directory (storage capacity, bandwidth, and IOPS). Then, the new directory is placed on different nodes in a multi-dimensional manner based on the Manhattan distance according to the predicted multidimensional state information. In addition, we also take into account the trade-off between the same category directory classified by the load prediction module and the peer directories and explore their influence on the balance. Extensive experiments demonstrate that LDPP not only efficiently alleviates load imbalance and increases the utilization of the resources but also improves DFS performance in practice, which can reduce service latency by up to 36% and increase IOPS and bandwidth by 8% and 9%, respectively.
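
As a rough illustration of the placement step the abstract describes, the sketch below picks a node for a new directory from its predicted three-dimensional load (capacity, bandwidth, IOPS) using the Manhattan distance. The abstract does not say what the distance is measured against, so this sketch assumes the target is the cluster-wide mean load after placement and that all values are normalized to [0, 1]. The names `Load`, `manhattan`, and `place_directory` are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Hypothetical load vector: (capacity, bandwidth, IOPS), each normalized to [0, 1].
Load = Tuple[float, float, float]


def manhattan(a: Load, b: Load) -> float:
    """Manhattan (L1) distance between two load vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def place_directory(nodes: Dict[str, Load], predicted: Load) -> str:
    """Return the node whose load, after adding the new directory,
    is closest to the cluster-wide mean load.

    Assumption (not from the paper): the balance target is the per-dimension
    mean load across all nodes once the new directory has been placed.
    """
    n = len(nodes)
    # Per-dimension mean load with the new directory included.
    target = tuple(
        (sum(load[d] for load in nodes.values()) + predicted[d]) / n
        for d in range(3)
    )

    def cost(name: str) -> float:
        # Load on this node if it received the new directory.
        after = tuple(nodes[name][d] + predicted[d] for d in range(3))
        return manhattan(after, target)

    return min(nodes, key=cost)


if __name__ == "__main__":
    cluster = {
        "a": (0.8, 0.2, 0.2),  # capacity-heavy node
        "b": (0.2, 0.8, 0.2),  # bandwidth-heavy node
        "c": (0.2, 0.2, 0.2),  # lightly loaded node
    }
    print(place_directory(cluster, (0.1, 0.1, 0.1)))
```

In this toy cluster the lightly loaded node is selected, since adding the directory there leaves it nearest the mean in all three dimensions. A real policy would also need the load predictor and the category/peer-directory trade-off mentioned in the abstract, neither of which is modeled here.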

Original language: English
Title of host publication: 51st International Conference on Parallel Processing, ICPP 2022 - Main Conference Proceedings
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450397339
Publication status: Published - 29 Aug 2022
Externally published: Yes
Event: 51st International Conference on Parallel Processing, ICPP 2022 - Virtual, Online, France
Duration: 29 Aug 2022 → 1 Sept 2022

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 51st International Conference on Parallel Processing, ICPP 2022
Country/Territory: France
City: Virtual, Online
Period: 29/08/22 → 1/09/22

Keywords

  • DFS
  • directory placement
  • load balance
  • machine learning
