DAPIC: Dynamic adjustment method of parallelism for iterative computing in Flink

Hangxu Ji, Yongjiao Sun*, Xinran Su, Yuwei Fu, Ye Yuan, Guoren Wang, Qi Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The Apache Flink distributed computing system demonstrates significant advantages in iterative computing. Setting an appropriate parallelism for each operator is crucial to further improving the efficiency of Flink's iterative jobs and optimizing cluster resource utilization. However, the user-defined functions encapsulated within operators cannot be analyzed directly, which makes it hard to optimize parallelism based on computation logic. Moreover, the diverse and complex hardware resources in distributed computing clusters exacerbate this issue, making it difficult to model the relationship between optimal parallelism and computational resources. To address these challenges, this paper proposes the Dynamic Parallelism Adjustment (DAPIC) framework. For batch processing, DAPIC introduces a mechanism that dynamically adjusts operator parallelism using an enhanced GCN + GRU model. For stream processing, it incorporates a resource-aware parallelism adjustment mechanism that takes the remaining available computational resources into account. Experimental results show that the DAPIC framework improves the efficiency of iterative jobs by 38.94% to 138.82% in batch processing and by 28.85% to 31.08% in stream processing. In heterogeneous bandwidth environments, the advantages of the DAPIC framework become even more pronounced, achieving efficiency improvements exceeding 1.5 times. Furthermore, the framework effectively reduces TaskSlot occupancy, resulting in resource savings of 24.61% to 75.78%.
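As a rough illustration of what "resource-aware" adjustment can mean, the sketch below caps a requested operator parallelism at the number of TaskSlots still free. This is not the DAPIC algorithm, which the abstract does not specify. The function name `clamp_parallelism` and its parameters are hypothetical and chosen only for this example.

```python
def clamp_parallelism(requested: int, free_slots: int, minimum: int = 1) -> int:
    """Hypothetical helper: bound a requested parallelism by the free TaskSlots.

    Assumption for illustration only: one parallel subtask occupies one slot.
    """
    if free_slots < minimum:
        # Not enough free slots to run even the minimum number of subtasks.
        raise ValueError("not enough free TaskSlots")
    # Never exceed the free slots, and never drop below the minimum.
    return max(minimum, min(requested, free_slots))


if __name__ == "__main__":
    # A request larger than the free slots is capped at the free slots;
    # a smaller request passes through unchanged.
    print(clamp_parallelism(8, 5))
    print(clamp_parallelism(3, 5))
```

A real policy would also have to weigh the predicted workload and the network bandwidth between nodes, both of which the abstract names as factors.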

Original language: English
Article number: 121803
Journal: Information Sciences
Volume: 699
Publication status: Published - May 2025

Keywords

  • Flink
  • GRU
  • Iterative computing
  • Parallelism
  • Time-dependent subJobGraph
