TY - JOUR
T1 - MespaConfig: Memory-Sparing Configuration Auto-Tuning for Co-Located In-Memory Cluster Computing Jobs
AU - Zong, Zan
AU - Wen, Lijie
AU - Hu, Xuming
AU - Han, Rui
AU - Qian, Chen
AU - Lin, Li
N1 - Publisher Copyright: © 2008-2012 IEEE.
PY - 2022
Y1 - 2022
AB - Distributed in-memory computing frameworks usually expose many parameters (e.g., the shuffle buffer size) that together form the configuration of each execution. A well-tuned configuration can substantially improve performance. However, to improve resource utilization, jobs often share the same cluster, which causes cluster load to vary dynamically. We observe that this variation in cluster load reduces the effectiveness of configuration tuning. In addition, resource overestimation, a common problem for cluster computing jobs, also occurs during configuration tuning. It is challenging to efficiently find the optimal configuration in a shared cluster while also sparing memory. In this article, we introduce MespaConfig, a job-level configuration optimizer for distributed in-memory computing jobs. Compared with previous work, MespaConfig is both memory-sparing and load-sensitive. We evaluate MespaConfig on six typical Spark programs under different load conditions. The evaluation results show that MespaConfig improves the performance of these programs by up to 12× compared with default configurations. MespaConfig also reduces the memory usage of configurations by up to 41 percent and reduces the optimization time overhead by 10.8× compared with the state-of-the-art approach.
KW - Configuration tuning
KW - co-locate
KW - in-memory computing
KW - memory-sparing
KW - performance optimization
UR - http://www.scopus.com/inward/record.url?scp=85102309709&partnerID=8YFLogxK
U2 - 10.1109/TSC.2021.3063118
DO - 10.1109/TSC.2021.3063118
M3 - Article
AN - SCOPUS:85102309709
SN - 1939-1374
VL - 15
SP - 2883
EP - 2896
JO - IEEE Transactions on Services Computing
JF - IEEE Transactions on Services Computing
IS - 5
ER -