AdaptiveConfig: Run-time configuration of cluster schedulers for cloud short-running jobs

Rui Han, Zan Zong, Lydia Y. Chen, Siyi Wang, Jianfeng Zhan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

5 Citations (Scopus)

Abstract

Cluster schedulers provide a flexible resource-sharing mechanism for short-running jobs, which make up the majority of cloud jobs. A scheduler's configuration decides how resources are allocated among jobs and is therefore crucial to their performance. Today's cloud platforms usually rely on cluster administrators to set this configuration, which makes it difficult to configure the scheduler optimally so as to minimize the latencies of heterogeneous and dynamically changing jobs in the cloud. In this paper, we introduce AdaptiveConfig, a run-time configurator for cluster schedulers that automatically adapts to the changing workload and resource status. It consists of two components. (1) An estimator that calculates jobs' performance under different configurations and various scheduling scenarios. The key idea is to transform a scheduler's resource allocation mechanisms and their variable influence factors (configuration parameters, scheduling constraints, available resources, and workload status) into business rules and facts in a rule engine, thereby reasoning about these correlated factors in job performance estimation. (2) A run-time optimizer that efficiently searches the configuration space to find the optimal configuration for the current workload. We implemented AdaptiveConfig on the popular YARN Capacity and Fair schedulers and demonstrate its effectiveness using workloads of Facebook jobs: it reduces latencies by 2.22 times (and up to 4.50 times) with low optimization overheads.
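
The estimate-then-search loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy estimator stands in for the paper's rule-engine model and assumes each queue runs its jobs in FIFO order on a fixed share of the cluster. The optimizer is a brute-force grid search rather than the paper's efficient search. All names (`Job`, `estimate_mean_latency`, `best_shares`) and the container-seconds unit are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Job:
    queue: str   # queue the job is submitted to
    work: float  # total work in container-seconds (assumed unit)


def estimate_mean_latency(jobs, shares, total_containers):
    """Toy estimator: queue q owns shares[q] * total_containers containers
    and runs its jobs one after another on that allocation."""
    latencies = []
    for queue, share in shares.items():
        containers = share * total_containers
        elapsed = 0.0
        for job in (j for j in jobs if j.queue == queue):
            if containers <= 0:
                return float("inf")  # a queue with jobs but no capacity never finishes
            elapsed += job.work / containers
            latencies.append(elapsed)
    return sum(latencies) / len(latencies) if latencies else 0.0


def best_shares(jobs, queues, total_containers, step=0.1):
    """Exhaustive search over capacity splits in multiples of `step`.
    Returns (shares, estimated_mean_latency) for the best split found."""
    levels = round(1 / step)
    best = None
    for combo in product(range(levels + 1), repeat=len(queues)):
        if sum(combo) != levels:
            continue  # shares must add up to 100% of the cluster
        shares = {q: c / levels for q, c in zip(queues, combo)}
        latency = estimate_mean_latency(jobs, shares, total_containers)
        if best is None or latency < best[1]:
            best = (shares, latency)
    return best


if __name__ == "__main__":
    workload = [Job("a", 100.0), Job("a", 100.0), Job("b", 50.0)]
    shares, latency = best_shares(workload, ["a", "b"], total_containers=10)
    print(shares, round(latency, 2))
```

A run-time configurator in this style would re-run `best_shares` whenever the observed workload or the available resources change, and push the chosen shares to the scheduler. That step is omitted here because it depends on the scheduler's own configuration interface.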

Original language: English
Title of host publication: Proceedings - 2018 IEEE 38th International Conference on Distributed Computing Systems, ICDCS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1519-1526
Number of pages: 8
ISBN (electronic): 9781538668719
Publication status: Published - 19 Jul 2018
Externally published
Event: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018 - Vienna, Austria
Duration: 2 Jul 2018 to 5 Jul 2018

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
Volume: 2018-July

Conference

Conference: 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018
Country/Territory: Austria
City: Vienna
Period: 2/07/18 to 5/07/18
