新型分布式计算系统中的异构任务调度框架

doi:10.13328/j.cnki.jos.006451

Translated title of the contribution: Heterogeneous Task Scheduling Framework in Emerging Distributed Computing Systems

Rui Qi Liu, Bo Yang Li^*, Yu Jin Gao, Chang Sheng Li, Heng Tai Zhao, Fu Sheng Jin, Rong Hua Li, Guo Ren Wang

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

With the rapid development of big data and machine learning, distributed big data computing engines designed for machine learning have emerged. These systems support both batch distributed learning and incremental learning and verification, with low latency and high performance. However, some of them adopt a random task scheduling strategy that ignores performance differences among nodes, which easily leads to load imbalance and performance degradation. Moreover, scheduling fails for tasks whose resource requirements cannot be met. To address these problems, a heterogeneous task scheduling framework is proposed, which ensures that tasks can be executed, and executed efficiently. Specifically, for the task scheduling module, the framework proposes two strategies built around the heterogeneous computing resources of nodes: a probabilistic random scheduling strategy, resource-Pick_kx, and a deterministic smooth weighted round-robin algorithm. The resource-Pick_kx algorithm computes a probability for each node from its performance and schedules tasks randomly according to these probabilities: the higher a node's performance, the higher its probability, and the more likely a task is scheduled to it. The smooth weighted round-robin algorithm initially sets weights according to node performance and adjusts them smoothly during scheduling, so that each task is scheduled to the node with the highest performance. In addition, for scenarios where available resources do not meet a task's requirements, a container-based vertical expansion mechanism is proposed that customizes task resources, creates nodes that join the cluster, and schedules the task again. The framework is evaluated experimentally on benchmarks and public data sets. Compared with the current strategy, the proposed framework improves performance by 10% to 20%.
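
The two scheduling ideas named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption: the probabilistic picker simply draws a node with probability proportional to a hypothetical performance score (it is not the paper's resource-Pick_kx), and the round-robin class follows the widely used smooth weighted round-robin scheme (each node's running weight grows by its static weight, the largest running weight wins and is reduced by the sum of all weights), which may differ from the authors' variant. Node names and scores are made up for illustration.

```python
import random
from typing import Dict, Optional


def pick_by_probability(scores: Dict[str, float],
                        rng: Optional[random.Random] = None) -> str:
    """Draw one node at random, with probability proportional to its score.

    `scores` maps a (hypothetical) node name to a positive performance score.
    """
    rng = rng or random.Random()
    nodes = list(scores)
    weights = [scores[name] for name in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


class SmoothWeightedRoundRobin:
    """Deterministic smooth weighted round-robin over a fixed set of nodes."""

    def __init__(self, weights: Dict[str, int]) -> None:
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty map of positive values")
        self.weights = dict(weights)                  # static weight per node
        self.current = {name: 0 for name in weights}  # running weight per node
        self.total = sum(weights.values())

    def next(self) -> str:
        # Step 1: every node's running weight grows by its static weight.
        for name, weight in self.weights.items():
            self.current[name] += weight
        # Step 2: the node with the largest running weight is selected
        # (ties go to the node that appears first in the mapping).
        chosen = max(self.current, key=self.current.get)
        # Step 3: the selected node is pushed back by the total weight,
        # which spreads its turns out instead of clustering them.
        self.current[chosen] -= self.total
        return chosen


if __name__ == "__main__":
    scheduler = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
    print([scheduler.next() for _ in range(7)])
    print(pick_by_probability({"fast": 9.0, "slow": 1.0}, random.Random(0)))
```

With weights 5, 1 and 1, one full cycle of seven picks from this sketch is a, a, b, a, c, a, a: node a receives five of the seven tasks, but its turns are interleaved with the others rather than issued back to back.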

Translated title of the contribution	Heterogeneous Task Scheduling Framework in Emerging Distributed Computing Systems
Original language	Chinese (Traditional)
Pages (from-to)	1005-1017
Number of pages	13
Journal	Ruan Jian Xue Bao/Journal of Software
Volume	33
Issue number	3
DOIs	https://doi.org/10.13328/j.cnki.jos.006451
Publication status	Published - Mar 2022

Cite this

@article{4a12a4292d3c4cd1b51f55f69d3511bd,

title = "新型分布式计算系统中的异构任务调度框架",

abstract = "With the rapid development of big data and machine learning, distributed big data computing engines designed for machine learning have emerged. These systems support both batch distributed learning and incremental learning and verification, with low latency and high performance. However, some of them adopt a random task scheduling strategy that ignores performance differences among nodes, which easily leads to load imbalance and performance degradation. Moreover, scheduling fails for tasks whose resource requirements cannot be met. To address these problems, a heterogeneous task scheduling framework is proposed, which ensures that tasks can be executed, and executed efficiently. Specifically, for the task scheduling module, the framework proposes two strategies built around the heterogeneous computing resources of nodes: a probabilistic random scheduling strategy, resource-Pick_kx, and a deterministic smooth weighted round-robin algorithm. The resource-Pick_kx algorithm computes a probability for each node from its performance and schedules tasks randomly according to these probabilities: the higher a node's performance, the higher its probability, and the more likely a task is scheduled to it. The smooth weighted round-robin algorithm initially sets weights according to node performance and adjusts them smoothly during scheduling, so that each task is scheduled to the node with the highest performance. In addition, for scenarios where available resources do not meet a task's requirements, a container-based vertical expansion mechanism is proposed that customizes task resources, creates nodes that join the cluster, and schedules the task again. The framework is evaluated experimentally on benchmarks and public data sets. Compared with the current strategy, the proposed framework improves performance by 10% to 20%.",

keywords = "Autoscale, Distributed computing, Heterogeneous task, Load balance, Task scheduling",

author = "Liu, {Rui Qi} and Li, {Bo Yang} and Gao, {Yu Jin} and Li, {Chang Sheng} and Zhao, {Heng Tai} and Jin, {Fu Sheng} and Li, {Rong Hua} and Wang, {Guo Ren}",

year = "2022",

month = mar,

doi = "10.13328/j.cnki.jos.006451",

language = "Chinese (Traditional)",

volume = "33",

pages = "1005--1017",

journal = "Ruan Jian Xue Bao/Journal of Software",

issn = "1000-9825",

publisher = "Chinese Academy of Sciences",

number = "3",

}

TY - JOUR

T1 - 新型分布式计算系统中的异构任务调度框架

AU - Liu, Rui Qi

AU - Li, Bo Yang

AU - Gao, Yu Jin

AU - Li, Chang Sheng

AU - Zhao, Heng Tai

AU - Jin, Fu Sheng

AU - Li, Rong Hua

AU - Wang, Guo Ren

PY - 2022/3

Y1 - 2022/3

N2 - With the rapid development of big data and machine learning, distributed big data computing engines designed for machine learning have emerged. These systems support both batch distributed learning and incremental learning and verification, with low latency and high performance. However, some of them adopt a random task scheduling strategy that ignores performance differences among nodes, which easily leads to load imbalance and performance degradation. Moreover, scheduling fails for tasks whose resource requirements cannot be met. To address these problems, a heterogeneous task scheduling framework is proposed, which ensures that tasks can be executed, and executed efficiently. Specifically, for the task scheduling module, the framework proposes two strategies built around the heterogeneous computing resources of nodes: a probabilistic random scheduling strategy, resource-Pick_kx, and a deterministic smooth weighted round-robin algorithm. The resource-Pick_kx algorithm computes a probability for each node from its performance and schedules tasks randomly according to these probabilities: the higher a node's performance, the higher its probability, and the more likely a task is scheduled to it. The smooth weighted round-robin algorithm initially sets weights according to node performance and adjusts them smoothly during scheduling, so that each task is scheduled to the node with the highest performance. In addition, for scenarios where available resources do not meet a task's requirements, a container-based vertical expansion mechanism is proposed that customizes task resources, creates nodes that join the cluster, and schedules the task again. The framework is evaluated experimentally on benchmarks and public data sets. Compared with the current strategy, the proposed framework improves performance by 10% to 20%.

AB - With the rapid development of big data and machine learning, distributed big data computing engines designed for machine learning have emerged. These systems support both batch distributed learning and incremental learning and verification, with low latency and high performance. However, some of them adopt a random task scheduling strategy that ignores performance differences among nodes, which easily leads to load imbalance and performance degradation. Moreover, scheduling fails for tasks whose resource requirements cannot be met. To address these problems, a heterogeneous task scheduling framework is proposed, which ensures that tasks can be executed, and executed efficiently. Specifically, for the task scheduling module, the framework proposes two strategies built around the heterogeneous computing resources of nodes: a probabilistic random scheduling strategy, resource-Pick_kx, and a deterministic smooth weighted round-robin algorithm. The resource-Pick_kx algorithm computes a probability for each node from its performance and schedules tasks randomly according to these probabilities: the higher a node's performance, the higher its probability, and the more likely a task is scheduled to it. The smooth weighted round-robin algorithm initially sets weights according to node performance and adjusts them smoothly during scheduling, so that each task is scheduled to the node with the highest performance. In addition, for scenarios where available resources do not meet a task's requirements, a container-based vertical expansion mechanism is proposed that customizes task resources, creates nodes that join the cluster, and schedules the task again. The framework is evaluated experimentally on benchmarks and public data sets. Compared with the current strategy, the proposed framework improves performance by 10% to 20%.

KW - Autoscale

KW - Distributed computing

KW - Heterogeneous task

KW - Load balance

KW - Task scheduling

UR - http://www.scopus.com/inward/record.url?scp=85126992421&partnerID=8YFLogxK

U2 - 10.13328/j.cnki.jos.006451

DO - 10.13328/j.cnki.jos.006451

M3 - Article

AN - SCOPUS:85126992421

SN - 1000-9825

VL - 33

SP - 1005

EP - 1017

JO - Ruan Jian Xue Bao/Journal of Software

JF - Ruan Jian Xue Bao/Journal of Software

IS - 3

ER -
