基于多维度特征融合的云工作流任务执行时间预测方法

Hui Fang Li; Jiang Hang Huang; Guang Hao Xu; Yuan Qing Xia

doi:10.16383/j.aas.c210123

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Task runtime estimation is a prerequisite for workflow scheduling in cloud data centers. However, existing runtime prediction methods for workflow activities fail to extract categorical and numerical features effectively. In this paper, we propose a multi-dimensional feature fusion-based runtime prediction approach for workflow tasks. First, we construct a stacked residual recurrent neural network with an attention mechanism that maps categorical data from a high-dimensional sparse space to a low-dimensional dense space, strengthening the network's ability to parse categorical data for categorical feature extraction. Second, extreme gradient boosting is introduced to discretize the numerical data and to enhance the nonlinear representation capability for numerical features by sparsely processing the input vectors within the dense space. Third, we design a heterogeneous multi-dimensional feature fusion strategy and blend the extracted features with the original inputs to mine comprehensive knowledge for runtime prediction. Finally, based on the resulting multi-dimensional fused features, a prediction model is developed that fully utilizes these features and their corresponding hidden knowledge to forecast the runtimes of cloud workflow tasks accurately. To verify the effectiveness and superiority of the proposed method, we conduct extensive experiments on a cluster dataset from a real cloud data center. The experimental results show that our approach outperforms the existing algorithms and can be applied to big data-driven runtime prediction for workflow activities in the cloud.
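
The discretization and fusion steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that "discretizing" numerical data means one-hot encoding the leaf each sample reaches in every boosted tree, and that "blending" means plain concatenation; neither detail is stated in the abstract. The `TREES` table, the `leaf_one_hot` and `fuse` helpers, and all values are hypothetical, and each tree is reduced to a list of split thresholds on a single feature.

```python
from bisect import bisect_right

# Hypothetical stand-in for trained boosted trees (assumption, not from the paper):
# each "tree" is reduced to (feature index, sorted split thresholds).
TREES = [
    (0, [2.0, 8.0]),  # two splits on feature 0 -> 3 leaves
    (1, [0.5]),       # one split on feature 1 -> 2 leaves
]


def leaf_one_hot(numeric):
    """Discretize numerical inputs by one-hot encoding the leaf reached in each tree."""
    encoded = []
    for feature, thresholds in TREES:
        leaf = bisect_right(thresholds, numeric[feature])  # index of the leaf reached
        one_hot = [0.0] * (len(thresholds) + 1)
        one_hot[leaf] = 1.0
        encoded.extend(one_hot)
    return encoded


def fuse(categorical_embedding, numeric):
    """Concatenate the dense embedding, the sparse leaf codes, and the original inputs."""
    return list(categorical_embedding) + leaf_one_hot(numeric) + list(numeric)


# A made-up 2-d embedding for the categorical part and two numerical features.
fused = fuse([0.12, -0.40], [5.0, 0.9])
print(fused)
```

In this sketch the fused vector would then be passed to a downstream regressor that predicts the task runtime; the embedding here is a fixed placeholder for the output of the recurrent network the abstract describes.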

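Translated title of the contribution	Multi-dimensional Feature Fusion-based Runtime Prediction Approach for Cloud Workflow Tasks
Original language	Traditional Chinese
Pages (from-to)	67-78
Number of pages	12
Journal	Zidonghua Xuebao/Acta Automatica Sinica
Volume	49
Issue number	1
DOI	https://doi.org/10.16383/j.aas.c210123
Publication status	Published - Jan 2023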

Keywords

Cloud data centers
ensemble learning
execution time prediction
feature fusion
workflows

Cite this

@article{62eff96401c04a27b66ef7b37a2112cd,

title = "基于多维度特征融合的云工作流任务执行时间预测方法",

abstract = "Task runtime estimation is a prerequisite for workflow scheduling in cloud data centers. However, existing runtime prediction methods for workflow activities fail to extract categorical and numerical features effectively. In this paper, we propose a multi-dimensional feature fusion-based runtime prediction approach for workflow tasks. First, we construct a stacked residual recurrent neural network with an attention mechanism that maps categorical data from a high-dimensional sparse space to a low-dimensional dense space, strengthening the network's ability to parse categorical data for categorical feature extraction. Second, extreme gradient boosting is introduced to discretize the numerical data and to enhance the nonlinear representation capability for numerical features by sparsely processing the input vectors within the dense space. Third, we design a heterogeneous multi-dimensional feature fusion strategy and blend the extracted features with the original inputs to mine comprehensive knowledge for runtime prediction. Finally, based on the resulting multi-dimensional fused features, a prediction model is developed that fully utilizes these features and their corresponding hidden knowledge to forecast the runtimes of cloud workflow tasks accurately. To verify the effectiveness and superiority of the proposed method, we conduct extensive experiments on a cluster dataset from a real cloud data center. The experimental results show that our approach outperforms the existing algorithms and can be applied to big data-driven runtime prediction for workflow activities in the cloud.",

keywords = "Cloud data centers, ensemble learning, execution time prediction, feature fusion, workflows",

author = "Li, {Hui Fang} and Huang, {Jiang Hang} and Xu, {Guang Hao} and Xia, {Yuan Qing}",

year = "2023",

month = jan,

doi = "10.16383/j.aas.c210123",

language = "Traditional Chinese",

volume = "49",

pages = "67--78",

journal = "Zidonghua Xuebao/Acta Automatica Sinica",

issn = "0254-4156",

publisher = "Science Press",

number = "1",

}

TY - JOUR

T1 - 基于多维度特征融合的云工作流任务执行时间预测方法

AU - Li, Hui Fang

AU - Huang, Jiang Hang

AU - Xu, Guang Hao

AU - Xia, Yuan Qing

PY - 2023/1

Y1 - 2023/1

N2 - Task runtime estimation is a prerequisite for workflow scheduling in cloud data centers. However, existing runtime prediction methods for workflow activities fail to extract categorical and numerical features effectively. In this paper, we propose a multi-dimensional feature fusion-based runtime prediction approach for workflow tasks. First, we construct a stacked residual recurrent neural network with an attention mechanism that maps categorical data from a high-dimensional sparse space to a low-dimensional dense space, strengthening the network's ability to parse categorical data for categorical feature extraction. Second, extreme gradient boosting is introduced to discretize the numerical data and to enhance the nonlinear representation capability for numerical features by sparsely processing the input vectors within the dense space. Third, we design a heterogeneous multi-dimensional feature fusion strategy and blend the extracted features with the original inputs to mine comprehensive knowledge for runtime prediction. Finally, based on the resulting multi-dimensional fused features, a prediction model is developed that fully utilizes these features and their corresponding hidden knowledge to forecast the runtimes of cloud workflow tasks accurately. To verify the effectiveness and superiority of the proposed method, we conduct extensive experiments on a cluster dataset from a real cloud data center. The experimental results show that our approach outperforms the existing algorithms and can be applied to big data-driven runtime prediction for workflow activities in the cloud.

AB - Task runtime estimation is a prerequisite for workflow scheduling in cloud data centers. However, existing runtime prediction methods for workflow activities fail to extract categorical and numerical features effectively. In this paper, we propose a multi-dimensional feature fusion-based runtime prediction approach for workflow tasks. First, we construct a stacked residual recurrent neural network with an attention mechanism that maps categorical data from a high-dimensional sparse space to a low-dimensional dense space, strengthening the network's ability to parse categorical data for categorical feature extraction. Second, extreme gradient boosting is introduced to discretize the numerical data and to enhance the nonlinear representation capability for numerical features by sparsely processing the input vectors within the dense space. Third, we design a heterogeneous multi-dimensional feature fusion strategy and blend the extracted features with the original inputs to mine comprehensive knowledge for runtime prediction. Finally, based on the resulting multi-dimensional fused features, a prediction model is developed that fully utilizes these features and their corresponding hidden knowledge to forecast the runtimes of cloud workflow tasks accurately. To verify the effectiveness and superiority of the proposed method, we conduct extensive experiments on a cluster dataset from a real cloud data center. The experimental results show that our approach outperforms the existing algorithms and can be applied to big data-driven runtime prediction for workflow activities in the cloud.

KW - Cloud data centers

KW - ensemble learning

KW - execution time prediction

KW - feature fusion

KW - workflows

UR - http://www.scopus.com/inward/record.url?scp=85174407201&partnerID=8YFLogxK

U2 - 10.16383/j.aas.c210123

DO - 10.16383/j.aas.c210123

M3 - Article

AN - SCOPUS:85174407201

SN - 0254-4156

VL - 49

SP - 67

EP - 78

JO - Zidonghua Xuebao/Acta Automatica Sinica

JF - Zidonghua Xuebao/Acta Automatica Sinica

IS - 1

ER -
