PEFS: AI-Driven Prediction Based Energy-Aware Fault-Tolerant Scheduling Scheme for Cloud Data Center

  • Avinab Marahatta*
  • Qin Xin
  • Ce Chi
  • Fa Zhang
  • Zhiyong Liu
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Cloud data centers (CDCs) have become increasingly widespread in recent years with the growing popularity of cloud computing and high-performance computing. Because of the multi-step computation of data streams and heterogeneous task dependencies, task failures occur frequently, resulting in poor user experience and additional energy consumption. To reduce both task execution failures and energy consumption, we propose a novel AI-driven, energy-aware, proactive fault-tolerant scheduling scheme for CDCs. First, a machine-learning-based prediction model is trained to classify arriving tasks as 'failure-prone tasks' or 'non-failure-prone tasks' according to their predicted failure rate. Then, two efficient scheduling mechanisms are proposed to allocate the two types of tasks to the most appropriate hosts in a CDC. A vector reconstruction method is developed to construct super tasks from failure-prone tasks, and these super tasks and the non-failure-prone tasks are scheduled separately to the most suitable physical hosts. All tasks are scheduled in an earliest-deadline-first manner. Our evaluation results show that the proposed scheme can intelligently predict task failure, achieves better fault tolerance, and reduces total energy consumption more than existing schemes.
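
The first and last steps described in the abstract (splitting tasks by predicted failure rate, then ordering each group earliest-deadline-first) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Task` class, the `classify` and `edf_order` helpers, the example values, and the 0.5 threshold are all hypothetical, and the predicted failure rates are supplied by hand rather than produced by a trained model. The vector reconstruction step that builds super tasks is not shown because the abstract does not describe it in enough detail.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical cut-off for illustration; the abstract does not state a threshold.
FAILURE_THRESHOLD = 0.5


@dataclass(order=True)
class Task:
    # Only the deadline takes part in ordering, so a heap of tasks is an EDF queue.
    deadline: float
    name: str = field(compare=False)
    predicted_failure_rate: float = field(compare=False)


def classify(tasks, threshold=FAILURE_THRESHOLD):
    """Split tasks into (failure-prone, non-failure-prone) by predicted failure rate."""
    failure_prone, non_failure_prone = [], []
    for task in tasks:
        if task.predicted_failure_rate >= threshold:
            failure_prone.append(task)
        else:
            non_failure_prone.append(task)
    return failure_prone, non_failure_prone


def edf_order(tasks):
    """Return the tasks sorted earliest-deadline-first using a min-heap."""
    heap = list(tasks)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


# Made-up example tasks: (deadline, name, predicted failure rate).
tasks = [
    Task(30.0, "t1", 0.8),
    Task(10.0, "t2", 0.1),
    Task(20.0, "t3", 0.7),
    Task(5.0, "t4", 0.2),
]

failure_prone, non_failure_prone = classify(tasks)
fp_names = [t.name for t in edf_order(failure_prone)]
nfp_names = [t.name for t in edf_order(non_failure_prone)]

print(fp_names)   # ['t3', 't1']
print(nfp_names)  # ['t4', 't2']
```

In a scheme like the one described, each ordered group would then be handed to its own host-selection mechanism; that part is omitted here.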

Original language: English
Pages (from-to): 655-666
Number of pages: 12
Journal: IEEE Transactions on Sustainable Computing
Volume: 6
Issue number: 4
Publication status: Published - 2021
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Cloud computing
  • Cloud data center
  • Deep neural network
  • Energy-efficiency
  • Fault-tolerance
  • Prediction
  • Scheduling
  • Task failure
