DeepEE: Joint optimization of job scheduling and cooling control for data center energy efficiency using deep reinforcement learning

Yongyi Ran, Han Hu, Xin Zhou, Yonggang Wen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

80 Citations (Scopus)

Abstract

The past decade has witnessed tremendous growth in the power consumption of data centers, driven by the rapid development of cloud computing, big data analytics, and machine learning. Prior approaches that optimize the power consumption of the information technology (IT) system and/or the cooling system always fail to capture the system dynamics or suffer from the complexity of the system state and action spaces. In this paper, we propose a Deep Reinforcement Learning (DRL) based optimization framework, named DeepEE, to improve the energy efficiency of data centers by considering the IT and cooling systems concurrently. In DeepEE, we first propose a PArameterized action space based Deep Q-Network (PADQN) algorithm to solve the hybrid action space problem and jointly optimize job scheduling for the IT system and airflow rate adjustment for the cooling system. Then, a two-time-scale control mechanism is applied in PADQN to coordinate the IT and cooling systems more accurately and efficiently. In addition, to train and evaluate the proposed PADQN safely and quickly, we build a simulation platform that models the dynamics of the IT workload and the cooling system simultaneously. Through extensive real-trace based simulations, we demonstrate that: 1) our algorithm can save up to 15% and 10% of energy consumption compared with the baseline siloed and joint optimization approaches, respectively; 2) our algorithm achieves a more stable performance gain in terms of power consumption by adopting the parameterized action space; and 3) our algorithm leads to a better tradeoff between energy saving and service quality.
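The hybrid action space mentioned in the abstract pairs a discrete choice (where to schedule a job) with a continuous one (an airflow rate). The following is a minimal sketch of how a parameterized-action selection step could look in general, not the paper's PADQN implementation. All names here (`select_hybrid_action`, `toy_param`, `toy_q`), the state layout, and the toy scoring formulas are hypothetical stand-ins for the learned networks, which the abstract does not describe.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

State = Sequence[float]  # assumption: one utilization value per server


def select_hybrid_action(
    state: State,
    num_servers: int,
    param_fn: Callable[[State, int], float],
    q_fn: Callable[[State, int, float], float],
    epsilon: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Tuple[int, float]:
    """Return (server index, airflow rate) for one scheduling decision."""
    rng = rng or random.Random()
    # Propose one continuous parameter for every discrete choice.
    params = [param_fn(state, k) for k in range(num_servers)]
    if rng.random() < epsilon:
        # Explore: pick a server uniformly at random.
        k = rng.randrange(num_servers)
    else:
        # Exploit: score each (server, parameter) pair and take the best.
        q_values = [q_fn(state, k, params[k]) for k in range(num_servers)]
        k = max(range(num_servers), key=q_values.__getitem__)
    return k, params[k]


def toy_param(state: State, k: int) -> float:
    # Hypothetical rule: busier servers get a higher airflow rate, capped at 1.
    return min(1.0, 0.2 + 0.6 * state[k])


def toy_q(state: State, k: int, airflow: float) -> float:
    # Hypothetical value: negative of a crude power estimate (load + fan cost).
    return -(state[k] + 0.5 * airflow)


if __name__ == "__main__":
    utilization = [0.9, 0.1, 0.5]
    print(select_hybrid_action(utilization, len(utilization), toy_param, toy_q))
```

In a trained agent, `param_fn` and `q_fn` would be neural networks rather than fixed formulas. The structure of the selection step is the point of the sketch: a continuous parameter is proposed for each discrete option, and the discrete option is then chosen by comparing the resulting values.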

Original language: English
Title of host publication: Proceedings - 2019 39th IEEE International Conference on Distributed Computing Systems, ICDCS 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 645-655
Number of pages: 11
ISBN (Electronic): 9781728125190
Publication status: Published - Jul 2019
Event: 39th IEEE International Conference on Distributed Computing Systems, ICDCS 2019 - Richardson, United States
Duration: 7 Jul 2019 → 9 Jul 2019

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
Volume: 2019-July

Conference

Conference: 39th IEEE International Conference on Distributed Computing Systems, ICDCS 2019
Country/Territory: United States
City: Richardson
Period: 7/07/19 → 9/07/19

Keywords

  • Cooling control
  • Data center
  • Deep reinforcement learning
  • Energy efficiency
  • Job scheduling
