TY - JOUR
T1 - Optimizing Data Center Energy Efficiency via Event-Driven Deep Reinforcement Learning
AU - Ran, Yongyi
AU - Zhou, Xin
AU - Hu, Han
AU - Wen, Yonggang
N1 - Publisher Copyright:
© 2008-2012 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
N2 - To reduce the skyrocketing energy consumption of data centers, the prevailing approaches adopt a time-driven manner to control IT and cooling subsystems. These methods suffer from highly dynamic system states, complex action spaces, and the risk of instability caused by frequent and unnecessary control operations. To tackle these problems, we propose a novel event-driven control paradigm and an optimization algorithm under the deep reinforcement learning (DRL) framework. The principle is to make decisions based on certain critical events (e.g., overheating) rather than on fixed periodic control. Specifically, we design an event-driven optimization framework to trigger control operations. Then, we present several models to describe the IT and cooling subsystems, and mathematically define events to capture four types of prior factors that impact system performance. Furthermore, we develop an event-driven DRL (E-DRL) optimization algorithm to dispatch jobs and regulate cooling facilities for energy efficiency. Using two different types of real workload traces, we conduct extensive experiments to demonstrate that: 1) E-DRL reduces the number of regulating decisions by ∼95% while achieving comparable or even better energy efficiency than the state-of-the-art algorithm; and 2) E-DRL can adapt the control frequency to changing operational conditions and diverse workloads.
AB - To reduce the skyrocketing energy consumption of data centers, the prevailing approaches adopt a time-driven manner to control IT and cooling subsystems. These methods suffer from highly dynamic system states, complex action spaces, and the risk of instability caused by frequent and unnecessary control operations. To tackle these problems, we propose a novel event-driven control paradigm and an optimization algorithm under the deep reinforcement learning (DRL) framework. The principle is to make decisions based on certain critical events (e.g., overheating) rather than on fixed periodic control. Specifically, we design an event-driven optimization framework to trigger control operations. Then, we present several models to describe the IT and cooling subsystems, and mathematically define events to capture four types of prior factors that impact system performance. Furthermore, we develop an event-driven DRL (E-DRL) optimization algorithm to dispatch jobs and regulate cooling facilities for energy efficiency. Using two different types of real workload traces, we conduct extensive experiments to demonstrate that: 1) E-DRL reduces the number of regulating decisions by ∼95% while achieving comparable or even better energy efficiency than the state-of-the-art algorithm; and 2) E-DRL can adapt the control frequency to changing operational conditions and diverse workloads.
KW - Data center
KW - deep reinforcement learning
KW - energy efficiency
KW - event-driven optimization
UR - http://www.scopus.com/inward/record.url?scp=85126303579&partnerID=8YFLogxK
U2 - 10.1109/TSC.2022.3157145
DO - 10.1109/TSC.2022.3157145
M3 - Article
AN - SCOPUS:85126303579
SN - 1939-1374
VL - 16
SP - 1296
EP - 1309
JO - IEEE Transactions on Services Computing
JF - IEEE Transactions on Services Computing
IS - 2
ER -