TY - JOUR
T1 - Deep reinforcement learning for joint optimization of condition-based maintenance and spare ordering
AU - Hao, Shengang
AU - Zheng, Jun
AU - Yang, Jie
AU - Sun, Haipeng
AU - Zhang, Quanxin
AU - Zhang, Li
AU - Jiang, Nan
AU - Li, Yuanzhang
N1 - Publisher Copyright:
© 2023 Elsevier Inc.
PY - 2023/7
Y1 - 2023/7
AB - Condition-based maintenance (CBM) policies avoid premature or overdue maintenance, reducing system failures and maintenance costs. Most existing CBM studies cannot overcome the curse of dimensionality in complex multi-component systems, and only a few account for maintenance-resource constraints when searching for the optimal policy, which limits their practical applicability. This paper studies the joint optimization of the CBM policy and spare-component inventory for a multi-component system with large state and action spaces. We model the problem as a Markov decision process and propose an improved deep reinforcement learning algorithm based on a stochastic policy and the actor-critic framework, in which factorization decomposes the system action into a linear combination of each component's action. Experimental results show that the proposed algorithm achieves better time performance and lower system cost than benchmark algorithms: its training time is only 28.5% and 9.12% of that of the PPO and DQN algorithms, while the corresponding system cost is reduced by 17.39% and 27.95%, respectively. The algorithm also scales well and is suitable for Markov decision problems with large state and action spaces.
KW - Actor-critic framework
KW - Condition-based maintenance
KW - Deep reinforcement learning
KW - Markov decision process
KW - Stochastic policy
UR - http://www.scopus.com/inward/record.url?scp=85150448619&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2023.03.064
DO - 10.1016/j.ins.2023.03.064
M3 - Article
AN - SCOPUS:85150448619
SN - 0020-0255
VL - 634
SP - 85
EP - 100
JO - Information Sciences
JF - Information Sciences
ER -