TY - JOUR
T1 - Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays
AU - Xu, Yong
AU - Wu, Zheng-Guang
AU - Che, Wei-Wei
AU - Meng, Deyuan
N1 - Publisher Copyright:
© 2023, Science China Press.
PY - 2023/7
Y1 - 2023/7
N2 - This paper addresses the optimal output synchronization control problem for heterogeneous multiagent systems (HMASs) subject to nonidentical communication delays via a reinforcement learning method. In contrast to existing studies, which assume that the leader’s precise model is globally or distributively accessible to all or some of the followers, here the leader’s precise dynamical model is entirely inaccessible to all followers. A data-based learning algorithm is first proposed to reconstruct the leader’s unknown system matrix online. A distributed predictor subject to communication delays is further devised to estimate the leader’s state, where the interaction delays are allowed to be nonidentical. Then, a learning-based local controller, together with a discounted performance function, is designed to achieve optimal output synchronization. Bellman equations and game algebraic Riccati equations are constructed, and a model-based reinforcement learning (RL) algorithm is developed to learn the optimal solution online without solving the regulator equations; this is followed by a model-free off-policy RL algorithm that relaxes the model-based algorithm’s requirement of knowledge of all agents’ dynamics. The optimal tracking control of HMASs subject to unknown leader dynamics and communication delays is shown to be solvable under the proposed RL algorithms. Finally, the effectiveness of the theoretical analysis is verified by numerical simulations.
KW - HMAS
KW - RL
KW - communication delays
KW - heterogeneous multiagent systems
KW - optimal output synchronization
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85163853335&partnerID=8YFLogxK
U2 - 10.1007/s11432-022-3729-7
DO - 10.1007/s11432-022-3729-7
M3 - Article
AN - SCOPUS:85163853335
SN - 1674-733X
VL - 66
JO - Science China Information Sciences
JF - Science China Information Sciences
IS - 7
M1 - 170203
ER -