TY - JOUR
T1 - Data-driven identifier–actor–critic learning for cooperative spacecraft attitude tracking with orientation constraints
AU - Xia, Kewei
AU - Wang, Jianan
AU - Zou, Yao
AU - Gao, Hongbo
AU - Ding, Zhengtao
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2025/3
Y1 - 2025/3
N2 - This paper investigates the cooperative attitude tracking problem for a cluster of spacecraft subject to orientation constraints. In particular, all the involved spacecraft cooperatively adjust their attitudes to track a time-varying reference via local information exchange while keeping those attitudes inside a mandatory orientation zone and outside forbidden orientation zones. A dynamic identifier is first exploited to compensate for the dynamics uncertainty. Next, by integrating the sliding mode with the dynamic identifier, a distributed actor–critic reinforcement learning (RL) control algorithm is designed. Moreover, a data-driven online learning algorithm is proposed for the update of the learning weights, which effectively relaxes the typical persistent excitation (PE) condition to a finite excitation (FE) condition. To handle the orientation constraints, a robust control barrier function (CBF) based quadratic programming optimization is designed. It is shown that the attitude tracking errors are ultimately driven to a small tunable neighborhood of the origin without violating the underlying orientation constraints. Finally, simulation results validate the proposed theoretical results.
AB - This paper investigates the cooperative attitude tracking problem for a cluster of spacecraft subject to orientation constraints. In particular, all the involved spacecraft cooperatively adjust their attitudes to track a time-varying reference via local information exchange while keeping those attitudes inside a mandatory orientation zone and outside forbidden orientation zones. A dynamic identifier is first exploited to compensate for the dynamics uncertainty. Next, by integrating the sliding mode with the dynamic identifier, a distributed actor–critic reinforcement learning (RL) control algorithm is designed. Moreover, a data-driven online learning algorithm is proposed for the update of the learning weights, which effectively relaxes the typical persistent excitation (PE) condition to a finite excitation (FE) condition. To handle the orientation constraints, a robust control barrier function (CBF) based quadratic programming optimization is designed. It is shown that the attitude tracking errors are ultimately driven to a small tunable neighborhood of the origin without violating the underlying orientation constraints. Finally, simulation results validate the proposed theoretical results.
KW - Control barrier function
KW - Distributed control
KW - Orientation constraint
KW - Reinforcement learning
KW - Spacecraft attitude
UR - http://www.scopus.com/inward/record.url?scp=85211045781&partnerID=8YFLogxK
U2 - 10.1016/j.automatica.2024.112035
DO - 10.1016/j.automatica.2024.112035
M3 - Article
AN - SCOPUS:85211045781
SN - 0005-1098
VL - 173
JO - Automatica
JF - Automatica
M1 - 112035
ER -