
Distributed Estimation and Data-Driven Formation Control of Air-Ground Systems via Efficient Reinforcement Learning

Jia Xiu Yang, Yong Xu, Zheng Guang Wu, Yongfu Li

Research output: Contribution to journal › Article › peer-review

Abstract

This article investigates optimal distributed formation control for heterogeneous air-ground vehicle systems using a data-efficient, off-policy reinforcement learning algorithm. First, a distributed predefined-time observer is proposed under a time-varying formation function to estimate the leader's states, which serves as the foundation for formation tracking control. A model-based formation control policy is then developed and learned to achieve optimal formation tracking performance. To alleviate the substantial computational burden of existing learning algorithms, we propose a purely data-driven, off-policy learning algorithm that uses measured system data to learn the optimal control policy and eliminates the Kronecker product operation in data matrix construction. Furthermore, the proposed algorithm learns an initial stabilizing policy directly from system data. It thereby removes the prerequisites of conventional algorithms, which require a full-rank condition, extensive data storage, and a predefined initial stabilizing control policy based on known system dynamics. Finally, a numerical example validates the effectiveness of the algorithm.
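For context, the conventional baseline the abstract contrasts against can be sketched as off-policy Q-learning policy iteration for a discrete-time LQR problem. Everything below is illustrative, not the article's method: the double-integrator model, cost weights, sample counts, and the initial stabilizing gain are all assumed for the sketch. Note that it relies on the very prerequisites the article's algorithm is designed to remove, namely a known stabilizing initial gain and a vectorized (Kronecker-style) quadratic feature regression.

```python
import numpy as np

# Hypothetical double-integrator agent (NOT the air-ground dynamics of the article).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
Qc, Rc = np.eye(2), np.eye(1)   # assumed stage-cost weights
n, m = 2, 1

def quad_features(z):
    """Features phi(z) with theta @ phi(z) == z @ H @ z for symmetric H:
    squared terms on the diagonal, doubled cross terms off it (the
    vectorization step the article's algorithm avoids)."""
    outer = np.outer(z, z) * (2.0 - np.eye(len(z)))
    return outer[np.triu_indices(len(z))]

def evaluate_policy(K, num_samples=400, seed=0):
    """Least-squares fit of the Q-matrix H of policy u = -K x, using data
    from an exploratory behavior policy (hence off-policy)."""
    rng = np.random.default_rng(seed)
    Phi, y, x = [], [], rng.standard_normal(n)
    for _ in range(num_samples):
        u = -K @ x + 0.5 * rng.standard_normal(m)        # behavior policy
        xn = A @ x + B @ u
        z, zn = np.concatenate([x, u]), np.concatenate([xn, -K @ xn])
        Phi.append(quad_features(z) - quad_features(zn))  # Bellman residual basis
        y.append(x @ Qc @ x + u @ Rc @ u)                 # observed stage cost
        x = xn
    theta, *_ = np.linalg.lstsq(np.asarray(Phi), np.asarray(y), rcond=None)
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta                     # rebuild symmetric H
    H += H.T
    H[np.diag_indices(n + m)] /= 2.0
    return H

# Policy iteration: a stabilizing initial gain is assumed known here --
# one of the prerequisites the article's algorithm eliminates.
K = np.array([[1.0, 1.0]])
for it in range(10):
    H = evaluate_policy(K, seed=it)
    K = np.linalg.solve(H[n:, n:], H[n:, :n])  # greedy improvement u = -K x

# Reference check: optimal gain from a Riccati fixed-point iteration.
P = np.eye(n)
for _ in range(500):
    G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
    P = Qc + A.T @ P @ (A - B @ G)
K_star = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
print("learned:", np.round(K, 3), "optimal:", np.round(K_star, 3))
```

Because the dynamics are deterministic, each collected transition satisfies the Bellman equation of the evaluated policy exactly, so the regression recovers H regardless of the exploration noise; this is what makes the scheme off-policy.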

Original language: English
Pages (from-to): 16622-16633
Number of pages: 12
Journal: IEEE Transactions on Aerospace and Electronic Systems
Volume: 61
Issue number: 6
Publication status: Published - 2025
Externally published: Yes

Keywords

  • Efficient reinforcement learning (RL)
  • autonomous air–ground vehicle systems (AA-GVSs)
  • formation control

