Trajectory Tracking Control for Under-Actuated Hovercraft Using Differential Flatness and Reinforcement Learning-Based Active Disturbance Rejection Control

Xiangyu Kong, Yuanqing Xia*, Rui Hu, Min Lin, Zhongqi Sun, Li Dai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

This paper proposes a trajectory tracking control scheme for hovercraft. Because the hovercraft model is under-actuated, nonlinear, and strongly coupled, controller design is challenging. To address this, the control scheme is divided into two parts. First, the differential flatness method is used to find a set of flat outputs, and part of the nonlinear terms are treated as uncertainties, which converts the under-actuated system into a fully actuated one. Second, a reinforcement learning-based active disturbance rejection controller (RL-ADRC) is designed. In this method, an extended state observer (ESO) estimates the uncertainties of the system, and an actor-critic-based reinforcement learning (RL) algorithm approximates the optimal control strategy. Based on the output of the ESO, the RL-ADRC compensates for the total uncertainties in real time while simultaneously generating the optimal control strategy through the RL algorithm. Simulation results show that, compared with the traditional ADRC method, RL-ADRC does not require manual tuning of the controller parameters and yields a more robust control strategy.
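
The role of the ESO in the abstract can be illustrated with a minimal sketch of conventional linear ADRC. This is not the paper's RL-ADRC and not the hovercraft model: it assumes a generic double-integrator plant (the kind of fully actuated channel a flatness transformation might produce), a hypothetical lumped disturbance, and hand-picked bandwidths omega_o and omega_c. In the paper, the manually tuned feedback law would be replaced by the actor-critic policy. All names below are illustrative.

```python
import math


def simulate_ladrc(t_end=5.0, dt=1e-3, omega_o=50.0, omega_c=5.0, b0=1.0, ref=1.0):
    """Track a constant reference on x'' = f(t) + b0*u, where f is unknown to the controller.

    Returns the final position, the final ESO disturbance estimate, and the
    true disturbance at the last step.
    """
    # Observer gains from bandwidth parameterisation (all observer poles at -omega_o).
    beta1, beta2, beta3 = 3 * omega_o, 3 * omega_o**2, omega_o**3
    # Hand-tuned PD gains (closed-loop poles at -omega_c); an assumption of this sketch.
    kp, kd = omega_c**2, 2 * omega_c

    x1 = x2 = 0.0        # true plant state: position, velocity
    z1 = z2 = z3 = 0.0   # ESO estimates: position, velocity, total disturbance
    f = 0.0

    for k in range(round(t_end / dt)):
        t = k * dt
        f = 2.0 + 0.5 * math.sin(t)  # hypothetical lumped uncertainty

        # Control law: PD on the estimated state, minus the estimated disturbance.
        u = (kp * (ref - z1) - kd * z2 - z3) / b0

        # ESO and plant derivatives, all evaluated from the current values.
        e = x1 - z1
        dz1 = z2 + beta1 * e
        dz2 = z3 + beta2 * e + b0 * u
        dz3 = beta3 * e
        dx1 = x2
        dx2 = f + b0 * u

        # Forward-Euler integration step.
        z1 += dt * dz1
        z2 += dt * dz2
        z3 += dt * dz3
        x1 += dt * dx1
        x2 += dt * dx2

    return x1, z3, f


if __name__ == "__main__":
    position, f_hat, f_true = simulate_ladrc()
    print(f"position={position:.4f}  f_hat={f_hat:.4f}  f_true={f_true:.4f}")
```

The third observer state z3 plays the part of the "total uncertainties" estimate: subtracting it in the control law cancels the disturbance, so the remaining loop behaves approximately like a plain double integrator under feedback.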

Original language: English
Pages (from-to): 502-521
Number of pages: 20
Journal: Journal of Systems Science and Complexity
Volume: 35
Issue number: 2
Publication status: Published - Apr 2022

Keywords

  • Active disturbance rejection control
  • differential flatness
  • reinforcement learning
  • trajectory tracking control
  • under-actuated system
