Safe fixed-time reinforcement learning for nonlinear zero-sum games with obstacle avoidance awareness

  • Ping Wang
  • Chengpu Yu
  • Maolong Lv
  • Guang Ren Duan*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a scheme for the fixed-time (FxT) safe optimal obstacle avoidance control problem of nonlinear systems subject to external disturbances and multiple obstacles. To mitigate the destabilizing effects of the disturbances, a zero-sum differential game is first formulated, in which the safety controller seeks to minimize the performance index while the disturbance seeks to maximize it. A barrier function (BF) associated with the obstacles is then integrated into the cost function to ensure the system's safety, and a damping constant is incorporated to balance safety against optimality. By establishing the forward invariance of the safe set and demonstrating the FxT stability of the closed-loop system, a sufficient condition that characterizes the FxT safe Nash equilibrium point is provided for the first time: the Lyapunov function satisfying the FxT convergence differential inequality is also the solution to the steady-state Hamilton–Jacobi–Isaacs (HJI) equation, which guarantees optimality. A critic-only reinforcement learning (RL) strategy is then developed and rigorously verified for learning the safe Nash policy within a fixed time. Moreover, the paper proves the FxT stability of the closed-loop system when it operates under the approximate optimal Nash strategy. Finally, two simulation scenarios are presented to substantiate the validity of the proposed control framework.
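
The abstract does not state the FxT convergence differential inequality it relies on. As a rough illustration only, the sketch below assumes the commonly used form dV/dt <= -a*V^p - b*V^q with a, b > 0 and 0 < p < 1 < q, for which the settling time is bounded by 1/(a(1-p)) + 1/(b(q-1)) regardless of the initial value. The function names, the parameter values, and the Euler integration are hypothetical choices made for this example and are not taken from the paper.

```python
def fxt_settling_bound(a, b, p, q):
    """Settling-time bound for dV/dt <= -a*V**p - b*V**q (assumed form).

    Requires a, b > 0 and 0 < p < 1 < q. The bound does not depend on V(0).
    """
    if not (a > 0 and b > 0 and 0 < p < 1 < q):
        raise ValueError("need a, b > 0 and 0 < p < 1 < q")
    return 1.0 / (a * (1.0 - p)) + 1.0 / (b * (q - 1.0))


def simulate_settling_time(v0, a, b, p, q, dt=1e-4, tol=1e-6, t_max=50.0):
    """Integrate dV/dt = -a*V**p - b*V**q with forward Euler.

    Returns the first time at which V drops to tol or below
    (or t_max if that never happens).
    """
    v, t = float(v0), 0.0
    while v > tol and t < t_max:
        # Clamp at zero so the fractional power stays well defined.
        v = max(v - dt * (a * v**p + b * v**q), 0.0)
        t += dt
    return t


if __name__ == "__main__":
    a, b, p, q = 1.0, 1.0, 0.5, 1.5  # illustrative values only
    bound = fxt_settling_bound(a, b, p, q)
    for v0 in (1.0, 100.0, 1000.0):
        t_settle = simulate_settling_time(v0, a, b, p, q)
        print(f"V(0) = {v0:7.1f}  settles in about {t_settle:.3f} s  (bound {bound:.3f} s)")
```

The point of the sketch is that the simulated settling time grows with V(0) but stays below the same bound, which is what distinguishes fixed-time from finite-time convergence.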

Original language: English
Article number: 112673
Journal: Automatica
Volume: 183
DOIs
Publication status: Published - Jan 2026
Externally published: Yes

Keywords

  • Differential games
  • Fixed-time stability
  • Obstacle avoidance
  • Optimal control
  • Safe reinforcement learning
