TY - JOUR
T1 - Tractor Semi-Trailer Off-Tracking and Stability Approximate Bi-Level Policy Optimization
AU - Zhang, Fawang
AU - Duan, Jingliang
AU - Liu, Hui
AU - Cao, Xingyu
AU - Nie, Shida
AU - Guo, Congshuai
AU - Xie, Yujia
AU - Ma, Jun
AU - Wang, Shangli
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Trajectory tracking control of tractor semi-trailer vehicles poses significant challenges due to inherent off-tracking behavior and roll instability risks. While existing approaches have demonstrated effectiveness, they often rely on computationally intensive numerical solvers and require time-consuming manual tuning of cost function weights. This paper presents an approximate bi-level policy optimization (ABPO) framework that simultaneously optimizes the cost function and synthesizes an explicit control policy to minimize off-tracking while reducing computational complexity. The proposed framework employs a hierarchical structure: the upper level updates cost weights based on the trailer's stability trajectory, while the lower level derives an approximate optimal policy by solving the tractor's control problem. By leveraging Pontryagin's Maximum Principle (PMP), we develop a novel method that analytically computes cost weight gradients by differentiating the PMP conditions. This enables the formulation of a related optimal control problem (OCP) whose solutions directly yield gradients for cost parameter updates. The ABPO framework achieves automatic weight coefficient adjustment, enhances trajectory tracking accuracy for both tractor and trailer units, and significantly reduces computational burden. Simulation and experimental validation across four classical scenarios demonstrates that the learned policy reduces rearward amplification by 17.82%, lateral tracking errors by 84.15%, and rollover by 64.19%. Notably, the control policy computation requires less than 10 ms, making it suitable for real-time applications. The source code for the algorithms described in this paper is publicly available at https://github.com/TroyResearch/ABPO.git
Note to Practitioners - This paper addresses a challenge in the autonomous driving industry: efficient and safe trajectory tracking control for tractor semi-trailer vehicles. Current industrial solutions typically require significant computational resources and time-consuming manual tuning of control parameters, limiting their practical implementation. Our proposed ABPO framework offers a practical solution by automating the parameter-tuning process and reducing computational complexity while maintaining safety standards. The framework can be readily integrated into existing complex control systems with minimal hardware requirements, making it suitable for real-time applications. However, when applying this framework to other control problems, the specific formulation of the upper- and lower-level problems needs to be carefully redesigned based on the particular problem characteristics to ensure the effectiveness of the control architecture. Future research will focus on extending this methodology to a broader range of control applications, such as multi-robot control systems, autonomous fleet navigation, and imitation learning scenarios.
AB - Trajectory tracking control of tractor semi-trailer vehicles poses significant challenges due to inherent off-tracking behavior and roll instability risks. While existing approaches have demonstrated effectiveness, they often rely on computationally intensive numerical solvers and require time-consuming manual tuning of cost function weights. This paper presents an approximate bi-level policy optimization (ABPO) framework that simultaneously optimizes the cost function and synthesizes an explicit control policy to minimize off-tracking while reducing computational complexity. The proposed framework employs a hierarchical structure: the upper level updates cost weights based on the trailer's stability trajectory, while the lower level derives an approximate optimal policy by solving the tractor's control problem. By leveraging Pontryagin's Maximum Principle (PMP), we develop a novel method that analytically computes cost weight gradients by differentiating the PMP conditions. This enables the formulation of a related optimal control problem (OCP) whose solutions directly yield gradients for cost parameter updates. The ABPO framework achieves automatic weight coefficient adjustment, enhances trajectory tracking accuracy for both tractor and trailer units, and significantly reduces computational burden. Simulation and experimental validation across four classical scenarios demonstrates that the learned policy reduces rearward amplification by 17.82%, lateral tracking errors by 84.15%, and rollover by 64.19%. Notably, the control policy computation requires less than 10 ms, making it suitable for real-time applications. The source code for the algorithms described in this paper is publicly available at https://github.com/TroyResearch/ABPO.git
Note to Practitioners - This paper addresses a challenge in the autonomous driving industry: efficient and safe trajectory tracking control for tractor semi-trailer vehicles. Current industrial solutions typically require significant computational resources and time-consuming manual tuning of control parameters, limiting their practical implementation. Our proposed ABPO framework offers a practical solution by automating the parameter-tuning process and reducing computational complexity while maintaining safety standards. The framework can be readily integrated into existing complex control systems with minimal hardware requirements, making it suitable for real-time applications. However, when applying this framework to other control problems, the specific formulation of the upper- and lower-level problems needs to be carefully redesigned based on the particular problem characteristics to ensure the effectiveness of the control architecture. Future research will focus on extending this methodology to a broader range of control applications, such as multi-robot control systems, autonomous fleet navigation, and imitation learning scenarios.
KW - Autonomous vehicle
KW - cost function learning
KW - policy approximation
KW - trajectory tracking
UR - https://www.scopus.com/pages/publications/105013865556
U2 - 10.1109/TASE.2025.3600553
DO - 10.1109/TASE.2025.3600553
M3 - Article
AN - SCOPUS:105013865556
SN - 1545-5955
VL - 22
SP - 20055
EP - 20067
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
ER -