TY - JOUR
T1 - Pontryagin's Minimum Principle-Guided RL for Minimum-Time Exploration of Spatiotemporal Fields
AU - Li, Zhuo
AU - Sun, Jian
AU - Marques, Antonio G.
AU - Wang, Gang
AU - You, Keyou
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2025
Y1 - 2025
N2 - This article studies the trajectory planning problem of an autonomous vehicle exploring a spatiotemporal field subject to a constraint on cumulative information. Since the resulting problem depends on the signal strength distribution of the field, which is unknown in practice, we advocate a model-free reinforcement learning (RL) method to find the solution. Given the vehicle's dynamical model, a critical (and open) question is how to judiciously merge model-based optimality conditions into the model-free RL framework for improved efficiency and generalization, for which this work provides some positive results. Specifically, we discretize the continuous action space by leveraging analytic optimality conditions for the minimum-time optimization problem obtained via Pontryagin's minimum principle (PMP). This allows us to develop a novel discrete PMP-based RL trajectory planning algorithm, which learns a planning policy faster than those based on a continuous action space. Simulation results 1) validate the effectiveness of the PMP-based RL algorithm and 2) demonstrate its advantages, in terms of both learning efficiency and the vehicle's exploration time, over two baseline methods with continuous control inputs.
AB - This article studies the trajectory planning problem of an autonomous vehicle exploring a spatiotemporal field subject to a constraint on cumulative information. Since the resulting problem depends on the signal strength distribution of the field, which is unknown in practice, we advocate a model-free reinforcement learning (RL) method to find the solution. Given the vehicle's dynamical model, a critical (and open) question is how to judiciously merge model-based optimality conditions into the model-free RL framework for improved efficiency and generalization, for which this work provides some positive results. Specifically, we discretize the continuous action space by leveraging analytic optimality conditions for the minimum-time optimization problem obtained via Pontryagin's minimum principle (PMP). This allows us to develop a novel discrete PMP-based RL trajectory planning algorithm, which learns a planning policy faster than those based on a continuous action space. Simulation results 1) validate the effectiveness of the PMP-based RL algorithm and 2) demonstrate its advantages, in terms of both learning efficiency and the vehicle's exploration time, over two baseline methods with continuous control inputs.
KW - Exploration of spatiotemporal field
KW - functional constraint
KW - minimum-time trajectory planning
KW - Pontryagin's minimum principle (PMP)
KW - reinforcement learning (RL)
UR - http://www.scopus.com/inward/record.url?scp=86000427143&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2024.3379654
DO - 10.1109/TNNLS.2024.3379654
M3 - Article
C2 - 38593018
AN - SCOPUS:86000427143
SN - 2162-237X
VL - 36
SP - 5375
EP - 5387
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 3
ER -