Abstract
This article studies the informative trajectory planning problem of an autonomous vehicle for field exploration. In contrast to existing works, which aim to maximize the amount of information gathered about spatial fields, this work considers efficient exploration of spatiotemporal fields with unknown distributions and seeks minimum-time trajectories of the vehicle subject to a cumulative information constraint. With the observability constant adopted as the information measure for expressing this constraint, the existence of a minimum-time trajectory is proven under mild conditions. Given the spatiotemporal nature of the field, the problem is modeled as a Markov decision process (MDP), for which a reinforcement learning (RL) algorithm is proposed to learn a continuous planning policy. To accelerate policy learning, we design a new reward function that leverages field approximations and is demonstrated to yield dense rewards. Simulations show that the learned policy can steer the vehicle to explore efficiently, and that it outperforms the commonly used coverage planning method in terms of the exploration time needed to gather sufficient cumulative information.
Original language | English |
---|---|
Pages (from-to) | 1-11 |
Number of pages | 11 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
DOI | |
Publication status | Accepted/In press - 2023 |