Abstract
A novel morphing strategy based on reinforcement learning (RL) is developed to solve the minimum-flight-time morphing decision-making problem for a long-range variable-sweep morphing aircraft. The proposed strategy addresses the sparse-reward, no-reference decision-making problem that arises from terminal performance objectives and long-range missions. A double-layer morphing-flight control framework is established to decouple the design of the morphing strategy from the flight controller while ensuring flight stability. Under this framework, an RL agent is designed to learn the minimum-flight-time morphing strategy. Specifically, the reward function is divided into primary-goal rewards and sub-goal rewards to deal with the sparse-reward, no-reference issue. A multi-stage progressive training scheme is developed to train the designed RL agent with a sequence of training environments that gradually converges to the actual world. This scheme accelerates the training process and promotes the convergence of the RL agent. Simulation results under nominal and dispersed conditions demonstrate the optimality and robustness of the proposed morphing strategy. Moreover, its generalization ability is validated in an untrained scenario.
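The abstract's division of the reward into a sparse primary-goal term (paid only at episode end, favoring shorter flight time) and dense sub-goal shaping terms can be sketched as follows. This is a minimal illustration of the general reward-decomposition idea, not the paper's actual formulation; all function names, terms, and weights here are hypothetical assumptions.

```python
def primary_goal_reward(done: bool, flight_time: float, t_max: float) -> float:
    """Sparse terminal reward paid only when the episode ends.
    Shorter flight time yields a larger reward (minimum-flight-time objective).
    The normalization by t_max is an illustrative assumption."""
    if not done:
        return 0.0
    return (t_max - flight_time) / t_max  # in [0, 1] when flight_time <= t_max


def sub_goal_reward(range_progress: float, prev_progress: float,
                    w_progress: float = 1.0) -> float:
    """Dense shaping reward for incremental downrange progress at each step,
    mitigating reward sparsity over a long-range mission. The choice of
    progress as the sub-goal signal is an assumption for illustration."""
    return w_progress * (range_progress - prev_progress)


def total_reward(done: bool, flight_time: float, t_max: float,
                 range_progress: float, prev_progress: float) -> float:
    """Combined reward: sparse primary-goal term plus dense sub-goal term."""
    return (primary_goal_reward(done, flight_time, t_max)
            + sub_goal_reward(range_progress, prev_progress))
```

For example, a mid-episode step that advances downrange progress from 0.1 to 0.2 earns only the dense shaping term, while a terminal step additionally earns the flight-time-dependent primary reward.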
Original language | English
---|---
Article number | 109087
Journal | Aerospace Science and Technology
Volume | 148
DOI |
Publication status | Published - May 2024