Abstract
Reinforcement learning (RL) and adaptive/approximate dynamic programming (ADP) algorithms have recently received much attention from various scientific fields (e.g., artificial intelligence, systems and control, and applied mathematics), partly due to their successful application to a series of challenging problems, such as the sequential decision and optimal coordination control problems of large-scale multi-agent systems. In this paper, some preliminaries on RL and ADP algorithms are first introduced, and the developments of these two closely related classes of algorithms in different research fields are then reviewed, with emphasis on the progression from solving the sequential decision (optimal control) problem for a single agent (control plant) to solving the sequential decision (optimal coordination control) problem for multi-agent systems. Furthermore, after briefly surveying the structural evolution of ADP algorithms over the past decades and their recent development from a model-based offline programming framework to a model-free online learning framework, the research progress of ADP algorithms in solving the optimal coordination control problem of multi-agent systems is reviewed. Finally, several interesting yet challenging open issues concerning multi-agent reinforcement learning (MARL) algorithms and the use of ADP algorithms to solve the optimal coordination control problem of multi-agent systems are outlined.
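To make the Markov-decision-process setting underlying the sequential decision problems mentioned above concrete, the following minimal sketch shows classical dynamic-programming value iteration in Python. It is purely illustrative and not part of the published record; the toy 2-state, 2-action MDP, the discount factor, and all variable names are assumptions introduced here.

```python
import numpy as np

# Toy MDP (hypothetical, for illustration only): 2 states, 2 actions.
n_states, n_actions = 2, 2
gamma = 0.9  # discount factor (assumed)

# P[a, s, s']: probability of moving to s' when taking action a in state s.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],    # transitions under action 0
              [[0.5, 0.5], [0.3, 0.7]]])   # transitions under action 1
# R[s, a]: immediate reward for taking action a in state s.
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) V(s')
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V_new = Q.max(axis=1)
    delta = np.max(np.abs(V_new - V))
    V = V_new
    if delta < 1e-8:  # stop once the value function has converged
        break

policy = Q.argmax(axis=1)  # greedy policy with respect to the converged values
print("optimal state values:", V)
print("greedy policy:", policy)
```

The same Bellman backup is what ADP methods approximate when the transition model P is unknown or the state space is too large for exact tabulation, which is the setting the surveyed model-free online learning frameworks address.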
| Translated title of the contribution | Reinforcement learning and adaptive/approximate dynamic programming: A survey from theory to applications in multi-agent systems |
| --- | --- |
| Original language | Traditional Chinese |
| Pages (from-to) | 1200-1230 |
| Number of pages | 31 |
| Journal | Kongzhi yu Juece/Control and Decision |
| Volume | 38 |
| Issue | 5 |
| DOI | |
| Publication status | Published - May 2023 |
Keywords
- Markov decision process
- adaptive/approximate dynamic programming
- multi-agent system
- optimal coordination control
- reinforcement learning
- sequential decision