Abstract
Based on eligibility-trace theory, a delayed fast reinforcement learning algorithm, DFSARSA(λ), is proposed. By redefining the eligibility trace and tracking the TD(λ) error, Q-value updates can be postponed until they are actually needed, instead of being performed at every step as in traditional SARSA(λ). The per-step update complexity is reduced from O(|S||A|) to O(|A|) compared with SARSA(λ), which greatly speeds up learning. Simulation results show that the method is valid.
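The abstract names the delayed-update idea but does not reproduce the paper's exact rules. The following is a minimal sketch of the general lazy-update trick for replacing-trace SARSA(λ): a running discounted sum of TD errors stands in for the O(|S||A|) per-step trace decay, and each state-action pair's pending update is settled only when that pair is touched again, giving O(|A|) work per step. The `Chain` environment, all parameter values, and all function names are illustrative assumptions, not the paper's DFSARSA(λ).

```python
import random

class Chain:
    """Toy 5-state corridor, invented here only to exercise the sketch:
    action 1 moves right, action 0 moves left; reaching state 4
    yields reward 1 and ends the episode."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = min(4, self.s + 1) if a == 1 else max(0, self.s - 1)
        done = self.s == 4
        return self.s, (1.0 if done else 0.0), done

def delayed_sarsa_lambda(env, n_states, n_actions, episodes=200,
                         alpha=0.1, gamma=0.99, lam=0.9, eps=0.1,
                         max_steps=500, seed=0):
    """Replacing-trace SARSA(lambda) with delayed (lazy) Q-value updates.

    A running sum S of TD errors, each weighted by (gamma*lam)**t, replaces
    the per-step decay of every eligibility trace.  Each (s, a) pair records
    the step and the value of S at its last visit; its pending update is
    applied only when the pair is touched again, so one step costs O(|A|)
    instead of O(|S||A|).  The Q-values read out are mathematically the same
    as those of ordinary replacing-trace SARSA(lambda).
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    pending = {}               # (s, a) -> (step of last visit, S at that step)
    decay = gamma * lam

    def flush(s, a, S):
        """Apply the updates (s, a) has accumulated since its last visit."""
        if (s, a) in pending:
            k, S_k = pending.pop((s, a))
            # sum_{j >= k} delta_j * decay**(j-k), trace reset to 1 at step k
            Q[s][a] += alpha * (S - S_k) / decay ** k

    def choose(s, S):
        """Epsilon-greedy action; settles this state's pending updates."""
        if rng.random() < eps:
            a = rng.randrange(n_actions)
            flush(s, a, S)
            return a
        for a in range(n_actions):         # the O(|A|) per-step work
            flush(s, a, S)
        best = max(Q[s])
        return rng.choice([a for a in range(n_actions) if Q[s][a] == best])

    for _ in range(episodes):
        t, S = 0, 0.0                      # per-episode reset keeps decay**t well scaled
        s = env.reset()
        a = choose(s, S)
        for _ in range(max_steps):
            s2, r, done = env.step(a)
            a2 = choose(s2, S)
            flush(s, a, S)                 # make Q[s][a] current before the TD error
            delta = r + (0.0 if done else gamma * Q[s2][a2]) - Q[s][a]
            pending[(s, a)] = (t, S)       # replacing trace: reset to 1 at step t
            S += delta * decay ** t
            t += 1
            s, a = s2, a2
            if done:
                break
        for ps, pa in list(pending):       # settle everything left at episode end
            flush(ps, pa, S)
    return Q
```

Usage: `Q = delayed_sarsa_lambda(Chain(), n_states=5, n_actions=2)` learns to prefer the right action in every non-terminal state. The per-episode reset of `t` and `S` keeps `decay ** t` numerically well behaved; a continuing-task variant would need periodic rescaling instead.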
| Original language | English |
|---|---|
| Pages (from-to) | 328-331 |
| Number of pages | 4 |
| Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
| Volume | 25 |
| Issue number | 4 |
| Publication status | Published - Apr 2005 |
Keywords
- DFSARSA(λ) algorithm
- Eligibility trace
- Reinforcement learning
- SARSA(λ) algorithm