Abstract
Reinforcement learning (RL) is a learning technique based on trial and error. Q-learning, an RL algorithm, has been widely applied to adaptive path planning for autonomous mobile robots. To reduce the learning space and speed up convergence, this paper adopts a Q-layered learning method that divides the task of searching for an optimal path into three basic behaviors (or subtasks): static obstacle avoidance, dynamic obstacle avoidance, and goal approaching. In learning the static obstacle-avoidance behavior in particular, a novel priority Q search method (PQA) is used to avoid the blind search of the random search algorithm (RA), which is commonly used to select actions in Q-learning. PQA uses the sum of weighted vectors pointing away from obstacles to predict the magnitude of the reinforcement reward that each candidate state-action pair would receive once the action is executed. The robot controller then selects an action based on this prediction at the next execution step. PQA and RA are both simulated in two different environments. The results show that, in the same environment, PQA needs fewer learning steps than RA to accomplish the task, and that over the whole learning period PQA achieves a higher task completion rate. PQA is therefore an effective approach to path planning in dynamic and unknown environments.
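
The abstract does not give the PQA weighting or scoring formulas, so the following is only a minimal Python sketch of the general idea of ranking actions by a sum of weighted vectors pointing away from obstacles. The eight-direction action set, the inverse-square distance weight, the dot-product score, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

# Hypothetical action set: 8 unit headings, 45 degrees apart (an assumption).
ACTIONS = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]


def repulsion_vector(robot, obstacles):
    """Sum of weighted unit vectors pointing from each obstacle to the robot.

    Assumption: weight = 1 / distance**2, so nearer obstacles dominate.
    """
    rx, ry = robot
    sx = sy = 0.0
    for ox, oy in obstacles:
        dx, dy = rx - ox, ry - oy
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # robot sits on the obstacle: direction undefined, skip it
        weight = 1.0 / dist ** 2
        sx += weight * dx / dist
        sy += weight * dy / dist
    return sx, sy


def predicted_priority(action, repulsion):
    """Assumed reward proxy: projection of the action onto the repulsion vector."""
    return action[0] * repulsion[0] + action[1] * repulsion[1]


def select_action(robot, obstacles):
    """Return the index of the action with the highest predicted priority."""
    repulsion = repulsion_vector(robot, obstacles)
    scores = [predicted_priority(a, repulsion) for a in ACTIONS]
    return max(range(len(ACTIONS)), key=scores.__getitem__)
```

Under these assumptions, a robot at the origin with a single obstacle at (1, 0) would pick the heading that points directly away from it (index 4, the negative x direction). In a Q-learning loop, such a score would replace the uniform random draw that RA uses when choosing which action to try next.
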
Original language | English |
---|---|
Pages | 983-987 |
Number of pages | 5 |
Publication status | Published - 2006 |
Externally published | Yes |
Event | 2006 IEEE International Conference on Information Acquisition, ICIA 2006, Weihai, Shandong, China. Duration: 20 Aug 2006 → 23 Aug 2006 |
Conference
Conference | 2006 IEEE International Conference on Information Acquisition, ICIA 2006 |
---|---|
Country/Territory | China |
City | Weihai, Shandong |
Period | 20/08/06 → 23/08/06 |
Keywords
- Adaptive path planning
- Mobile robot
- PQA
- Q-learning
- RA