TY - GEN
T1 - HILPS
T2 - 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020
AU - Wen, Mingxing
AU - Yue, Yufeng
AU - Wu, Zhenyu
AU - Mihankhan, Ehsan
AU - Wang, Danwei
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/12/13
Y1 - 2020/12/13
N2 - Reinforcement learning has attracted increasing attention in mapless navigation for mobile robots in recent years. However, obvious challenges remain, including low sample efficiency and safety risks arising from the exploration-exploitation dilemma. This paper addresses these problems by proposing the Human-in-Loop Policy Search (HILPS) framework, which integrates learning from demonstration, learning from human intervention, and a Near Optimal Policy strategy. The former two ensure that expert experience grants the mobile robot more informative and correct decisions for accomplishing the task, while also maintaining its safety through the priority of human control. The Near Optimal Policy (NOP) then selectively stores experience that is similar to the preexisting human demonstrations, improving sample efficiency by eliminating purely exploratory behaviors. To verify the performance of the algorithm, mobile robot navigation experiments are conducted extensively in simulation and in the real world. Results show that HILPS improves sample efficiency and safety in comparison to state-of-the-art reinforcement learning.
AB - Reinforcement learning has attracted increasing attention in mapless navigation for mobile robots in recent years. However, obvious challenges remain, including low sample efficiency and safety risks arising from the exploration-exploitation dilemma. This paper addresses these problems by proposing the Human-in-Loop Policy Search (HILPS) framework, which integrates learning from demonstration, learning from human intervention, and a Near Optimal Policy strategy. The former two ensure that expert experience grants the mobile robot more informative and correct decisions for accomplishing the task, while also maintaining its safety through the priority of human control. The Near Optimal Policy (NOP) then selectively stores experience that is similar to the preexisting human demonstrations, improving sample efficiency by eliminating purely exploratory behaviors. To verify the performance of the algorithm, mobile robot navigation experiments are conducted extensively in simulation and in the real world. Results show that HILPS improves sample efficiency and safety in comparison to state-of-the-art reinforcement learning.
UR - http://www.scopus.com/inward/record.url?scp=85100109886&partnerID=8YFLogxK
U2 - 10.1109/ICARCV50220.2020.9305366
DO - 10.1109/ICARCV50220.2020.9305366
M3 - Conference contribution
AN - SCOPUS:85100109886
T3 - 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020
SP - 387
EP - 392
BT - 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 December 2020 through 15 December 2020
ER -