Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores

Shukai Liu, Chenming Wu*, Ying Li, Liangjun Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Interactive reinforcement learning has shown promise in learning complex robotic tasks. However, the process can be human-intensive because it requires a large amount of interactive feedback. This paper presents a new method that uses scores provided by humans, instead of pairwise preferences, to improve the feedback efficiency of interactive reinforcement learning. Our key insight is that scores can yield significantly more data than pairwise preferences. Specifically, we require a teacher to interactively score the full trajectories of an agent to train a behavioral policy in a sparse-reward environment. To prevent unstable human scores from degrading the training process, we propose an adaptive learning scheme that makes the learning paradigm insensitive to imperfect or unreliable scores. We extensively evaluate our method on robotic locomotion and manipulation tasks. The results show that the proposed method can efficiently learn near-optimal policies by adaptive learning from scores while requiring less feedback than pairwise preference learning methods. The source code is publicly available at https://github.com/SSKKai/Interactive-Scoring-IRL.
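One way to read the claim that scores yield more data than pairwise preferences: n scored trajectories imply up to n(n-1)/2 pairwise comparisons, whereas a single preference query yields one. The sketch below illustrates only that counting argument. The function name `pairs_from_scores`, the `margin` parameter, and the 1.0/0.5/0.0 label convention are assumptions made for illustration; the abstract does not specify how the authors use scores, and this is not their implementation.

```python
from itertools import combinations


def pairs_from_scores(scores, margin=0.0):
    """Derive pairwise preference labels from per-trajectory scores.

    Hypothetical helper for illustration only.
    Returns a list of (i, j, label) tuples, where label is
    1.0 if trajectory i scored higher, 0.0 if trajectory j scored higher,
    and 0.5 if the two scores differ by at most `margin`.
    """
    pairs = []
    for i, j in combinations(range(len(scores)), 2):
        diff = scores[i] - scores[j]
        if abs(diff) <= margin:
            label = 0.5  # treated as equally preferred
        elif diff > 0:
            label = 1.0
        else:
            label = 0.0
        pairs.append((i, j, label))
    return pairs


# Four scored trajectories (made-up scores) give 4 * 3 / 2 = 6 comparisons.
scores = [7.5, 3.0, 9.0, 3.0]
pairs = pairs_from_scores(scores)
print(len(pairs))
```

A larger `margin` treats near-equal scores as ties, which is one simple way to tolerate noisy scoring; the adaptive scheme described in the abstract may work quite differently.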

Original language: English
Title of host publication: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7561-7567
Number of pages: 7
ISBN (Electronic): 9781665491907
Publication status: Published - 2023
Event: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023 - Detroit, United States
Duration: 1 Oct 2023 to 5 Oct 2023

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Country/Territory: United States
City: Detroit
Period: 1/10/23 to 5/10/23
