Imitation Learning Method of Multi-quality Expert Data Based on GAIL

Dengmin Xiao, Bo Wang, Zhongqi Sun, Xiao He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper studies imitation learning from multi-quality expert data based on Generative Adversarial Imitation Learning (GAIL). With GAIL, an agent can acquire a high-quality behavioral policy by imitating expert actions and learning the distribution of expert experience rather than a reward function. Because imitation learning from multi-quality experts can achieve the effect of data augmentation, a novel GAIL-based method named MT-GAIL is proposed. We first define a reliability coefficient for each set of expert data by calculating the accuracy of the corresponding discriminator. The reliability coefficients are then used as weights in the reward function, which is defined as the sum of the products of each weight and the output of the corresponding discriminator. The resulting sequences of rewards, states, and actions are finally fed into the experience pool to train the network of the policy builder. Experiments comparing against the GAIL method on single-expert and multi-quality expert trajectories show that the proposed MT-GAIL method is capable of avoiding the worst expert data. We also examine the effects of different reward calculation methods on multi-quality expert data, which illustrates the distinct advantage of the proposed discriminator-output weighting method.
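The weighted reward described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact mapping from discriminator accuracy to reliability coefficient, so the normalization below (weights that sum to 1) is an assumption, and the function names `reliability_coefficients` and `weighted_reward` are hypothetical.

```python
from typing import List, Sequence


def reliability_coefficients(accuracies: Sequence[float]) -> List[float]:
    """Turn per-discriminator accuracies into weights.

    Assumption: accuracies are normalized so the weights sum to 1.
    The abstract only states that the coefficient is derived from accuracy.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("the accuracies must have a positive sum")
    return [a / total for a in accuracies]


def weighted_reward(weights: Sequence[float], disc_outputs: Sequence[float]) -> float:
    """Reward for one (state, action) pair: sum of weight * discriminator output."""
    if len(weights) != len(disc_outputs):
        raise ValueError("need exactly one discriminator output per weight")
    return sum(w * d for w, d in zip(weights, disc_outputs))


# Illustrative values only: three expert datasets, one discriminator each.
weights = reliability_coefficients([0.9, 0.6, 0.5])
reward = weighted_reward(weights, [0.8, 0.4, 0.2])
```

With these made-up numbers the weights are 0.45, 0.30, and 0.25, and the reward is approximately 0.53. In a full training loop this reward would be stored in the experience pool together with the state and action.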

Original language: English
Title of host publication: Proceedings - 2023 China Automation Congress, CAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8642-8647
Number of pages: 6
ISBN (Electronic): 9798350303759
DOIs: https://doi.org/10.1109/CAC59555.2023.10451805
Publication status: Published - 2023
Event: 2023 China Automation Congress, CAC 2023 - Chongqing, China
Duration: 17 Nov 2023 → 19 Nov 2023

Publication series

Name: Proceedings - 2023 China Automation Congress, CAC 2023

Conference

Conference: 2023 China Automation Congress, CAC 2023
Country/Territory: China
City: Chongqing
Period: 17/11/23 → 19/11/23

Keywords

  • GAIL
  • Mujoco
  • imitation learning
  • multi-quality expert data

Cite this

Xiao, D., Wang, B., Sun, Z., & He, X. (2023). Imitation Learning Method of Multi-quality Expert Data Based on GAIL. In Proceedings - 2023 China Automation Congress, CAC 2023 (pp. 8642-8647). (Proceedings - 2023 China Automation Congress, CAC 2023). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CAC59555.2023.10451805