A reinforcement learning from human feedback based method for task allocation of human robot collaboration assembly considering human preference

Jingfei Wang, Yan Yan, Yaoguang Hu, Xiaonan Yang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Human-robot collaboration is considered an important enabling technology for human-centered manufacturing in Industry 5.0. Reasonable task allocation and sequencing in the human-robot collaboration process are necessary to fully utilize the strengths of workers and robots and to improve workers’ performance and experience. Although current studies of task allocation consider many human factors, the complexity of the decision-making process makes it difficult for workers to provide preferred choices and feedback that directly affect the decisions. As a result, the solution may not suit the individual worker. To address this problem, this study proposes a task allocation method based on reinforcement learning from human feedback. In this method, multi-agent reinforcement learning is applied to pre-train agent models that solve the task allocation and sequencing problem with multiple optimization objectives. An analytic hierarchy process-based method is used to analyze human action preferences and build a heuristic reward model. Furthermore, a preference training approach using knowledge distillation is proposed, in which agents are adjusted through preference rewards and pre-trained optimization experience to learn a decision-making policy that suits worker preferences. The effectiveness of the method is verified in comparative and ablation experiments.
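
As a rough illustration of how an analytic hierarchy process (AHP) can turn pairwise preference judgments into reward weights, the following minimal sketch derives weights with the row geometric-mean approximation and checks Saaty's consistency ratio. It is not the authors' implementation: the three criteria (fatigue, skill match, idle time), the comparison values, and the function names `ahp_weights` and `preference_reward` are hypothetical and chosen only for this example.

```python
import math

# Saaty's random consistency index for matrix sizes 1..5
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    # Row geometric means approximate the principal eigenvector
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    weights = [g / total for g in geo]
    # Estimate lambda_max as the mean of (A w)_i / w_i over all rows
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


def preference_reward(features, weights):
    """Weighted sum of per-criterion scores (each assumed to lie in [0, 1])."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical worker judgments: fatigue vs. skill match vs. idle time
comparisons = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights, cr = ahp_weights(comparisons)
# Score one candidate action using made-up criterion values
reward = preference_reward([0.8, 0.5, 0.2], weights)
```

A consistency ratio below 0.1 is conventionally taken to mean the judgments are coherent enough to use. In a setup like the one described above, such a weighted score could act as the heuristic preference reward, though the criteria and scoring used in the paper are not stated in this abstract.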

Original language: English
Article number: 103497
Journal: Advanced Engineering Informatics
Volume: 66
Publication status: Published - Jul 2025

Keywords

  • Human robot collaboration
  • Reinforcement learning
  • Reinforcement learning from human feedback
  • Task allocation and sequencing
