A LLM-Based Robot Partner with Multi-modal Emotion Recognition

Yutong Jiang, Shuai Shao*, Yaping Dai, Kaoru Hirota

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The integration of Large Language Models (LLMs) with robotic systems has opened new avenues for the development of empathetic and interactive robot partners. This paper introduces a service robot system that incorporates multi-modal emotion recognition and LLM-based emotion dialogue generation. The system captures user emotions through a tri-modal emotion recognition model (TriMER), which processes audio, text, and facial expressions using BiLSTM, CNN, and Deformable Convolutional Network (DCN) architectures. Experiments conducted on the IEMOCAP dataset show that our TriMER model achieves an emotion recognition accuracy of 74.15%. By combining emotion recognition with an LLM, the robot can better understand and respond to human emotions, enabling more natural and empathetic interactions. This development holds promise for applications in elder care, aiming to enhance both physical and mental well-being.
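The abstract describes combining audio, text, and facial-expression channels into a single emotion prediction but does not include code. A minimal sketch of one common way to do this, decision-level (late) fusion of per-modality classifier scores, is shown below; the emotion categories, logit values, and `fuse` function are illustrative assumptions, not the paper's actual TriMER implementation.

```python
import math

EMOTIONS = ["angry", "happy", "neutral", "sad"]  # IEMOCAP-style categories

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(audio_logits, text_logits, face_logits, weights=(1/3, 1/3, 1/3)):
    """Weighted late fusion: average the per-modality softmax probabilities
    and pick the emotion with the highest fused score."""
    probs = [softmax(audio_logits), softmax(text_logits), softmax(face_logits)]
    fused = [sum(w * p[i] for w, p in zip(weights, probs))
             for i in range(len(EMOTIONS))]
    return EMOTIONS[fused.index(max(fused))], fused

# Hypothetical per-modality scores for one utterance
label, fused = fuse(
    audio_logits=[0.2, 2.1, 0.5, 0.1],  # e.g. from a BiLSTM over audio features
    text_logits=[0.1, 1.8, 1.0, 0.3],   # e.g. from a CNN over text embeddings
    face_logits=[0.4, 2.5, 0.6, 0.2],   # e.g. from a DCN over facial frames
)
print(label)  # the modality-averaged winner, here "happy"
```

In a real system the per-modality weights could be learned rather than fixed, letting the model rely more on whichever channel is most reliable for a given user or environment.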

Original language: English
Title of host publication: Intelligent Robotics and Applications - 17th International Conference, ICIRA 2024, Proceedings
Editors: Xuguang Lan, Xuesong Mei, Caigui Jiang, Fei Zhao, Zhiqiang Tian
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 71-83
Number of pages: 13
ISBN (Print): 9789819607853
DOIs
Publication status: Published - 2025
Event: 17th International Conference on Intelligent Robotics and Applications, ICIRA 2024 - Xi'an, China
Duration: 31 Jul 2024 - 2 Aug 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15210 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Intelligent Robotics and Applications, ICIRA 2024
Country/Territory: China
City: Xi'an
Period: 31/07/24 - 2/08/24

Keywords

  • Human-Robot Interaction
  • Large Language Models
  • Multi-modal Emotion Recognition
  • Robot Partners
