TY - GEN
T1 - A LLM-Based Robot Partner with Multi-modal Emotion Recognition
AU - Jiang, Yutong
AU - Shao, Shuai
AU - Dai, Yaping
AU - Hirota, Kaoru
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - The integration of Large Language Models (LLMs) with robotic systems has opened new avenues for the development of empathetic and interactive robot partners. This paper introduces a service robot system that incorporates multi-modal emotion recognition and LLM-based emotion dialogue generation. The system captures user emotions through a tri-modal emotion recognition model (TriMER), which processes audio, text, and facial expressions using advanced techniques like BiLSTM, CNN, and Deformable Convolutional Networks (DCN). Experiments conducted using the IEMOCAP dataset show that our TriMER model achieves an accuracy of 74.15% in recognizing emotions. By combining emotion recognition with LLM, the robot can better understand and respond to human emotions, facilitating more natural and empathetic interactions. This development holds promise for applications in elder care, aiming to enhance both physical and mental well-being.
KW - Human-Robot Interaction
KW - Large Language Models
KW - Multi-modal Emotion Recognition
KW - Robot Partners
UR - http://www.scopus.com/inward/record.url?scp=85218498262&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-0786-0_6
DO - 10.1007/978-981-96-0786-0_6
M3 - Conference contribution
AN - SCOPUS:85218498262
SN - 9789819607853
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 71
EP - 83
BT - Intelligent Robotics and Applications - 17th International Conference, ICIRA 2024, Proceedings
A2 - Lan, Xuguang
A2 - Mei, Xuesong
A2 - Jiang, Caigui
A2 - Zhao, Fei
A2 - Tian, Zhiqiang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th International Conference on Intelligent Robotics and Applications, ICIRA 2024
Y2 - 31 July 2024 through 2 August 2024
ER -