Generating Emotional Coherence and Diverse Responses in a Multimodal Dialogue System

Yunfei Huang, Kan Li, Zhuo Chen, Lipeng Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Emotion perception and the diversity of generated responses are two key factors in multimodal dialogue generation, but they have not previously been considered together. Our model first extracts the features of each modality from the multimodal dialogue context and uses a heterogeneous graph neural network to represent the large graph composed of the dialogue history, voice, video, and the speaker's emotional state. We then use a conditional variational autoencoder (CVAE) to generate coherent and diverse responses. Extensive experiments on two multimodal datasets show that our model not only generates response emotions automatically but is also coherent and controllable, significantly outperforming previous advanced models.
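
The abstract does not give the training objective, so the following is only a minimal sketch of the standard CVAE loss (reconstruction term plus a KL term between a recognition posterior and a context-conditioned prior), not the authors' implementation. All function names, the diagonal-Gaussian assumption, and the `kl_weight` parameter are illustrative assumptions. In the paper's setting, the prior parameters would presumably come from the graph-encoded dialogue context.

```python
import math
import random


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) for two diagonal Gaussians given means and log-variances."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def sample_latent(mu, logvar, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def cvae_loss(recon_nll, mu_q, logvar_q, mu_p, logvar_p, kl_weight=1.0):
    """Negative ELBO: reconstruction NLL plus weighted KL(posterior || prior)."""
    return recon_nll + kl_weight * kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p)
```

At generation time a latent vector would be drawn from the prior with `sample_latent`. Different draws lead to different decoded responses, which is the usual source of diversity in CVAE-based generators.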

Original language: English
Title of host publication: Proceedings - 2021 2nd International Conference on Electronics, Communications and Information Technology, CECIT 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 625-630
Number of pages: 6
ISBN (Electronic): 9781665437578
Publication status: Published - 2021
Event: 2nd International Conference on Electronics, Communications and Information Technology, CECIT 2021 - Virtual, Sanya, China
Duration: 27 Dec 2021 - 29 Dec 2021

Publication series

Name: Proceedings - 2021 2nd International Conference on Electronics, Communications and Information Technology, CECIT 2021

Conference

Conference: 2nd International Conference on Electronics, Communications and Information Technology, CECIT 2021
Country/Territory: China
City: Virtual, Sanya
Period: 27/12/21 - 29/12/21

Keywords

  • CVAE
  • Dialogue system
  • Graph heterogeneous neural network
  • Emotional dialogue generation
