Dialogue emotion model based on local–global context encoder and commonsense knowledge fusion attention

Weilun Yu, Chengming Li, Xiping Hu, Wenhua Zhu, Erik Cambria*, Dazhi Jiang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Emotion Recognition in Conversation (ERC) is the task of predicting the emotion conveyed by each utterance in a dialogue. ERC research commonly integrates intra-utterance, local contextual, and global contextual information to obtain utterance vectors. However, complex semantic dependencies exist among these factors, and failing to model them accurately can degrade emotion recognition performance. Moreover, to strengthen the semantic dependencies within the context, researchers often introduce external commonsense knowledge after modeling it. Yet simply injecting commonsense knowledge into the model, without considering its potential impact, can introduce unexpected noise. To address these issues, we propose a dialogue emotion model based on a local–global context encoder and commonsense knowledge fusion attention. The local–global context encoder integrates intra-utterance, local-context, and global-context information to capture the semantic dependencies among them. To provide more accurate external commonsense information, we present a fusion module that filters the commonsense information through multi-head attention. Our method achieves competitive results on four datasets and shows advantages over mainstream models that use commonsense knowledge.
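The fusion module described in the abstract can be pictured as cross-attention in which utterance vectors query a set of commonsense vectors, so that irrelevant knowledge receives low attention weight. Below is a minimal sketch of that idea, assuming a PyTorch implementation; the class name, dimensions, and the residual/normalization details are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch (not the paper's implementation): filtering external
# commonsense features with multi-head attention before fusing them with
# context-encoded utterance vectors.
import torch
import torch.nn as nn

class CommonsenseFusionAttention(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Utterances attend over candidate commonsense vectors, so noisy
        # or irrelevant knowledge is down-weighted rather than injected raw.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, utterance: torch.Tensor, commonsense: torch.Tensor) -> torch.Tensor:
        # utterance:   (batch, n_utterances, d_model) context-encoded utterances
        # commonsense: (batch, n_facts, d_model) external knowledge vectors
        filtered, _ = self.attn(query=utterance, key=commonsense, value=commonsense)
        # Residual connection keeps the original utterance semantics intact.
        return self.norm(utterance + filtered)

# Example usage with toy tensors:
fusion = CommonsenseFusionAttention()
utts = torch.randn(2, 10, 256)   # 2 dialogues, 10 utterances each
facts = torch.randn(2, 5, 256)   # 5 commonsense vectors per dialogue
out = fusion(utts, facts)        # -> shape (2, 10, 256)
```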

Original language: English
Journal: International Journal of Machine Learning and Cybernetics
DOIs
Publication status: Accepted/In press - 2024

Keywords

  • Commonsense knowledge
  • Emotion recognition in conversation
  • Local–global encoder
  • Multi-head attention
