A review of multimodal emotion recognition from datasets, preprocessing, features, and fusion methods

Bei Pan, Kaoru Hirota, Zhiyang Jia*, Yaping Dai

*Corresponding author for this work

Research output: Contribution to journal › Short survey › peer-review

25 Citations (Scopus)

Abstract

Affective computing is one of the most important research fields in modern human–computer interaction (HCI). Its goal is to study and develop theories, methods, and systems that can recognize, explain, process, and simulate human emotions. As a branch of affective computing, emotion recognition aims to enable machines to analyze human emotions automatically, and it has received increasing attention from researchers in various fields. Human beings generally observe and understand a person's emotional state by integrating perceived information from that person's facial expressions, voice tone, speech content, behavior, or physiological features. To imitate the way humans observe emotion, researchers have devoted themselves to constructing multimodal emotion recognition models that fuse information from two or more modalities. In this paper, we provide a comprehensive review of multimodal emotion recognition in recent decades from the perspectives of multimodal datasets, data preprocessing, unimodal feature extraction, and multimodal information fusion methods. Furthermore, challenges and future research directions of the topic are identified and discussed. The main motivations of this review are to summarize the many recent works on multimodal emotion recognition and to help researchers in related fields understand the pipeline and mainstream approaches to multimodal emotion recognition.

Original language: English
Article number: 126866
Journal: Neurocomputing
Volume: 561
Publication status: Published - 7 Dec 2023

Keywords

  • Classifier
  • Emotion recognition
  • Feature learning
  • Multimodal information fusion
