Two-Stage Fuzzy Fusion Based-Convolution Neural Network for Dynamic Emotion Recognition

Min Wu, Wanjuan Su, Luefeng Chen*, Witold Pedrycz, Kaoru Hirota

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)

Abstract

A two-stage fuzzy fusion based-convolution neural network is proposed for dynamic emotion recognition using both facial expression and speech modalities. It extracts discriminative emotion features that contain spatio-temporal information and effectively fuses the facial expression and speech modalities. Moreover, the proposal can handle situations where the contributions of the individual modalities to emotion recognition are highly imbalanced. Local binary patterns from three orthogonal planes and the spectrogram are first used to extract low-level dynamic emotion features, so that the spatio-temporal information of these modalities can be obtained. To reveal more discriminative features, two deep convolution neural networks are constructed to extract high-level emotion semantic features. The two-stage fuzzy fusion strategy is then developed by integrating canonical correlation analysis and a fuzzy broad learning system, so as to take into account the correlation and difference between features of different modalities and to handle the ambiguity of emotional state information. Experimental results on the benchmark databases SAVEE, eNTERFACE'05, and AFEW show that the accuracy of the proposed method is higher than that of existing methods (such as the hybrid deep model and the rule-based and machine learning method).
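The canonical correlation analysis step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cca_fuse`, the ridge term `reg`, the choice to concatenate the two projected views, and the synthetic stand-in features are all assumptions made for illustration, and the fuzzy broad learning system stage is not covered.

```python
import numpy as np


def cca_fuse(X, Y, k, reg=1e-3):
    """Project two feature views onto their top-k canonical directions.

    X: (n, p) features of one modality, Y: (n, q) features of the other.
    Returns the concatenated projections (n, 2k) and the k canonical
    correlations. `reg` is a small ridge term (an assumption) that keeps
    the covariance matrices invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularized within-view covariances and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten with Cholesky factors: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)

    # Map singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])

    # Fuse by concatenating the two projected views (one common choice).
    fused = np.hstack([Xc @ Wx, Yc @ Wy])
    return fused, s[:k]


# Synthetic stand-ins for the two modalities: both views share a
# 2-dimensional latent signal plus independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
face_feats = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
speech_feats = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

fused, corrs = cca_fuse(face_feats, speech_feats, k=2)
```

In this sketch the fused matrix would be passed on to a classifier; in the paper that role is played by the second fusion stage, whose details are not given in the abstract.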

Original language: English
Pages (from-to): 805-817
Number of pages: 13
Journal: IEEE Transactions on Affective Computing
Volume: 13
Issue number: 2
Publication status: Published - 2022
Externally published: Yes

Keywords

  • Emotion recognition
  • canonical correlation analysis
  • deep learning
  • fuzzy broad learning system
