TY - CHAP
T1 - Two-Stage Fuzzy Fusion Based-Convolution Neural Network for Dynamic Emotion Recognition
AU - Chen, Luefeng
AU - Wu, Min
AU - Pedrycz, Witold
AU - Hirota, Kaoru
N1 - Publisher Copyright: © 2020, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - A two-stage fuzzy fusion based convolutional neural network is proposed for dynamic emotion recognition from facial expression and speech. It extracts discriminative emotion features that contain spatio-temporal information and effectively fuses the two modalities. The proposal can also handle situations where the contributions of the modalities to emotion recognition are highly imbalanced. Local binary patterns from three orthogonal planes and spectrograms are first used to extract low-level dynamic emotion features, so that the spatio-temporal information of both modalities is captured. To reveal more discriminative features, two deep convolutional neural networks are then constructed to extract high-level emotion semantic features. Finally, a two-stage fuzzy fusion strategy is developed by integrating canonical correlation analysis and a fuzzy broad learning system, which takes into account the correlation and difference between the modal features and handles the ambiguity of emotional state information.
AB - A two-stage fuzzy fusion based convolutional neural network is proposed for dynamic emotion recognition from facial expression and speech. It extracts discriminative emotion features that contain spatio-temporal information and effectively fuses the two modalities. The proposal can also handle situations where the contributions of the modalities to emotion recognition are highly imbalanced. Local binary patterns from three orthogonal planes and spectrograms are first used to extract low-level dynamic emotion features, so that the spatio-temporal information of both modalities is captured. To reveal more discriminative features, two deep convolutional neural networks are then constructed to extract high-level emotion semantic features. Finally, a two-stage fuzzy fusion strategy is developed by integrating canonical correlation analysis and a fuzzy broad learning system, which takes into account the correlation and difference between the modal features and handles the ambiguity of emotional state information.
UR - http://www.scopus.com/inward/record.url?scp=85096205574&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-61577-2_7
DO - 10.1007/978-3-030-61577-2_7
M3 - Chapter
AN - SCOPUS:85096205574
T3 - Studies in Computational Intelligence
SP - 91
EP - 114
BT - Studies in Computational Intelligence
PB - Springer Science and Business Media Deutschland GmbH
ER -