TY - GEN
T1 - Multimodal Information-Based Broad and Deep Learning Model for Emotion Understanding
AU - Li, Min
AU - Chen, Luefeng
AU - Wu, Min
AU - Pedrycz, Witold
AU - Hirota, Kaoru
N1 - Publisher Copyright:
© 2021 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2021/7/26
Y1 - 2021/7/26
N2 - A multimodal information-based broad and deep learning model (MIBDL) for emotion understanding is proposed, in which facial expression and body gesture are used to achieve emotional state recognition for emotion understanding. It aims to understand coexisting multimodal information in human-robot interaction by using the different processing methods of the deep network and the broad network, which obtain features along the depth and width dimensions. Moreover, random mapping in the initial broad learning network can cause information loss, and its shallow network structure makes it difficult to cope with complex tasks. To address this problem, we use principal component analysis to generate the nodes of the broad learning network, and a stacked broad learning network is adopted to make it easier for existing broad learning networks to cope with complex tasks by creating deep variants of the existing network. To verify the effectiveness of the proposal, experiments are conducted on a benchmark database of spontaneous emotion expressions, and the experimental results show that the proposal outperforms state-of-the-art methods. According to the simulation experiments on the FABO database, the multimodal recognition rate of the proposed method is 17.54%, 1.24%, and 0.23% higher than that of the temporal normalized motion and appearance features (TN), the multi-channel CNN (MCCNN), and the hierarchical classification fusion strategy (HCFS), respectively.
AB - A multimodal information-based broad and deep learning model (MIBDL) for emotion understanding is proposed, in which facial expression and body gesture are used to achieve emotional state recognition for emotion understanding. It aims to understand coexisting multimodal information in human-robot interaction by using the different processing methods of the deep network and the broad network, which obtain features along the depth and width dimensions. Moreover, random mapping in the initial broad learning network can cause information loss, and its shallow network structure makes it difficult to cope with complex tasks. To address this problem, we use principal component analysis to generate the nodes of the broad learning network, and a stacked broad learning network is adopted to make it easier for existing broad learning networks to cope with complex tasks by creating deep variants of the existing network. To verify the effectiveness of the proposal, experiments are conducted on a benchmark database of spontaneous emotion expressions, and the experimental results show that the proposal outperforms state-of-the-art methods. According to the simulation experiments on the FABO database, the multimodal recognition rate of the proposed method is 17.54%, 1.24%, and 0.23% higher than that of the temporal normalized motion and appearance features (TN), the multi-channel CNN (MCCNN), and the hierarchical classification fusion strategy (HCFS), respectively.
KW - Convolutional neural network
KW - body gesture
KW - broad learning
KW - emotion understanding
KW - facial expression
UR - http://www.scopus.com/inward/record.url?scp=85117348568&partnerID=8YFLogxK
U2 - 10.23919/CCC52363.2021.9549897
DO - 10.23919/CCC52363.2021.9549897
M3 - Conference contribution
AN - SCOPUS:85117348568
T3 - Chinese Control Conference, CCC
SP - 7410
EP - 7414
BT - Proceedings of the 40th Chinese Control Conference, CCC 2021
A2 - Peng, Chen
A2 - Sun, Jian
PB - IEEE Computer Society
T2 - 40th Chinese Control Conference, CCC 2021
Y2 - 26 July 2021 through 28 July 2021
ER -