Video based emotion recognition using CNN and BRNN

Youyi Cai; Wenming Zheng; Tong Zhang; Qiang Li; Zhen Cui; Jiayin Ye

doi:10.1007/978-981-10-3005-5_56

Youyi Cai, Wenming Zheng^*, Tong Zhang, Qiang Li, Zhen Cui, Jiayin Ye

^*Corresponding author for this work

Southeast University, Nanjing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Citations (Scopus)

Abstract

Video-based emotion recognition is a challenging computer vision task. It requires modeling not only the spatial information in each image frame but also the temporal contextual correlations among sequential frames. For this purpose, we propose a hierarchical deep network architecture to extract high-level spatial-temporal features. In this architecture, two classic deep neural networks, convolutional neural networks (CNN) and bi-directional recurrent neural networks (BRNN), are employed to capture facial textural characteristics in the spatial domain and dynamic emotion changes in the temporal domain, respectively. We coordinate the two networks by optimizing each of them, so as to boost emotion recognition performance. In the challenging competition, our method achieves promising performance compared with the baselines.
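The abstract gives no layer sizes, training procedure, or label set, so the following is only a minimal sketch of the general idea it describes: a per-frame convolutional feature extractor feeding a bi-directional recurrent network whose forward and backward states are combined before classification. Everything concrete here is an assumption made for illustration, not the authors' implementation: the function names, the 3x3 kernels, the hidden size, the seven output classes, the time-averaged pooling, and the random untrained weights.

```python
import math
import random


def conv_feature(frame, kernel):
    """Valid 2-D convolution, ReLU, then global average pooling -> one scalar."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(frame) - kh + 1, len(frame[0]) - kw + 1
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            total += max(s, 0.0)  # ReLU
    return total / (rows * cols)


def cnn_features(frame, kernels):
    """Toy stand-in for the CNN: one pooled spatial feature per kernel."""
    return [conv_feature(frame, k) for k in kernels]


def rnn_pass(xs, w_in, w_rec):
    """Simple tanh recurrence; returns the hidden state at every time step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in xs:
        hidden = [
            math.tanh(sum(w * v for w, v in zip(w_in[u], x))
                      + sum(w * h for w, h in zip(w_rec[u], hidden)))
            for u in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def brnn_classify(frames, params):
    """Per-frame features -> forward and backward RNN -> softmax over classes."""
    xs = [cnn_features(f, params["kernels"]) for f in frames]
    fwd = rnn_pass(xs, params["w_in_f"], params["w_rec_f"])
    # The backward pass reads the sequence in reverse, then is re-aligned in time.
    bwd = rnn_pass(xs[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    # Concatenate both directions at each step, then average over time (assumed pooling).
    steps = [f + b for f, b in zip(fwd, bwd)]
    pooled = [sum(col) / len(steps) for col in zip(*steps)]
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in params["w_out"]]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    return [e / sum(exps) for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(rng, n_kernels=4, hidden=3, n_classes=7):
    """Random, untrained weights; all sizes are hypothetical."""
    return {
        "kernels": [random_matrix(rng, 3, 3) for _ in range(n_kernels)],
        "w_in_f": random_matrix(rng, hidden, n_kernels),
        "w_rec_f": random_matrix(rng, hidden, hidden),
        "w_in_b": random_matrix(rng, hidden, n_kernels),
        "w_rec_b": random_matrix(rng, hidden, hidden),
        "w_out": random_matrix(rng, n_classes, 2 * hidden),
    }


rng = random.Random(0)
params = make_params(rng)
video = [random_matrix(rng, 8, 8) for _ in range(5)]  # 5 synthetic 8x8 frames
probs = brnn_classify(video, params)  # one probability per assumed class
```

Because the weights are random, the resulting probabilities carry no meaning; the sketch only shows how spatial features computed per frame can be passed through a recurrence in both temporal directions before a single clip-level prediction is made.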

Original language	English
Title of host publication	Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings
Editors	Tieniu Tan, Xilin Chen, Xuelong Li, Jian Yang, Hong Cheng, Jie Zhou
Publisher	Springer Verlag
Pages	679-691
Number of pages	13
ISBN (Print)	9789811030048
DOIs	https://doi.org/10.1007/978-981-10-3005-5_56
Publication status	Published - 2016
Externally published	Yes

Publication series

Name	Communications in Computer and Information Science
Volume	663
ISSN (Print)	1865-0929

Keywords

Bi-directional recurrent neural networks (BRNN)
Convolutional neural networks (CNN)
Emotion recognition

Access to Document

10.1007/978-981-10-3005-5_56

Cite this

Cai, Y., Zheng, W., Zhang, T., Li, Q., Cui, Z., & Ye, J. (2016). Video based emotion recognition using CNN and BRNN. In T. Tan, X. Chen, X. Li, J. Yang, H. Cheng, & J. Zhou (Eds.), Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings (pp. 679-691). (Communications in Computer and Information Science; Vol. 663). Springer Verlag. https://doi.org/10.1007/978-981-10-3005-5_56

@inproceedings{812770ed0ba14dc39a4c4cb1047108d7,

title = "Video based emotion recognition using CNN and BRNN",

abstract = "Video-based emotion recognition is a challenging computer vision task. It requires modeling not only the spatial information in each image frame but also the temporal contextual correlations among sequential frames. For this purpose, we propose a hierarchical deep network architecture to extract high-level spatial-temporal features. In this architecture, two classic deep neural networks, convolutional neural networks (CNN) and bi-directional recurrent neural networks (BRNN), are employed to capture facial textural characteristics in the spatial domain and dynamic emotion changes in the temporal domain, respectively. We coordinate the two networks by optimizing each of them, so as to boost emotion recognition performance. In the challenging competition, our method achieves promising performance compared with the baselines.",

keywords = "Bi-directional recurrent neural networks (BRNN), Convolutional neural networks (CNN), Emotion recognition",

author = "Youyi Cai and Wenming Zheng and Tong Zhang and Qiang Li and Zhen Cui and Jiayin Ye",

note = "Publisher Copyright: {\textcopyright} Springer Nature Singapore Pte Ltd. 2016.",

year = "2016",

doi = "10.1007/978-981-10-3005-5_56",

language = "English",

isbn = "9789811030048",

series = "Communications in Computer and Information Science",

publisher = "Springer Verlag",

pages = "679--691",

editor = "Tieniu Tan and Xilin Chen and Xuelong Li and Jian Yang and Hong Cheng and Jie Zhou",

booktitle = "Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings",

address = "Germany",

}

Cai, Y, Zheng, W, Zhang, T, Li, Q, Cui, Z & Ye, J 2016, Video based emotion recognition using CNN and BRNN. in T Tan, X Chen, X Li, J Yang, H Cheng & J Zhou (eds), Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings. Communications in Computer and Information Science, vol. 663, Springer Verlag, pp. 679-691. https://doi.org/10.1007/978-981-10-3005-5_56

Video based emotion recognition using CNN and BRNN. / Cai, Youyi; Zheng, Wenming; Zhang, Tong et al.
Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings. ed. / Tieniu Tan; Xilin Chen; Xuelong Li; Jian Yang; Hong Cheng; Jie Zhou. Springer Verlag, 2016. p. 679-691 (Communications in Computer and Information Science; Vol. 663).

TY - GEN

T1 - Video based emotion recognition using CNN and BRNN

AU - Cai, Youyi

AU - Zheng, Wenming

AU - Zhang, Tong

AU - Li, Qiang

AU - Cui, Zhen

AU - Ye, Jiayin

N1 - Publisher Copyright: © Springer Nature Singapore Pte Ltd. 2016.

PY - 2016

Y1 - 2016

N2 - Video-based emotion recognition is a challenging computer vision task. It requires modeling not only the spatial information in each image frame but also the temporal contextual correlations among sequential frames. For this purpose, we propose a hierarchical deep network architecture to extract high-level spatial-temporal features. In this architecture, two classic deep neural networks, convolutional neural networks (CNN) and bi-directional recurrent neural networks (BRNN), are employed to capture facial textural characteristics in the spatial domain and dynamic emotion changes in the temporal domain, respectively. We coordinate the two networks by optimizing each of them, so as to boost emotion recognition performance. In the challenging competition, our method achieves promising performance compared with the baselines.

AB - Video-based emotion recognition is a challenging computer vision task. It requires modeling not only the spatial information in each image frame but also the temporal contextual correlations among sequential frames. For this purpose, we propose a hierarchical deep network architecture to extract high-level spatial-temporal features. In this architecture, two classic deep neural networks, convolutional neural networks (CNN) and bi-directional recurrent neural networks (BRNN), are employed to capture facial textural characteristics in the spatial domain and dynamic emotion changes in the temporal domain, respectively. We coordinate the two networks by optimizing each of them, so as to boost emotion recognition performance. In the challenging competition, our method achieves promising performance compared with the baselines.

KW - Bi-directional recurrent neural networks (BRNN)

KW - Convolutional neural networks (CNN)

KW - Emotion recognition

UR - http://www.scopus.com/inward/record.url?scp=84994813361&partnerID=8YFLogxK

U2 - 10.1007/978-981-10-3005-5_56

DO - 10.1007/978-981-10-3005-5_56

M3 - Conference contribution

AN - SCOPUS:84994813361

SN - 9789811030048

T3 - Communications in Computer and Information Science

SP - 679

EP - 691

BT - Pattern Recognition - 7th Chinese Conference, CCPR 2016, Proceedings

A2 - Tan, Tieniu

A2 - Chen, Xilin

A2 - Li, Xuelong

A2 - Yang, Jian

A2 - Cheng, Hong

A2 - Zhou, Jie

PB - Springer Verlag

ER -
