TY - JOUR
T1 - Facial Expression Recognition Using Hybrid Features of Pixel and Geometry
AU - Liu, Chang
AU - Hirota, Kaoru
AU - Ma, Junjie
AU - Jia, Zhiyang
AU - Dai, Yaping
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Facial Expression Recognition (FER) has long been a challenging task in the field of computer vision. Most existing FER methods extract facial features on the basis of face pixels, ignoring the relative geometric position dependencies of facial landmark points. This article presents a hybrid feature extraction network to enhance the discriminative power of emotional features. The proposed network consists of a Spatial Attention Convolutional Neural Network (SACNN) and a series of Long Short-Term Memory networks with Attention mechanism (ALSTMs). The SACNN is employed to extract expressional features from static face images, and the ALSTMs are designed to explore the potential of facial landmarks for expression recognition. A deep geometric feature descriptor is proposed to characterize the relative geometric position correlation of facial landmarks. The landmarks are divided into seven groups to extract deep geometric features, and the attention module in the ALSTMs adaptively estimates the importance of different landmark regions. By jointly combining the SACNN and ALSTMs, hybrid features are obtained for expression recognition. Experiments conducted on three public databases, FER2013, CK+, and JAFFE, demonstrate that the proposed method outperforms previous methods, with accuracies of 74.31%, 95.15%, and 98.57%, respectively. Preliminary results from the Emotion Understanding Robot System (EURS) indicate that the proposed method has the potential to improve the performance of human-robot interaction.
AB - Facial Expression Recognition (FER) has long been a challenging task in the field of computer vision. Most existing FER methods extract facial features on the basis of face pixels, ignoring the relative geometric position dependencies of facial landmark points. This article presents a hybrid feature extraction network to enhance the discriminative power of emotional features. The proposed network consists of a Spatial Attention Convolutional Neural Network (SACNN) and a series of Long Short-Term Memory networks with Attention mechanism (ALSTMs). The SACNN is employed to extract expressional features from static face images, and the ALSTMs are designed to explore the potential of facial landmarks for expression recognition. A deep geometric feature descriptor is proposed to characterize the relative geometric position correlation of facial landmarks. The landmarks are divided into seven groups to extract deep geometric features, and the attention module in the ALSTMs adaptively estimates the importance of different landmark regions. By jointly combining the SACNN and ALSTMs, hybrid features are obtained for expression recognition. Experiments conducted on three public databases, FER2013, CK+, and JAFFE, demonstrate that the proposed method outperforms previous methods, with accuracies of 74.31%, 95.15%, and 98.57%, respectively. Preliminary results from the Emotion Understanding Robot System (EURS) indicate that the proposed method has the potential to improve the performance of human-robot interaction.
KW - Facial expression recognition
KW - attention mechanism
KW - hybrid feature
KW - long short-term memory network
KW - relative geometric position dependency
UR - http://www.scopus.com/inward/record.url?scp=85100517904&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3054332
DO - 10.1109/ACCESS.2021.3054332
M3 - Article
AN - SCOPUS:85100517904
SN - 2169-3536
VL - 9
SP - 18876
EP - 18889
JO - IEEE Access
JF - IEEE Access
M1 - 9335586
ER -