TY - JOUR
T1 - Facial component-landmark detection with weakly-supervised LR-CNN
AU - Zhang, Ruiheng
AU - Mu, Chengpo
AU - Xu, Min
AU - Xu, Lixin
AU - Xu, Xiaofeng
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2019
Y1 - 2019
N2 - In this paper, we propose a weakly supervised landmark-region-based convolutional neural network (LR-CNN) framework to detect facial component and landmark simultaneously. Most of the existing course-to-fine facial detectors fail to detect landmark accurately without lots of fully labeled data, which are costly to obtain. We can handle the task with a small amount of finely labeled data. First, deep convolutional generative adversarial networks are utilized to generate training samples with weak labels, as data preparation. Then, through weakly supervised learning, our LR-CNN model can be trained effectively with a small amount of finely labeled data and a large amount of generated weakly labeled data. Notably, our approach can handle the situation when large occlusion areas occur, as we localize visible facial components before predicting corresponding landmarks. Detecting unblocked components first helps us to focus on the informative area, resulting in a better performance. Additionally, to improve the performance of the above tasks, we design two models as follows: 1) we add AnchorAlign in the region proposal networks to accurately localize components and 2) we propose a two-branch model consisting classification branch and regression branch to detect landmark. Extensive evaluations on benchmark datasets indicate that our proposed approach is able to complete the multi-task facial detection and outperforms the state-of-the-art facial component and landmark detection algorithms.
AB - In this paper, we propose a weakly supervised landmark-region-based convolutional neural network (LR-CNN) framework to detect facial component and landmark simultaneously. Most of the existing course-to-fine facial detectors fail to detect landmark accurately without lots of fully labeled data, which are costly to obtain. We can handle the task with a small amount of finely labeled data. First, deep convolutional generative adversarial networks are utilized to generate training samples with weak labels, as data preparation. Then, through weakly supervised learning, our LR-CNN model can be trained effectively with a small amount of finely labeled data and a large amount of generated weakly labeled data. Notably, our approach can handle the situation when large occlusion areas occur, as we localize visible facial components before predicting corresponding landmarks. Detecting unblocked components first helps us to focus on the informative area, resulting in a better performance. Additionally, to improve the performance of the above tasks, we design two models as follows: 1) we add AnchorAlign in the region proposal networks to accurately localize components and 2) we propose a two-branch model consisting classification branch and regression branch to detect landmark. Extensive evaluations on benchmark datasets indicate that our proposed approach is able to complete the multi-task facial detection and outperforms the state-of-the-art facial component and landmark detection algorithms.
KW - Weakly-supervised
KW - facial landmark
KW - generative adversarial network
KW - region-based convolutional neural network
UR - http://www.scopus.com/inward/record.url?scp=85061086645&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2018.2890573
DO - 10.1109/ACCESS.2018.2890573
M3 - Article
AN - SCOPUS:85061086645
SN - 2169-3536
VL - 7
SP - 10263
EP - 10277
JO - IEEE Access
JF - IEEE Access
M1 - 8598858
ER -