Landmark and Pose Prediction in Occluded Facial Point Cloud via Explicit Joint Feature Fusion Network

Yifei Yang, Jingfan Fan*, Long Shao, Mingyang Lei, Tianyu Fu, Danni Ai, Deqiang Xiao, Hong Song, Yucong Lin, Jian Yang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Facial point clouds collected in practical applications often suffer from pose variations and occlusion. Existing studies typically focus on either pose estimation or landmark localization alone and do not fully exploit the complementary information carried by different facial features, which limits prediction accuracy. We therefore propose a 3D facial multi-task prediction network. Based on the physical dependencies between tasks, the network embeds the outputs of related tasks into feature extraction, from the point level to the global level. This enables explicit multi-task knowledge transfer and the simultaneous prediction of facial landmarks, occlusion, and head pose. We also introduce a training strategy based on posterior knowledge correction that iteratively refines the multi-task predictions. Because no single dataset provides annotations for all of these tasks, we synthesized a 3D landmarks, occlusion and pose (3D-LOP) dataset, which includes annotations for landmark coordinates, occlusion probability, and head pose. The proposed method was compared with state-of-the-art methods on two public datasets and on 3D-LOP. Landmark localization accuracy improved by 7.1% on the two public datasets, and pose estimation accuracy and stability on 3D-LOP improved by 28.5% and 32.7%, respectively. Performance on in-the-wild data also indicates the method's potential in practical applications.

Original language: English
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • 3D landmark localization
  • Explicit feature fusion
  • Head pose estimation
  • Occlusion probability prediction
