Feature decomposition-based gaze estimation with auxiliary head pose regression

Ke Ni, Jing Chen*, Jian Wang, Bo Liu, Ting Lei, Yongtian Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Recognition and understanding of facial and eye images are critical for eye tracking. Recent studies have shown that using facial and eye images together can effectively reduce gaze estimation error. However, these methods typically treat facial and eye images as two unrelated inputs, without accounting for their distinct representational abilities at the feature level. Additionally, head pose learned implicitly from highly coupled facial features makes the trained model less interpretable and prone to gaze-head overfitting. To address these issues, we propose a method that enhances task-relevant features while suppressing noise through feature decomposition. We disentangle eye-related features from the facial image via a projection module and further make them distinctive with an attention-based head pose regression task, which enhances the representation of gaze-related features and makes the model less susceptible to task-irrelevant features. The mutually separated eye features and head pose are then recombined to achieve more accurate gaze estimation. Experimental results demonstrate that our method achieves state-of-the-art performance, with estimation errors of 3.90° on the MPIIGaze dataset and 5.15° on the EyeDiap dataset.
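The decomposition-then-recombination pipeline described in the abstract can be sketched in miniature. The sketch below is a simplified illustration with assumed dimensions and randomly initialized matrices, not the paper's actual architecture: a learned projection splits the facial feature into an eye-related part and a pose-related part, an auxiliary head (here a plain linear map standing in for the attention-based head pose regressor) predicts head pose from the pose part, and the eye features and predicted head pose are recombined for the final gaze estimate. All names and sizes (`W_eye`, `feat_dim`, etc.) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): a 128-d facial feature
# decomposed into a 64-d eye-related part and a 64-d pose-related part.
feat_dim, eye_dim, pose_dim = 128, 64, 64

# Learned projection matrices; randomly initialized here for illustration.
W_eye = rng.standard_normal((eye_dim, feat_dim)) * 0.1
W_pose = rng.standard_normal((pose_dim, feat_dim)) * 0.1
# Auxiliary head-pose regressor and final gaze regressor (pitch, yaw each).
W_head = rng.standard_normal((2, pose_dim)) * 0.1
W_gaze = rng.standard_normal((2, eye_dim + 2)) * 0.1


def decompose(facial_feat):
    """Project the facial feature into eye-related and pose-related parts."""
    return W_eye @ facial_feat, W_pose @ facial_feat


def estimate_gaze(facial_feat):
    """Regress head pose from the pose part, then recombine with eye features."""
    eye_feat, pose_feat = decompose(facial_feat)
    head_pose = W_head @ pose_feat                          # auxiliary output
    gaze = W_gaze @ np.concatenate([eye_feat, head_pose])   # final gaze (pitch, yaw)
    return gaze, head_pose


facial_feat = rng.standard_normal(feat_dim)
gaze, head_pose = estimate_gaze(facial_feat)
print(gaze.shape, head_pose.shape)  # (2,) (2,)
```

In training, the head-pose branch would receive its own supervision, which is what forces the two projected subspaces apart; this toy version only shows the data flow.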

Original language: English
Pages (from-to): 137-142
Number of pages: 6
Journal: Pattern Recognition Letters
Volume: 185
Publication status: Published - Sept 2024

Keywords

  • Deep learning
  • Eye appearance
  • Feature decomposition
  • Gaze estimation

