
Deep spatial and channel sliding attention patches for pose-invariant facial expression recognition

  • Xiaoyu Fan
  • Chaoji Liu*
  • Shuangxi Li
  • Jie Yao
  • Xinfeng Han
  • Xingqiao Liu
  • Chong Chen

*Corresponding author for this work

  • Anhui Science and Technology University
  • Jiangsu University

Research output: Contribution to journal › Article › peer-review

Abstract

Pose-invariant facial expression recognition (FER) is an important yet challenging research topic in computer vision, especially because pose change and self-occlusion cause recognition results to vary from one viewing angle to another. In this paper, we propose a sliding patch combined with spatial and channel attention network (SPA-SE) for pose-invariant FER. The proposed network comprises three main components: a slide patch (SP) model, a spatial-level patch attention (SPA) model, and a channel-level attention (squeeze-and-extraction) model. The SP model is designed to determine the optimal patch size and stride, reducing the impact of pose variation on recognition accuracy. The SPA model guides the network to focus on regional features and adaptively assigns a weight to each local patch to indicate its importance. The channel-level attention model is embedded into the bottleneck block to provide more salient feature maps for the SPA model. To evaluate the effectiveness of the SPA-SE network, we conducted experiments on five pose-invariant FER datasets. On three controlled FER datasets (BU3DFEP1, BU3DFEP2, and Multi-PIE), the network achieved accuracies of 78.01%, 81.65%, and 86.77%, respectively. On two real-world FER datasets, it achieved 86.76% (>30°) and 85.92% (>45°) on Pose-RAFDB, and 59.84% (>30°) and 60.36% (>45°) on Pose-Affect. The results demonstrate that our method can effectively improve recognition accuracy in practical applications.
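The data flow described in the abstract (channel gating, then sliding patches, then per-patch weighting) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify layer shapes, the scoring function, or how patch features are fused, so the function names, the random weights `w1`, `w2`, `v`, the patch size of 6, the stride of 4, and the pooling choices below are all assumptions made for illustration only.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Channel gating in the squeeze-then-gate style (w1, w2 are hypothetical weights)."""
    squeezed = feature_map.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)         # ReLU bottleneck -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate per channel -> (C,)
    return feature_map * gates[:, None, None]       # rescale each channel

def slide_patches(feature_map, patch_size, stride):
    """Cut a (C, H, W) feature map into overlapping square patches."""
    _, h, w = feature_map.shape
    patches = [
        feature_map[:, top:top + patch_size, left:left + patch_size]
        for top in range(0, h - patch_size + 1, stride)
        for left in range(0, w - patch_size + 1, stride)
    ]
    return np.stack(patches)                        # (N, C, P, P)

def patch_attention(patches, v):
    """Assign one softmax weight per patch and fuse the patch descriptors (assumed scoring)."""
    descriptors = patches.mean(axis=(2, 3))         # average pool each patch -> (N, C)
    scores = descriptors @ v                        # one scalar score per patch -> (N,)
    scores = scores - scores.max()                  # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    fused = (weights[:, None] * descriptors).sum(axis=0)  # weighted sum -> (C,)
    return weights, fused

# Toy input: 8 channels on a 14x14 grid; all weights are random placeholders.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 14, 14))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
v = rng.standard_normal(8)

gated = channel_attention(fmap, w1, w2)
patches = slide_patches(gated, patch_size=6, stride=4)
weights, fused = patch_attention(patches, v)
```

With a 14x14 map, a patch size of 6, and a stride of 4, the window starts at offsets 0, 4, and 8 along each axis, giving 9 overlapping patches. The softmax weights sum to 1, so patches with higher scores contribute more to the fused descriptor.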

Original language: English
Article number: 101327
Journal: Graphical Models
Volume: 145
Publication status: Published - Jun 2026
Externally published: Yes

Keywords

  • Channel-level attention model
  • Deep convolutional neural network
  • Pose-invariant FER
  • Sliding patch model
  • Spatial-level attention model
