Abstract
Pose-invariant facial expression recognition (FER) is an important yet challenging research topic in computer vision, because pose change and self-occlusion cause recognition results to vary from one viewing angle to another. In this paper, we propose a sliding patch combined with spatial and channel attention network (SPA-SE) for pose-invariant FER. The proposed network comprises three main components: a sliding patch (SP) model, a spatial-level patch attention (SPA) model, and a channel-level attention (squeeze-and-extraction) model. The SP model is designed to determine the optimal patch size and stride, reducing the impact of pose variation on recognition accuracy. The SPA model guides the network to focus on regional features and adaptively assigns a weight to each local patch to indicate its importance. The channel-level attention model is embedded into the bottleneck block to provide more salient feature maps for the SPA model. To evaluate the effectiveness of the SPA-SE network, we conducted experiments on five pose-invariant FER datasets. On three controlled FER datasets (BU3DFEP1, BU3DFEP2, and Multi-PIE), the network achieved accuracies of 78.01%, 81.65%, and 86.77%, respectively. On two real-world FER datasets, it achieved 86.76% (>30°) and 85.92% (>45°) on Pose-RAFDB, and 59.84% (>30°) and 60.36% (>45°) on Pose-Affect. The results demonstrate that our method can effectively improve recognition accuracy in practical applications.
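The three components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the random weights, the feature-map shape, and the patch size and stride are all assumptions chosen for illustration, and the channel gate follows the generic pool, bottleneck, sigmoid pattern rather than any layer layout confirmed by the paper.

```python
import numpy as np

def slide_patches(fmap, size, stride):
    """Cut a (C, H, W) feature map into overlapping (C, size, size) patches."""
    c, h, w = fmap.shape
    patches = [
        fmap[:, i:i + size, j:j + size]
        for i in range(0, h - size + 1, stride)
        for j in range(0, w - size + 1, stride)
    ]
    return np.stack(patches)  # (N, C, size, size)

def channel_gate(fmap, w1, w2):
    """Channel attention: global average pool, two dense layers, sigmoid gate."""
    squeezed = fmap.mean(axis=(1, 2))            # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # ReLU, (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # (C,), values in (0, 1)
    return fmap * gate[:, None, None]

def patch_attention(patches, v):
    """Score each patch linearly, softmax over patches, return weights and weighted sum."""
    feats = patches.mean(axis=(2, 3))            # (N, C) pooled patch descriptors
    scores = feats @ v                           # (N,)
    weights = np.exp(scores - scores.max())      # numerically stable softmax
    weights /= weights.sum()
    return weights, weights @ feats              # (N,), (C,)

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 14, 14, 4
fmap = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
v = rng.standard_normal(C)

gated = channel_gate(fmap, w1, w2)                # channel attention feeds the patch stage
patches = slide_patches(gated, size=6, stride=4)  # 3 x 3 = 9 overlapping patches
weights, pooled = patch_attention(patches, v)
```

In this sketch the patch size and stride are fixed by hand; the abstract states that the SP model is designed to determine the optimal values, which would correspond to searching over `size` and `stride` here.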
| Original language | English |
|---|---|
| Article number | 101327 |
| Journal | Graphical Models |
| Volume | 145 |
| DOIs | |
| Publication status | Published - Jun 2026 |
| Externally published | Yes |
Keywords
- Channel-level attention model
- Deep convolutional neural network
- Pose-invariant FER
- Sliding patch model
- Spatial-level attention model
Fingerprint
Dive into the research topics of 'Deep spatial and channel sliding attention patches for pose-invariant facial expression recognition'. Together they form a unique fingerprint.