Conditional Diffusion Model for Skeleton-Based Gesture Recognition With Severe Occlusions

Jinting Liu, Minggang Gan*, Yao Du, Keyi Guan, Jia Guo

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In skeleton-based gesture recognition, occlusion remains a major challenge: performance degrades significantly when key joints are occluded or disturbed. To tackle this issue, we propose DiffTrans, a practical conditional diffusion model for occlusion recognition, which enables skeleton-based gesture recognition under heavy occlusion by generating more likely samples. We address hand skeleton occlusion by framing it as a conditional denoising problem, in which unoccluded data serve as observations and occluded data as repair targets. A conditional diffusion model imputes the missing skeleton data, and the transformer-based DSTANet model learns the skeleton feature representations. Experimental results show that DiffTrans outperforms existing methods under various occlusion modes and maintains high performance even at high missing rates.
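The conditional denoising framing in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array layout (frames, joints, coordinates), the function names `split_observed_and_target` and `forward_noise`, and the DDPM-style noising formula are assumptions chosen for illustration, and the denoising network itself is omitted.

```python
import numpy as np

def split_observed_and_target(skeleton, occluded):
    """Split a skeleton sequence into observations and repair targets.

    skeleton: float array of shape (T, J, C), assumed layout (frames, joints, coords).
    occluded: bool array of shape (T, J), True where a joint is occluded.
    """
    mask = occluded[..., None]                 # broadcast over the coordinate axis
    observed = np.where(mask, 0.0, skeleton)   # unoccluded joints: conditioning input
    target = np.where(mask, skeleton, 0.0)     # occluded joints: values to impute
    return observed, target

def forward_noise(target, occluded, alpha_bar_t, rng):
    """Apply a DDPM-style forward noising step to the occluded joints only.

    alpha_bar_t is the cumulative noise-schedule product at step t (assumed scalar).
    Returns the noised targets and the noise a denoiser would be trained to predict.
    """
    mask = occluded[..., None]
    eps = rng.standard_normal(target.shape)
    noisy = np.sqrt(alpha_bar_t) * target + np.sqrt(1.0 - alpha_bar_t) * eps
    return np.where(mask, noisy, 0.0), eps

# Hypothetical example: 8 frames, 22 hand joints, 3-D coordinates.
rng = np.random.default_rng(0)
skeleton = rng.standard_normal((8, 22, 3))
occluded = rng.random((8, 22)) < 0.5           # simulate a 50% missing rate
observed, target = split_observed_and_target(skeleton, occluded)
noisy_target, eps = forward_noise(target, occluded, alpha_bar_t=0.5, rng=rng)
```

In a setup like this, a denoiser would take `noisy_target` together with `observed` and the mask as conditioning, and the imputed sequence would then be passed to the recognition model.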

Original language: English
Pages (from-to): 1970-1974
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 32
Publication status: Published - 2025
Externally published: Yes

Keywords

  • Conditional diffusion model
  • occlusion recognition
  • skeleton-based gesture recognition
  • transformer
