Application of Attention Mechanism-Based Dual-Modality SSD in RGB-D Hand Detection

Xiangjie Zhu, Baokui Li, Qing Fei, Qiang Wang, Haolin Jia

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Citations (Scopus)

Abstract

Multimodal gesture recognition is a crucial research area in human-computer interaction. This paper proposes a multimodal static gesture recognition method based on the Single Shot MultiBox Detector (SSD). First, RGB and depth images are fed into the VGG network to extract features. The learned features are then concatenated in the fusion stage, and attention mechanisms adaptively learn the feature weights. Results show that combining the two modalities improves model accuracy compared with using RGB or depth images alone. Next, the VGG backbone is replaced with MobileNet v1 to make the model faster. The proposed method is tested on the Hand Gesture Dataset, and the results indicate that it outperforms the single-modal SSD gesture recognition network.
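The fusion step described in the abstract (concatenate the two feature streams, then let an attention module weight them) can be illustrated with a minimal sketch. The abstract does not say which attention mechanism the paper uses, so this example assumes a squeeze-and-excitation style channel gate; the function names, the tiny feature-map sizes, and the random weights standing in for trained parameters are all hypothetical and are not taken from the paper.

```python
import math
import random


def global_avg_pool(feature_map):
    """Average each channel's H x W grid down to a single scalar."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(pooled, w1, w2):
    """Two-layer gate (assumed design): ReLU bottleneck, then one sigmoid weight per channel."""
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]


def fuse(rgb_feat, depth_feat, w1, w2):
    """Concatenate along the channel axis, then rescale each channel by its weight."""
    concat = rgb_feat + depth_feat  # list concatenation == channel-wise concat
    weights = channel_attention(global_avg_pool(concat), w1, w2)
    fused = [
        [[value * a for value in row] for row in channel]
        for channel, a in zip(concat, weights)
    ]
    return fused, weights


def random_map(channels, height, width):
    """Stand-in for backbone output; a real model would produce these features."""
    return [
        [[random.random() for _ in range(width)] for _ in range(height)]
        for _ in range(channels)
    ]


random.seed(0)
rgb = random_map(2, 3, 3)    # hypothetical RGB-branch features (C=2, H=3, W=3)
depth = random_map(2, 3, 3)  # hypothetical depth-branch features
n_channels, n_hidden = 4, 2
# Random values stand in for weights that would be learned during training.
w1 = [[random.uniform(-1, 1) for _ in range(n_channels)] for _ in range(n_hidden)]
w2 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_channels)]

fused, weights = fuse(rgb, depth, w1, w2)
print(len(fused), [round(a, 3) for a in weights])
```

Each of the four fused channels is the corresponding input channel multiplied by a weight between 0 and 1, so the gate can emphasize whichever modality carries more useful signal for a given input. In a trained detector the fused map would then feed the SSD detection heads.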

Original language: English
Title of host publication: 2023 42nd Chinese Control Conference, CCC 2023
Publisher: IEEE Computer Society
Pages: 7811-7816
Number of pages: 6
ISBN (Electronic): 9789887581543
Publication status: Published - 2023
Event: 42nd Chinese Control Conference, CCC 2023 - Tianjin, China
Duration: 24 Jul 2023 to 26 Jul 2023

Publication series

Name: Chinese Control Conference, CCC
Volume: 2023-July
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 42nd Chinese Control Conference, CCC 2023
Country/Territory: China
City: Tianjin
Period: 24/07/23 to 26/07/23

Keywords

  • Attention Mechanisms
  • MobileNet
  • Multimodal Gesture Recognition
  • SSD
  • Static Gestures
