WF DiMP: Weight-aware dual-modal feature aggregation mechanism for RGB-T tracking

Zhaodi Wang*, Yan Ding*, Pingping Wu, Jinbo Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Visual object tracking has attracted considerable interest due to its applications in numerous fields such as industry and security. Because illumination changes can cause RGB tracking to fail, a growing number of researchers have focused on RGB-T tracking methods based on the fusion of the visible and thermal infrared spectra, which has accelerated their development in recent years. To utilize dual-modal complementary information adaptively, we design a weight-aware dual-modal feature aggregation mechanism and, based on it, propose the WF DiMP algorithm for RGB-T tracking. In WF DiMP, deep features of the visible and thermal infrared images are extracted by ResNet50 and used to produce heterogeneous response maps, from which dual-modal weights are learned adaptively. The weighted deep features are then concatenated and fed to the classifier and to the bounding box estimation module of the DiMP (Discriminative Model Prediction) network to obtain the final confidence map and an object bounding box, respectively. Experiments are carried out on the VOT-RGBT2019 dataset. The results show that the WF DiMP algorithm achieves higher tracking accuracy and robustness. The evaluation metrics PR and SR reach 82.1% and 56.3%, respectively, which demonstrates the effectiveness of the proposed mechanism.
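The weighting-and-concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in WF DiMP the dual-modal weights are learned from the response maps, and the abstract does not specify how. As a stand-in, the sketch below assumes a softmax over the peak value of each response map. The function names `modality_weights` and `fuse_features` are hypothetical, and features are simplified to flat lists rather than ResNet50 feature maps.

```python
import math


def modality_weights(resp_rgb, resp_tir):
    """Return (w_rgb, w_tir) from two 2-D response maps.

    ASSUMPTION: a softmax over the peak response of each modality stands in
    for the learned weighting of the paper, which the abstract does not detail.
    """
    peak_rgb = max(max(row) for row in resp_rgb)
    peak_tir = max(max(row) for row in resp_tir)
    top = max(peak_rgb, peak_tir)  # subtract the max for numerical stability
    e_rgb = math.exp(peak_rgb - top)
    e_tir = math.exp(peak_tir - top)
    total = e_rgb + e_tir
    return e_rgb / total, e_tir / total


def fuse_features(feat_rgb, feat_tir, resp_rgb, resp_tir):
    """Scale each modality's features by its weight, then concatenate them."""
    w_rgb, w_tir = modality_weights(resp_rgb, resp_tir)
    return [w_rgb * v for v in feat_rgb] + [w_tir * v for v in feat_tir]


if __name__ == "__main__":
    # Toy inputs: the visible response map has the sharper peak here,
    # so the visible features receive the larger weight.
    resp_rgb = [[0.1, 0.9], [0.2, 0.3]]
    resp_tir = [[0.1, 0.2], [0.3, 0.4]]
    print(modality_weights(resp_rgb, resp_tir))
    print(fuse_features([1.0, 2.0], [3.0, 4.0], resp_rgb, resp_tir))
```

Under this assumed heuristic, a modality whose response map peaks more strongly contributes more to the concatenated feature, which mirrors the intent of adaptive weighting when one spectrum degrades, for example visible imagery under poor illumination.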

Original language: English
Title of host publication: Seventh Symposium on Novel Photoelectronic Detection Technology and Applications
Editors: Junhong Su, Junhao Chu, Qifeng Yu, Huilin Jiang
Publisher: SPIE
ISBN (Electronic): 9781510643611
Publication status: Published - 2021
Event: 7th Symposium on Novel Photoelectronic Detection Technology and Applications - Kunming, China
Duration: 5 Nov 2020 to 7 Nov 2020

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 11763
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 7th Symposium on Novel Photoelectronic Detection Technology and Applications
Country/Territory: China
City: Kunming
Period: 5/11/20 to 7/11/20

Keywords

  • Image fusion
  • Multiple modalities
  • RGB-T tracking
  • Visual object tracking
