Full Attention Tracker: A Good Combination of Pixel-Level and Region-Level Cross-Correlation

Yuxuan Wang, Liping Yan*, Zihang Feng, Yuanqing Xia, Bo Xiao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Trackers based on Siamese neural networks are currently a high-accuracy approach in visual tracking. Since the Transformer was introduced into visual tracking, attention mechanisms have become increasingly common in tracking tasks. However, owing to the characteristics of the attention operation, Transformers usually converge slowly, and their pixel-level correlation discrimination in tracking is prone to overfitting, which is not conducive to long-term tracking. A new framework, FAT, is designed as an improvement of MixFormer. The MixFormer operation that performs feature extraction and target information integration simultaneously is retained, and a Mixing block is introduced to suppress the background as much as possible before information interaction. In addition, a new operation is designed: the result of region-level cross-correlation is used as guidance for learning pixel-level cross-correlation in attention, thereby accelerating model convergence and improving generalization. Finally, a joint loss function is designed to further improve the accuracy of the model. Experiments show that the presented tracker achieves excellent performance on five benchmark datasets.
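
To make the two notions of correlation named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the FAT implementation: the function names, the toy feature maps, and the softmax scaling are all assumptions made for illustration. Pixel-level correlation is shown as attention-style similarity between every search pixel and every template pixel, and region-level correlation as the Siamese-style operation of sliding the whole template over the search features. The abstract does not specify how the region-level result guides the attention, so that step is left out.

```python
import math

def dot(u, v):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def pixel_level_correlation(search, template):
    """Attention-style similarity (illustrative, not the paper's code).

    Every search pixel is compared with every template pixel; each row is
    a softmax over template pixels, so the result has Hs*Ws rows and
    Ht*Wt columns.
    """
    s_pixels = [p for row in search for p in row]
    t_pixels = [p for row in template for p in row]
    scale = 1.0 / math.sqrt(len(t_pixels[0]))
    weights = []
    for s in s_pixels:
        logits = [dot(s, t) * scale for t in t_pixels]
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def region_level_correlation(search, template):
    """Siamese-style cross-correlation (illustrative).

    The whole template is slid over the search map at valid positions only;
    each response value sums the inner products over the template window.
    """
    hs, ws = len(search), len(search[0])
    ht, wt = len(template), len(template[0])
    response = []
    for i in range(hs - ht + 1):
        row = []
        for j in range(ws - wt + 1):
            row.append(sum(dot(search[i + di][j + dj], template[di][dj])
                           for di in range(ht) for dj in range(wt)))
        response.append(row)
    return response

# Hypothetical toy features: a 2x2 template with 2 channels, embedded in the
# bottom-right corner of an otherwise empty 3x3 search map.
template = [[[1, 0], [0, 1]],
            [[0, 1], [1, 0]]]
search = [[[0, 0], [0, 0], [0, 0]],
          [[0, 0], [1, 0], [0, 1]],
          [[0, 0], [0, 1], [1, 0]]]

attention = pixel_level_correlation(search, template)
response = region_level_correlation(search, template)
```

In this toy case the region-level response peaks where the template is embedded, while the pixel-level weights relate individual pixels without any notion of the template as a whole. The abstract's idea is to use the former as guidance for the latter.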

Original language: English
Title of host publication: 2023 42nd Chinese Control Conference, CCC 2023
Publisher: IEEE Computer Society
Pages: 7440-7446
Number of pages: 7
ISBN (Electronic): 9789887581543
Publication status: Published - 2023
Event: 42nd Chinese Control Conference, CCC 2023 - Tianjin, China
Duration: 24 Jul 2023 to 26 Jul 2023

Publication series

Name: Chinese Control Conference, CCC
Volume: 2023-July
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 42nd Chinese Control Conference, CCC 2023
Country/Territory: China
City: Tianjin
Period: 24/07/23 to 26/07/23

Keywords

  • Attention
  • Correlation
  • Transformer
  • Visual Tracking
