Temporal-Aware Visual Object Tracking with Pyramidal Transformer and Adaptive Decoupling

Yiding Liang, Bo Ma*, Hao Xu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We propose PVTrack, a one-stream visual object tracking algorithm. First, we design a one-stream pyramidal backbone based on the attention mechanism. It processes the template and the search region in parallel, which improves the tracker's computational efficiency, and its attention layers capture global context, which improves tracking performance. Second, we propose an adaptive decoupled prediction head that treats the backbone's output features differently by level: low-level features, which are rich in target shape information, are fused to improve regression accuracy; high-level semantic features, which are well suited to classification, are used alone in a classification branch decoupled from regression to improve target localization accuracy. Finally, we introduce a discriminative template updating method together with a template update threshold function, which improves the algorithm's ability to model temporal information. Tests and ablation experiments on multiple datasets verify that the proposed one-stream tracker with discriminative template updating improves both the computational efficiency and the robustness of tracking.
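As a rough illustration of the one-stream idea described in the abstract, the sketch below runs a single self-attention pass over the concatenated template and search tokens, so every search token can attend to the template and vice versa. It is not the PVTrack backbone: the function name `joint_self_attention`, the identity query/key/value projections, the single head, and the toy 2-dimensional tokens are all assumptions made for the example, and the pyramidal structure is omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_self_attention(template_tokens, search_tokens):
    """Single-head self-attention over the concatenated token sequence.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a trained backbone would learn these.
    """
    tokens = template_tokens + search_tokens  # one stream: a single joint sequence
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        # Scaled dot-product score of this query against every token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted sum of the value vectors (here: the tokens themselves).
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    n_t = len(template_tokens)
    return out[:n_t], out[n_t:]  # split back into template / search parts


# Hypothetical toy tokens: 2 template tokens and 3 search tokens.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
new_template, new_search = joint_self_attention(template, search)
```

Because both token sets pass through the same attention operation, feature extraction and template-search interaction happen in one pass rather than in two separate branches followed by a fusion step.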

Original language: English
Title of host publication: 2024 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 270-277
Number of pages: 8
ISBN (Electronic): 9798350366174
Publication status: Published - 2024
Event: 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024 - Shenzhen, China
Duration: 14 Jun 2024 → 16 Jun 2024

Publication series

Name: 2024 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024

Conference

Conference: 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024
Country/Territory: China
City: Shenzhen
Period: 14/06/24 → 16/06/24

Keywords

  • decoupling
  • pyramid
  • transformer
  • visual object tracking
