Negative-Driven Training Pipeline for Siamese Visual Tracking

Xin Yang, Chenyang Zhao, Jinqi Yang, Yong Song*, Yufei Zhao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Although Siamese trackers are gradually being replaced by complicated and computationally expensive Transformer trackers, they remain simpler and better suited to real-world deployment. We believe that the Siamese tracking framework still has room for improvement, and we attribute its performance limitation to inadequate annotation, insufficient augmentation and suboptimal assignment, which together heavily weaken its discriminative power. Because Siamese methods directly inherit their assignment scheme from object detection, they face the same two imbalances: between sparse annotated objects and dense background examples, and between easy and hard negative examples. Moreover, existing augmentations and negative pairs are insufficient to simulate practical tracking ambiguity and failure cases. Nevertheless, the training side of tracking remains overlooked. We therefore propose a negative-driven training pipeline that unleashes the potential of the Siamese framework without any extra inference cost. Specifically, 1) we devise strong negative augmentations based on random copy-paste to take full advantage of available annotations and generate more challenging tracking scenarios, especially negative examples; 2) we propose a semisupervised two-phase assignment that jointly utilizes existing annotations and model outputs to mine more appropriate and challenging negative examples; 3) we formulate a complementary reweighting loss that modifies the loss weight matrix to bridge subtasks and highlight the contributions of hard negative examples more smoothly. We validate the pipeline on several classic Siamese trackers. After training, these trackers gain up to a nearly 14% relative increase in performance, making them comparable to advanced Siamese trackers and even Transformer trackers. The experimental results indicate that a tracking-specific training pipeline is an efficient way to strengthen trackers and merits further development.
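
The abstract names random copy-paste as the basis of the negative augmentations without giving details. The following is a minimal illustrative sketch of the general idea only, not the authors' implementation. It assumes images are plain lists of pixel rows, boxes are (x1, y1, x2, y2) with exclusive right and bottom edges, and a pasted distractor must not overlap the annotated target; the names `paste_distractor` and `boxes_overlap` and the non-overlap rule are hypothetical choices made for this example.

```python
import random


def boxes_overlap(a, b):
    """Return True if two (x1, y1, x2, y2) boxes intersect (x2, y2 exclusive)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def paste_distractor(image, target_box, patch, rng, max_tries=50):
    """Paste `patch` at a random location that avoids `target_box`.

    Returns (augmented_image, distractor_box). The input image is not
    modified. If no valid location is found within `max_tries`, the
    returned box is None and the image copy is unchanged.
    """
    h, w = len(image), len(image[0])
    ph, pw = len(patch), len(patch[0])
    out = [row[:] for row in image]  # copy so the original stays intact
    if ph > h or pw > w:
        return out, None
    for _ in range(max_tries):
        x1 = rng.randint(0, w - pw)  # randint is inclusive on both ends
        y1 = rng.randint(0, h - ph)
        box = (x1, y1, x1 + pw, y1 + ph)
        if boxes_overlap(box, target_box):
            continue  # assumption: keep the annotated target fully visible
        for dy in range(ph):
            out[y1 + dy][x1:x1 + pw] = patch[dy]
        return out, box
    return out, None


# Toy usage: a blank 32x32 search image, a central target, a 6x6 distractor.
rng = random.Random(0)
image = [[0] * 32 for _ in range(32)]
target_box = (12, 12, 20, 20)
patch = [[255] * 6 for _ in range(6)]
augmented, distractor_box = paste_distractor(image, target_box, patch, rng)
```

In a sketch like this, the returned distractor box could be used to label the pasted region as a negative example during training; how the paper actually selects, places and labels pasted objects is not specified in the abstract.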

Original language: English
Pages (from-to): 4416-4429
Number of pages: 14
Journal: IEEE Transactions on Multimedia
Volume: 26
DOI: https://doi.org/10.1109/TMM.2023.3323134
Publication status: Published - 2024

Keywords

  • Negative example mining
  • Siamese network
  • training pipeline
  • visual tracking

Cite this

Yang, X., Zhao, C., Yang, J., Song, Y., & Zhao, Y. (2024). Negative-Driven Training Pipeline for Siamese Visual Tracking. IEEE Transactions on Multimedia, 26, 4416-4429. https://doi.org/10.1109/TMM.2023.3323134