TY - JOUR
T1 - Negative-Driven Training Pipeline for Siamese Visual Tracking
AU - Yang, Xin
AU - Zhao, Chenyang
AU - Yang, Jinqi
AU - Song, Yong
AU - Zhao, Yufei
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Although Siamese trackers are gradually being replaced by more complex and computationally expensive Transformer trackers, they remain simpler and better suited to real-world deployment. We believe the Siamese tracking framework still has room for improvement, and we attribute its performance limitations to inadequate annotation, insufficient augmentation, and suboptimal assignment, all of which severely weaken discriminative power. Because Siamese methods inherit their assignment scheme directly from object detection, they face the same two imbalances: between sparse annotated objects and dense background examples, and between easy and hard negative examples. Moreover, existing augmentations and negative pairs do not adequately simulate the ambiguity and failure cases that arise in practical tracking. Nevertheless, the training side of the problem remains overlooked. We therefore develop a negative-driven training pipeline that unleashes the potential of the Siamese framework without any extra inference cost. Specifically, 1) we devise strong negative augmentations based on random copy-paste to take full advantage of available annotations and generate more challenging tracking scenarios, especially negative examples; 2) we propose a semisupervised two-phase assignment that jointly uses existing annotations and model outputs to mine more appropriate and challenging negative examples; and 3) we formulate a complementary reweighting loss that modifies the loss weight matrix to bridge subtasks and highlight the contributions of hard negative examples more smoothly. We validate the pipeline on several classic Siamese trackers. After training with it, these trackers gain up to nearly 14% in relative performance, making them comparable to advanced Siamese trackers and even Transformer trackers. The experimental results indicate that a tracking-specific training pipeline is an efficient way to strengthen trackers and merits further development.
KW - Negative example mining
KW - Siamese network
KW - training pipeline
KW - visual tracking
UR - http://www.scopus.com/inward/record.url?scp=85174857863&partnerID=8YFLogxK
U2 - 10.1109/TMM.2023.3323134
DO - 10.1109/TMM.2023.3323134
M3 - Article
AN - SCOPUS:85174857863
SN - 1520-9210
VL - 26
SP - 4416
EP - 4429
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -