PL-MCT: pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking

Yiqian Huang; Shuqi Liu; Fei Dong; Xu Li; Xin Yang; Ya Zhou; Jinxiang Huang; Yong Song

doi:10.1007/s00371-024-03651-5

Corresponding author: Yong Song

Research output: Contribution to journal › Article › peer-review

Abstract

The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method for visual tracking, termed pseudo-labeling and multi-frame consistency training (PL-MCT), which improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide a reliability score for each pseudo-label and design a prediction-training alternation (PTA) strategy to mitigate the bias of noisy pseudo-labels, which helps select high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduces an online update head (OUH), so that the tracker continues training on the temporal signal in videos and updates online. Extensive experiments demonstrate the effectiveness of the proposed method. PL-MCT achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which is advanced performance compared to other unsupervised methods and comparable to preceding supervised methods. The project will be available at https://github.com/HYQ-hyq222/PL-MCT.
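
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a tracking precision score is conventionally computed: the fraction of frames in which the predicted box center lies within a pixel threshold of the ground-truth center. The 20-pixel threshold, the (x, y, w, h) box layout, and the function name `precision_score` are assumptions made for illustration; the evaluation protocol actually used for PL-MCT is not described in this record.

```python
import math


def precision_score(pred_boxes, gt_boxes, threshold=20.0):
    """Fraction of frames whose center location error is <= threshold pixels.

    Boxes are (x, y, w, h) tuples with (x, y) the top-left corner.
    The 20-pixel default is the conventional choice, assumed here.
    """
    if len(pred_boxes) != len(gt_boxes) or not gt_boxes:
        raise ValueError("need two non-empty sequences of equal length")

    hits = 0
    for (px, py, pw, ph), (gx, gy, gw, gh) in zip(pred_boxes, gt_boxes):
        # Euclidean distance between the two box centers.
        dist = math.hypot((px + pw / 2) - (gx + gw / 2),
                          (py + ph / 2) - (gy + gh / 2))
        if dist <= threshold:
            hits += 1
    return hits / len(gt_boxes)


# Toy example: the first frame is 5 px off (a hit), the second is far off (a miss).
pred = [(0, 0, 10, 10), (100, 100, 10, 10)]
gt = [(3, 4, 10, 10), (0, 0, 10, 10)]
print(precision_score(pred, gt))
```

Under this convention, a score of 0.856 would mean that about 85.6% of evaluated frames fall within the threshold.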

Original language	English
Journal	Visual Computer
DOIs	https://doi.org/10.1007/s00371-024-03651-5
Publication status	Accepted/In press - 2024

Keywords

Consistency training
Pseudo-label
Semi-supervised
Visual tracking

Cite this

Huang, Y., Liu, S., Dong, F., Li, X., Yang, X., Zhou, Y., Huang, J., & Song, Y. (Accepted/In press). PL-MCT: pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking. Visual Computer. https://doi.org/10.1007/s00371-024-03651-5

@article{cc2a3f8e68304aa299483f6033ce9fd7,

title = "PL-MCT: pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking",

abstract = "The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method for visual tracking, termed pseudo-labeling and multi-frame consistency training (PL-MCT), which improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide a reliability score for each pseudo-label and design a prediction-training alternation (PTA) strategy to mitigate the bias of noisy pseudo-labels, which helps select high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduces an online update head (OUH), so that the tracker continues training on the temporal signal in videos and updates online. Extensive experiments demonstrate the effectiveness of the proposed method. PL-MCT achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which is advanced performance compared to other unsupervised methods and comparable to preceding supervised methods. The project will be available at https://github.com/HYQ-hyq222/PL-MCT.",

keywords = "Consistency training, Pseudo-label, Semi-supervised, Visual tracking",

author = "Yiqian Huang and Shuqi Liu and Fei Dong and Xu Li and Xin Yang and Ya Zhou and Jinxiang Huang and Yong Song",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.",

year = "2024",

doi = "10.1007/s00371-024-03651-5",

language = "English",

journal = "Visual Computer",

issn = "0178-2789",

publisher = "Springer Verlag",

}

TY - JOUR

T1 - PL-MCT

T2 - pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking

AU - Huang, Yiqian

AU - Liu, Shuqi

AU - Dong, Fei

AU - Li, Xu

AU - Yang, Xin

AU - Zhou, Ya

AU - Huang, Jinxiang

AU - Song, Yong

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

PY - 2024

Y1 - 2024

N2 - The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method for visual tracking, termed pseudo-labeling and multi-frame consistency training (PL-MCT), which improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide a reliability score for each pseudo-label and design a prediction-training alternation (PTA) strategy to mitigate the bias of noisy pseudo-labels, which helps select high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduces an online update head (OUH), so that the tracker continues training on the temporal signal in videos and updates online. Extensive experiments demonstrate the effectiveness of the proposed method. PL-MCT achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which is advanced performance compared to other unsupervised methods and comparable to preceding supervised methods. The project will be available at https://github.com/HYQ-hyq222/PL-MCT.

AB - The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method for visual tracking, termed pseudo-labeling and multi-frame consistency training (PL-MCT), which improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide a reliability score for each pseudo-label and design a prediction-training alternation (PTA) strategy to mitigate the bias of noisy pseudo-labels, which helps select high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduces an online update head (OUH), so that the tracker continues training on the temporal signal in videos and updates online. Extensive experiments demonstrate the effectiveness of the proposed method. PL-MCT achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which is advanced performance compared to other unsupervised methods and comparable to preceding supervised methods. The project will be available at https://github.com/HYQ-hyq222/PL-MCT.

KW - Consistency training

KW - Pseudo-label

KW - Semi-supervised

KW - Visual tracking

UR - http://www.scopus.com/inward/record.url?scp=85205565051&partnerID=8YFLogxK

U2 - 10.1007/s00371-024-03651-5

DO - 10.1007/s00371-024-03651-5

M3 - Article

AN - SCOPUS:85205565051

SN - 0178-2789

JO - Visual Computer

JF - Visual Computer

ER -
