PL-MCT: pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking

Yiqian Huang; Shuqi Liu; Fei Dong; Xu Li; Xin Yang; Ya Zhou; Jinxiang Huang; Yong Song

doi:10.1007/s00371-024-03651-5

Corresponding author: Yong Song

Research output: Contribution to journal › Article › peer-review

Abstract

The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method, termed pseudo-labeling and multi-frame consistency training (PL-MCT), for visual tracking, which improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide the reliability score of the pseudo-label and design a prediction-training alternation (PTA) strategy to mitigate the bias of noisy pseudo-labels, which contributes to selecting high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduces an online update head (OUH), which continues training the tracker so that it learns the signal in the temporal dimension of videos and updates online. Extensive experiments demonstrate the effectiveness of the proposed method. PL-MCT achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, an advanced result compared with other unsupervised methods and comparable to preceding supervised methods. Project will be available at https://github.com/HYQ-hyq222/PL-MCT.

Original language	English
Journal	Visual Computer
DOI	https://doi.org/10.1007/s00371-024-03651-5
Publication status	Accepted/In press - 2024

Cite this

@article{cc2a3f8e68304aa299483f6033ce9fd7,

title = "PL-MCT: pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking",

abstract = "The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method, termed pseudo-labeling and multi-frame consistency training (PL-MCT), for visual tracking, which successfully improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide the reliability score of the pseudo-label and design a prediction-training alternation (PTA) strategy to effectively mitigate the bias of noisy pseudo-labels, which contributes to selecting high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduced an online update head (OUH) to continue training the tracker to learn the signal in the temporal dimension of videos and update online. Extensive experiments demonstrate the effectiveness of the proposed method. Our method (PL-MCT) achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which achieves advanced performance compared to other unsupervised methods and has comparable results to preceding supervised methods. Project will be available at https://github.com/HYQ-hyq222/PL-MCT.",

keywords = "Consistency training, Pseudo-label, Semi-supervised, Visual tracking",

author = "Yiqian Huang and Shuqi Liu and Fei Dong and Xu Li and Xin Yang and Ya Zhou and Jinxiang Huang and Yong Song",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.",

year = "2024",

doi = "10.1007/s00371-024-03651-5",

language = "English",

journal = "Visual Computer",

issn = "0178-2789",

publisher = "Springer Verlag",

}

TY - JOUR

T1 - PL-MCT

T2 - pseudo-labeling and multi-frame consistency training for semi-supervised visual tracking

AU - Huang, Yiqian

AU - Liu, Shuqi

AU - Dong, Fei

AU - Li, Xu

AU - Yang, Xin

AU - Zhou, Ya

AU - Huang, Jinxiang

AU - Song, Yong

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

PY - 2024

Y1 - 2024

AB - The exploitation of unlabeled videos for visual object tracking has recently drawn increasing attention. However, unreliable pseudo-labels cause an incomplete appearance of the object and an incorrect search region, which hinders bounding box regression learning. To address this issue, we propose a novel semi-supervised learning method, termed pseudo-labeling and multi-frame consistency training (PL-MCT), for visual tracking, which successfully improves the reliability of pseudo-labels and the robustness of the tracker. Specifically, we introduce a pseudo-label evaluation (PLE) module to provide the reliability score of the pseudo-label and design a prediction-training alternation (PTA) strategy to effectively mitigate the bias of noisy pseudo-labels, which contributes to selecting high-quality pseudo-labels as training pairs. Meanwhile, to cope with the appearance variations of objects in complex scenarios, we employ a multi-frame consistency training scheme that introduced an online update head (OUH) to continue training the tracker to learn the signal in the temporal dimension of videos and update online. Extensive experiments demonstrate the effectiveness of the proposed method. Our method (PL-MCT) achieves a precision score of 0.856 on OTB2015 and 0.408 on LaSOT, which achieves advanced performance compared to other unsupervised methods and has comparable results to preceding supervised methods. Project will be available at https://github.com/HYQ-hyq222/PL-MCT.

KW - Consistency training

KW - Pseudo-label

KW - Semi-supervised

KW - Visual tracking

UR - http://www.scopus.com/inward/record.url?scp=85205565051&partnerID=8YFLogxK

U2 - 10.1007/s00371-024-03651-5

DO - 10.1007/s00371-024-03651-5

M3 - Article

AN - SCOPUS:85205565051

SN - 0178-2789

JO - Visual Computer

JF - Visual Computer

ER -
