TY - JOUR
T1 - Robust Distracter-Resistive Tracker via Learning a Multi-Component Discriminative Dictionary
AU - Shen, Weichao
AU - Wu, Yuwei
AU - Yuan, Junsong
AU - Duan, Lingyu
AU - Zhang, Jian
AU - Jia, Yunde
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Discriminative dictionary learning (DDL) provides an appealing paradigm for appearance modeling in visual tracking. However, most existing DDL-based trackers cannot handle drastic appearance changes, especially in scenarios with background clutter and/or interference from similar objects. One reason is that they often suffer from the loss of subtle visual information, which is critical for distinguishing an object from distracters. In this paper, we explore the use of activations from the convolutional layers of a convolutional neural network to improve the object representation, and then propose a robust distracter-resistive tracker via learning a multi-component discriminative dictionary. The proposed method exploits both intra-class and inter-class visual information to learn shared atoms and class-specific atoms. By imposing several constraints on the objective function, the learned dictionary is reconstructive, compressive, and discriminative, and can thus better distinguish an object from the background. In addition, our convolutional features preserve structural information for object localization and balance the discriminative power and semantic information of the object. Tracking is carried out within a Bayesian inference framework where a joint decision measure is used to construct the observation model. To alleviate the drift problem, reliable tracking results obtained online are accumulated to update the dictionary. Both qualitative and quantitative results on the CVPR2013 benchmark, the VOT2015 data set, and the SPOT data set demonstrate that our tracker achieves substantially better overall performance than state-of-the-art approaches.
AB - Discriminative dictionary learning (DDL) provides an appealing paradigm for appearance modeling in visual tracking. However, most existing DDL-based trackers cannot handle drastic appearance changes, especially in scenarios with background clutter and/or interference from similar objects. One reason is that they often suffer from the loss of subtle visual information, which is critical for distinguishing an object from distracters. In this paper, we explore the use of activations from the convolutional layers of a convolutional neural network to improve the object representation, and then propose a robust distracter-resistive tracker via learning a multi-component discriminative dictionary. The proposed method exploits both intra-class and inter-class visual information to learn shared atoms and class-specific atoms. By imposing several constraints on the objective function, the learned dictionary is reconstructive, compressive, and discriminative, and can thus better distinguish an object from the background. In addition, our convolutional features preserve structural information for object localization and balance the discriminative power and semantic information of the object. Tracking is carried out within a Bayesian inference framework where a joint decision measure is used to construct the observation model. To alleviate the drift problem, reliable tracking results obtained online are accumulated to update the dictionary. Both qualitative and quantitative results on the CVPR2013 benchmark, the VOT2015 data set, and the SPOT data set demonstrate that our tracker achieves substantially better overall performance than state-of-the-art approaches.
KW - Visual tracking
KW - appearance changes
KW - multi-component discriminative dictionary
KW - multi-object tracking
UR - https://www.scopus.com/pages/publications/85050971066
U2 - 10.1109/TCSVT.2018.2862151
DO - 10.1109/TCSVT.2018.2862151
M3 - Article
AN - SCOPUS:85050971066
SN - 1051-8215
VL - 29
SP - 2012
EP - 2028
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 7
M1 - 8424191
ER -