TY - JOUR
T1 - Capturing Relevant Context for Visual Tracking
AU - Zhang, Yuping
AU - Ma, Bo
AU - Wu, Jiahao
AU - Huang, Lianghua
AU - Shen, Jianbing
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Studies have shown that contextual information can improve the robustness of trackers. However, trackers based on convolutional neural networks (CNNs) capture only local features, which limits their performance. We propose a novel relevant context block (RCB), which employs graph convolutional networks to capture relevant context. In particular, for each query position (unit), it selects the k largest contributors, which carry meaningful and discriminative contextual information, as nodes, and updates the nodes by aggregating the differences between the query position and its contributors. This operation can be easily incorporated into existing networks and trained end-to-end with a standard backpropagation algorithm. To verify the effectiveness of RCB, we apply it to two trackers, SiamFC and GlobalTrack; the improved trackers are referred to as Siam-RCB and GlobalTrack-RCB. Extensive experiments on OTB, VOT, UAV123, LaSOT, TrackingNet, OxUvA, and VOT2018LT show the superiority of our method. For example, Siam-RCB outperforms SiamFC by a large margin (up to 11.2% in the success score and 7.8% in the precision score) on the OTB-100 benchmark.
AB - Studies have shown that contextual information can improve the robustness of trackers. However, trackers based on convolutional neural networks (CNNs) capture only local features, which limits their performance. We propose a novel relevant context block (RCB), which employs graph convolutional networks to capture relevant context. In particular, for each query position (unit), it selects the k largest contributors, which carry meaningful and discriminative contextual information, as nodes, and updates the nodes by aggregating the differences between the query position and its contributors. This operation can be easily incorporated into existing networks and trained end-to-end with a standard backpropagation algorithm. To verify the effectiveness of RCB, we apply it to two trackers, SiamFC and GlobalTrack; the improved trackers are referred to as Siam-RCB and GlobalTrack-RCB. Extensive experiments on OTB, VOT, UAV123, LaSOT, TrackingNet, OxUvA, and VOT2018LT show the superiority of our method. For example, Siam-RCB outperforms SiamFC by a large margin (up to 11.2% in the success score and 7.8% in the precision score) on the OTB-100 benchmark.
KW - Local neighborhood graph
KW - long-range dependencies
KW - long-term tracking
KW - visual object tracking
UR - http://www.scopus.com/inward/record.url?scp=85097165544&partnerID=8YFLogxK
U2 - 10.1109/TMM.2020.3038310
DO - 10.1109/TMM.2020.3038310
M3 - Article
AN - SCOPUS:85097165544
SN - 1520-9210
VL - 23
SP - 4232
EP - 4244
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -
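
Note: the abstract above describes the core RCB operation: for every query position, keep the k positions that contribute most, then update the query by aggregating its feature differences from those contributors. The following is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation; every name, parameter, and shape here (RelevantContextSketch, k, the 1x1 projections, the residual update) is an assumption made for illustration only.

# Minimal sketch (not the paper's code) of a "relevant context" block:
# per query position, select the top-k most similar positions and
# aggregate the weighted feature differences back into the query.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelevantContextSketch(nn.Module):
    def __init__(self, channels: int, k: int = 8):
        super().__init__()
        self.k = k
        self.query_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.key_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.out_proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w
        q = self.query_proj(x).flatten(2).transpose(1, 2)      # (b, n, c)
        kf = self.key_proj(x).flatten(2).transpose(1, 2)        # (b, n, c)
        v = x.flatten(2).transpose(1, 2)                         # (b, n, c)

        # Affinity between every query position and every other position.
        affinity = torch.bmm(q, kf.transpose(1, 2))              # (b, n, n)

        # Keep only the k largest contributors per query position.
        topk_val, topk_idx = affinity.topk(self.k, dim=-1)       # (b, n, k)
        weights = F.softmax(topk_val, dim=-1)                     # (b, n, k)

        # Gather contributor features and aggregate their weighted
        # differences from each query feature.
        idx = topk_idx.unsqueeze(-1).expand(-1, -1, -1, c)        # (b, n, k, c)
        contributors = v.unsqueeze(1).expand(-1, n, -1, -1).gather(2, idx)
        diff = contributors - v.unsqueeze(2)                       # (b, n, k, c)
        context = (weights.unsqueeze(-1) * diff).sum(dim=2)       # (b, n, c)

        context = context.transpose(1, 2).reshape(b, c, h, w)
        return x + self.out_proj(context)                          # residual update

# Example usage on a dummy 256-channel feature map (shapes are illustrative).
block = RelevantContextSketch(channels=256, k=8)
out = block(torch.randn(2, 256, 25, 25))   # output shape: (2, 256, 25, 25)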