TY - GEN
T1 - Online visual tracking with high-order pooling
AU - Yan, Xiyu
AU - Ma, Bo
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/8/28
Y1 - 2017/8/28
N2 - Most local sparse representation models in visual tracking contain three components: 1) extracting local descriptors from the target region, 2) encoding the extracted local descriptors as mid-level features, and 3) aggregating statistics of the mid-level features into a signature. Because the last step aggregates only first-order statistics of the mid-level features, it is called First-order Pooling (FP). However, FP lacks high-order statistical information about the target, so it cannot reflect correlations between features, which leads to poor tracking performance. In this paper, we introduce an appearance model for visual tracking that conducts High-order Pooling (HP) over mid-level features within the sparse coding framework. We find that, compared with a first-order signature, higher-order statistics of mid-level features with additional image information can bring large tracking performance gains. Moreover, a simple but effective updating scheme is adopted to improve the tracker's adaptability. Experiments on various challenging videos show that the tracking performance of the appearance model using HP is superior to that of models using FP.
KW - High-order Pooling
KW - Mid-level features
KW - Object tracking
KW - Sparse coding
UR - http://www.scopus.com/inward/record.url?scp=85030214933&partnerID=8YFLogxK
U2 - 10.1109/ICME.2017.8019349
DO - 10.1109/ICME.2017.8019349
M3 - Conference contribution
AN - SCOPUS:85030214933
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 289
EP - 294
BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
PB - IEEE Computer Society
T2 - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Y2 - 10 July 2017 through 14 July 2017
ER -