Metric learning based structural appearance model for robust visual tracking

Yuwei Wu; Bo Ma; Min Yang; Jian Zhang; Yunde Jia

doi:10.1109/TCSVT.2013.2291283

Yuwei Wu, Bo Ma^*, Min Yang, Jian Zhang, Yunde Jia

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

52 Citations (Scopus)

Abstract

Appearance modeling is a key issue for the success of a visual tracker. Sparse representation based appearance modeling has received an increasing amount of interest in recent years. However, most of existing work utilizes reconstruction errors to compute the observation likelihood under the generative framework, which may give poor performance, especially for significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multiscale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting is able to alleviate the drift problem potentially caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structure object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate qualitatively and quantitatively that the proposed algorithm outperforms the state-of-the-art methods.
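The abstract's description of the structural representation (multiscale max pooling over weighted local sparse codes) can be illustrated with a minimal sketch. This is not the authors' implementation: the grid layout of patches, the per-patch weights, the choice of scales, and the names `max_pool` and `multiscale_max_pool` are all assumptions made for illustration, and the sparse codes are taken as given rather than computed.

```python
from typing import List, Sequence

Code = List[float]  # one local sparse code (hypothetical representation)


def max_pool(codes: Sequence[Code]) -> Code:
    """Element-wise maximum over a non-empty collection of code vectors."""
    return [max(values) for values in zip(*codes)]


def multiscale_max_pool(
    grid: Sequence[Sequence[Code]],
    weights: Sequence[Sequence[float]],
    scales: Sequence[int] = (1, 2),
) -> Code:
    """Pool weighted patch codes over s x s cells for each scale s.

    Assumption: patches lie on a rows x cols grid and each scale splits
    that grid into s x s non-overlapping cells (a pyramid-style layout).
    """
    rows, cols = len(grid), len(grid[0])
    # Scale each local code by its (hypothetical) per-patch weight.
    weighted = [
        [[weights[r][c] * v for v in grid[r][c]] for c in range(cols)]
        for r in range(rows)
    ]
    feature: Code = []
    for s in scales:
        if s > rows or s > cols:
            raise ValueError("scale exceeds grid size")
        for i in range(s):
            for j in range(s):
                r0, r1 = i * rows // s, (i + 1) * rows // s
                c0, c1 = j * cols // s, (j + 1) * cols // s
                cell = [
                    weighted[r][c]
                    for r in range(r0, r1)
                    for c in range(c0, c1)
                ]
                # Concatenate the pooled vector of every cell at every scale.
                feature.extend(max_pool(cell))
    return feature
```

With a 2 x 2 grid of 2-dimensional codes and scales (1, 2), the resulting feature concatenates one pooled vector for the whole grid and one per patch, giving 10 values. Concatenating across scales is what keeps coarse spatial layout in the descriptor, which is the sense in which the representation is "structural" in this sketch.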

Original language	English
Article number	6665059
Pages (from-to)	865-877
Number of pages	13
Journal	IEEE Transactions on Circuits and Systems for Video Technology
Volume	24
Issue number	5
DOI	https://doi.org/10.1109/TCSVT.2013.2291283
Publication status	Published - May 2014

Cite this

@article{057ecc2b221d4d5d80d93a9e9d4e08be,

title = "Metric learning based structural appearance model for robust visual tracking",

abstract = "Appearance modeling is a key issue for the success of a visual tracker. Sparse representation based appearance modeling has received an increasing amount of interest in recent years. However, most of existing work utilizes reconstruction errors to compute the observation likelihood under the generative framework, which may give poor performance, especially for significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multiscale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting is able to alleviate the drift problem potentially caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structure object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate qualitatively and quantitatively that the proposed algorithm outperforms the state-of-the-art methods.",

keywords = "Appearance modeling, multiple instance metric learning, multiscale max pooling, object tracking, sparse coding",

author = "Yuwei Wu and Bo Ma and Min Yang and Jian Zhang and Yunde Jia",

year = "2014",

month = may,

doi = "10.1109/TCSVT.2013.2291283",

language = "English",

volume = "24",

pages = "865--877",

journal = "IEEE Transactions on Circuits and Systems for Video Technology",

issn = "1051-8215",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "5",

}

TY - JOUR

T1 - Metric learning based structural appearance model for robust visual tracking

AU - Wu, Yuwei

AU - Ma, Bo

AU - Yang, Min

AU - Zhang, Jian

AU - Jia, Yunde

PY - 2014/5

Y1 - 2014/5

N2 - Appearance modeling is a key issue for the success of a visual tracker. Sparse representation based appearance modeling has received an increasing amount of interest in recent years. However, most of existing work utilizes reconstruction errors to compute the observation likelihood under the generative framework, which may give poor performance, especially for significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multiscale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting is able to alleviate the drift problem potentially caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structure object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate qualitatively and quantitatively that the proposed algorithm outperforms the state-of-the-art methods.

AB - Appearance modeling is a key issue for the success of a visual tracker. Sparse representation based appearance modeling has received an increasing amount of interest in recent years. However, most of existing work utilizes reconstruction errors to compute the observation likelihood under the generative framework, which may give poor performance, especially for significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multiscale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting is able to alleviate the drift problem potentially caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structure object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate qualitatively and quantitatively that the proposed algorithm outperforms the state-of-the-art methods.

KW - Appearance modeling

KW - multiple instance metric learning

KW - multiscale max pooling

KW - object tracking

KW - sparse coding

UR - http://www.scopus.com/inward/record.url?scp=84900529184&partnerID=8YFLogxK

U2 - 10.1109/TCSVT.2013.2291283

DO - 10.1109/TCSVT.2013.2291283

M3 - Article

AN - SCOPUS:84900529184

SN - 1051-8215

VL - 24

SP - 865

EP - 877

JO - IEEE Transactions on Circuits and Systems for Video Technology

JF - IEEE Transactions on Circuits and Systems for Video Technology

IS - 5

M1 - 6665059

ER -
