TY - JOUR
T1 - Temporal dynamic appearance modeling for online multi-person tracking
AU - Yang, Min
AU - Jia, Yunde
N1 - Publisher Copyright:
© 2016 Elsevier Inc.
PY - 2016/12/1
Y1 - 2016/12/1
N2 - Robust online multi-person tracking requires correctly associating online detection responses with existing trajectories. We address this problem by developing a novel appearance modeling approach that provides accurate appearance affinities to guide data association. In contrast to most existing algorithms, which consider only the spatial structure of human appearances, we exploit the temporal dynamic characteristics within temporal appearance sequences to discriminate between different persons. The temporal dynamics effectively complement the spatial structure of varying appearances in the feature space, significantly improving the affinity measurement between trajectories and detections. We propose a feature selection algorithm that describes the appearance variations with mid-level semantic features, and demonstrate its usefulness for temporal dynamic appearance modeling. Moreover, the appearance model is learned incrementally by alternately evaluating newly observed appearances and adjusting the model parameters, making it suitable for online tracking. Reliable tracking of multiple persons in complex scenes is achieved by incorporating the learned model into an online tracking-by-detection framework. Our experiments on the challenging MOTChallenge 2015 benchmark [L. Leal-Taixé, A. Milan, I. Reid, S. Roth, K. Schindler, MOTChallenge 2015: Towards a benchmark for multi-target tracking, arXiv preprint arXiv:1504.01942.] demonstrate that our method outperforms state-of-the-art multi-person tracking algorithms.
AB - Robust online multi-person tracking requires correctly associating online detection responses with existing trajectories. We address this problem by developing a novel appearance modeling approach that provides accurate appearance affinities to guide data association. In contrast to most existing algorithms, which consider only the spatial structure of human appearances, we exploit the temporal dynamic characteristics within temporal appearance sequences to discriminate between different persons. The temporal dynamics effectively complement the spatial structure of varying appearances in the feature space, significantly improving the affinity measurement between trajectories and detections. We propose a feature selection algorithm that describes the appearance variations with mid-level semantic features, and demonstrate its usefulness for temporal dynamic appearance modeling. Moreover, the appearance model is learned incrementally by alternately evaluating newly observed appearances and adjusting the model parameters, making it suitable for online tracking. Reliable tracking of multiple persons in complex scenes is achieved by incorporating the learned model into an online tracking-by-detection framework. Our experiments on the challenging MOTChallenge 2015 benchmark [L. Leal-Taixé, A. Milan, I. Reid, S. Roth, K. Schindler, MOTChallenge 2015: Towards a benchmark for multi-target tracking, arXiv preprint arXiv:1504.01942.] demonstrate that our method outperforms state-of-the-art multi-person tracking algorithms.
KW - Appearance modeling
KW - Feature selection
KW - Incremental learning
KW - Online multi-person tracking
KW - Temporal dynamic
UR - http://www.scopus.com/inward/record.url?scp=84975451483&partnerID=8YFLogxK
U2 - 10.1016/j.cviu.2016.05.003
DO - 10.1016/j.cviu.2016.05.003
M3 - Article
AN - SCOPUS:84975451483
SN - 1077-3142
VL - 153
SP - 16
EP - 28
JO - Computer Vision and Image Understanding
JF - Computer Vision and Image Understanding
ER -