View-invariant action recognition using latent kernelized structural SVM

Xinxiao Wu; Yunde Jia

doi:10.1007/978-3-642-33715-4_30

Xinxiao Wu^*, Yunde Jia

^*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Citations (Scopus)

Abstract

This paper goes beyond recognizing human actions from a fixed view and focuses on action recognition from an arbitrary view. A novel learning algorithm, called latent kernelized structural SVM, is proposed for view-invariant action recognition; it extends the kernelized structural SVM framework to include latent variables. Because the camera position changes and is frequently unknown, we regard the view label of an action as a latent variable and implicitly infer it during both learning and inference. Motivated by the geometric correlation between different views and the semantic correlation between different action classes, we additionally propose a mid-level correlation feature, which describes an action video by a set of decision values from the pre-learned classifiers of all the action classes from all the views. Each decision value captures both geometric and semantic correlations between the action video and the corresponding action class from the corresponding view. We then combine the low-level visual cue, mid-level correlation description, and high-level label information into a novel nonlinear kernel under the latent kernelized structural SVM framework. Extensive experiments on the multi-view IXMAS and MuHAVi action datasets demonstrate that our method generally achieves higher recognition accuracy than other state-of-the-art methods.
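The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it describes, under stated assumptions: the mid-level correlation feature is taken to be the stacked decision values of one classifier per (action, view) pair, and the view label is treated as latent by maximizing a kernel-expansion score jointly over action and view. The linear form of the pre-learned classifiers, the RBF kernels, the product form of the joint kernel, and all names (`correlation_feature`, `joint_kernel`, `predict`, `alphas`, `support`) are hypothetical choices for illustration, not the authors' formulation.

```python
import numpy as np


def correlation_feature(x, W, b):
    """Stack the decision values of all (action, view) classifiers for descriptor x.

    W: (A, V, D) weights of hypothetical pre-learned linear classifiers
    b: (A, V) biases
    Returns a vector of length A * V.
    """
    return (W @ x + b).ravel()


def rbf(u, v, gamma):
    """Gaussian RBF kernel between two vectors."""
    d = u - v
    return np.exp(-gamma * (d @ d))


def joint_kernel(x1, c1, y1, h1, x2, c2, y2, h2, gamma_low=1.0, gamma_mid=1.0):
    """Assumed joint kernel: low-level RBF * mid-level RBF * label agreement."""
    same_labels = float(y1 == y2 and h1 == h2)
    return rbf(x1, x2, gamma_low) * rbf(c1, c2, gamma_mid) * same_labels


def predict(x, W, b, support, alphas, n_actions, n_views):
    """Return the (action, view) pair with the highest kernel-expansion score.

    support: list of (x_i, c_i, y_i, h_i) tuples; alphas: their coefficients.
    The view h is latent: it is maximized over rather than given as input.
    """
    c = correlation_feature(x, W, b)
    best, best_score = None, -np.inf
    for y in range(n_actions):
        for h in range(n_views):
            score = sum(
                a * joint_kernel(xs, cs, ys, hs, x, c, y, h)
                for a, (xs, cs, ys, hs) in zip(alphas, support)
            )
            if score > best_score:
                best, best_score = (y, h), score
    return best


# Toy usage with random data: 2 actions, 3 views, 4-dimensional descriptors.
rng = np.random.default_rng(0)
A, V, D = 2, 3, 4
W, b = rng.normal(size=(A, V, D)), rng.normal(size=(A, V))
x0, x1 = rng.normal(size=D), rng.normal(size=D)
support = [
    (x0, correlation_feature(x0, W, b), 0, 1),
    (x1, correlation_feature(x1, W, b), 1, 2),
]
alphas = [1.0, 1.0]
action, view = predict(x0, W, b, support, alphas, A, V)
```

In this sketch only the predicted action would be reported at test time; the view index is a by-product of the joint maximization. Learning the coefficients, which in a latent structural SVM alternates between inferring latent views and solving a structured max-margin problem, is not shown.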

Original language	English
Title of host publication	Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings
Pages	411-424
Number of pages	14
Edition	PART 5
DOIs	https://doi.org/10.1007/978-3-642-33715-4_30
Publication status	Published - 2012
Event	12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy; Duration: 7 Oct 2012 → 13 Oct 2012

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number	PART 5
Volume	7576 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	12th European Conference on Computer Vision, ECCV 2012
Country/Territory	Italy
City	Florence
Period	7/10/12 → 13/10/12

Keywords

View-invariant action recognition
correlation feature
latent kernelized structural SVM
multiple level features

Access to Document

10.1007/978-3-642-33715-4_30

Cite this

Wu, X., & Jia, Y. (2012). View-invariant action recognition using latent kernelized structural SVM. In Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings (PART 5 ed., pp. 411-424). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7576 LNCS, No. PART 5). https://doi.org/10.1007/978-3-642-33715-4_30

@inproceedings{d61f38016a964add871fc9b2c54d5e19,

title = "View-invariant action recognition using latent kernelized structural SVM",

abstract = "This paper goes beyond recognizing human actions from a fixed view and focuses on action recognition from an arbitrary view. A novel learning algorithm, called latent kernelized structural SVM, is proposed for the view-invariant action recognition, which extends the kernelized structural SVM framework to include latent variables. Due to the changing and frequently unknown positions of the camera, we regard the view label of action as a latent variable and implicitly infer it during both learning and inference. Motivated by the geometric correlation between different views and semantic correlation between different action classes, we additionally propose a mid-level correlation feature which describes an action video by a set of decision values from the pre-learned classifiers of all the action classes from all the views. Each decision value captures both geometric and semantic correlations between the action video and the corresponding action class from the corresponding view. After that, we combine the low-level visual cue, mid-level correlation description, and high-level label information into a novel nonlinear kernel under the latent kernelized structural SVM framework. Extensive experiments on multi-view IXMAS and MuHAVi action datasets demonstrate that our method generally achieves higher recognition accuracy than other state-of-the-art methods.",

keywords = "View-invariant action recognition, correlation feature, latent kernelized structural SVM, multiple level features",

author = "Xinxiao Wu and Yunde Jia",

year = "2012",

doi = "10.1007/978-3-642-33715-4_30",

language = "English",

isbn = "9783642337147",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

number = "PART 5",

pages = "411--424",

booktitle = "Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings",

edition = "PART 5",

note = "12th European Conference on Computer Vision, ECCV 2012 ; Conference date: 07-10-2012 Through 13-10-2012",

}

Wu, X & Jia, Y 2012, View-invariant action recognition using latent kernelized structural SVM. in Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. PART 5 edn, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 5, vol. 7576 LNCS, pp. 411-424, 12th European Conference on Computer Vision, ECCV 2012, Florence, Italy, 7/10/12. https://doi.org/10.1007/978-3-642-33715-4_30

View-invariant action recognition using latent kernelized structural SVM. / Wu, Xinxiao; Jia, Yunde.
Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. PART 5. ed. 2012. p. 411-424 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7576 LNCS, No. PART 5).

TY - GEN

T1 - View-invariant action recognition using latent kernelized structural SVM

AU - Wu, Xinxiao

AU - Jia, Yunde

PY - 2012

Y1 - 2012

N2 - This paper goes beyond recognizing human actions from a fixed view and focuses on action recognition from an arbitrary view. A novel learning algorithm, called latent kernelized structural SVM, is proposed for the view-invariant action recognition, which extends the kernelized structural SVM framework to include latent variables. Due to the changing and frequently unknown positions of the camera, we regard the view label of action as a latent variable and implicitly infer it during both learning and inference. Motivated by the geometric correlation between different views and semantic correlation between different action classes, we additionally propose a mid-level correlation feature which describes an action video by a set of decision values from the pre-learned classifiers of all the action classes from all the views. Each decision value captures both geometric and semantic correlations between the action video and the corresponding action class from the corresponding view. After that, we combine the low-level visual cue, mid-level correlation description, and high-level label information into a novel nonlinear kernel under the latent kernelized structural SVM framework. Extensive experiments on multi-view IXMAS and MuHAVi action datasets demonstrate that our method generally achieves higher recognition accuracy than other state-of-the-art methods.

AB - This paper goes beyond recognizing human actions from a fixed view and focuses on action recognition from an arbitrary view. A novel learning algorithm, called latent kernelized structural SVM, is proposed for the view-invariant action recognition, which extends the kernelized structural SVM framework to include latent variables. Due to the changing and frequently unknown positions of the camera, we regard the view label of action as a latent variable and implicitly infer it during both learning and inference. Motivated by the geometric correlation between different views and semantic correlation between different action classes, we additionally propose a mid-level correlation feature which describes an action video by a set of decision values from the pre-learned classifiers of all the action classes from all the views. Each decision value captures both geometric and semantic correlations between the action video and the corresponding action class from the corresponding view. After that, we combine the low-level visual cue, mid-level correlation description, and high-level label information into a novel nonlinear kernel under the latent kernelized structural SVM framework. Extensive experiments on multi-view IXMAS and MuHAVi action datasets demonstrate that our method generally achieves higher recognition accuracy than other state-of-the-art methods.

KW - View-invariant action recognition

KW - correlation feature

KW - latent kernelized structural SVM

KW - multiple level features

UR - http://www.scopus.com/inward/record.url?scp=84867882430&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-33715-4_30

DO - 10.1007/978-3-642-33715-4_30

M3 - Conference contribution

AN - SCOPUS:84867882430

SN - 9783642337147

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 411

EP - 424

BT - Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings

T2 - 12th European Conference on Computer Vision, ECCV 2012

Y2 - 7 October 2012 through 13 October 2012

ER -

Wu X, Jia Y. View-invariant action recognition using latent kernelized structural SVM. In Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. PART 5 ed. 2012. p. 411-424. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 5). doi: 10.1007/978-3-642-33715-4_30
