Action recognition with motion map 3D network

Yuchao Sun; Xinxiao Wu; Wennan Yu; Feiwu Yu

doi:10.1016/j.neucom.2018.02.028

Yuchao Sun, Xinxiao Wu*, Wennan Yu, Feiwu Yu

*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

Recently, deep neural networks have demonstrated remarkable progress in human action recognition in videos. However, most existing deep frameworks cannot handle variable-length videos properly, which degrades classification performance. In this paper, we propose a Motion Map 3D ConvNet (MM3D), which represents the content of a video of arbitrary length by a motion map. In our MM3D model, a novel generation network learns a motion map that represents a video clip by iteratively integrating the current video frame into the previous motion map. A discrimination network is also introduced to classify actions based on the learned motion map. Experiments on the UCF101 and HMDB51 datasets demonstrate the effectiveness of our method for human action recognition.
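The iterative integration described in the abstract can be pictured as a fold over the frames of a clip. The sketch below is only an illustration of that recurrence: the abstract does not describe the generation network's architecture, so the fixed blend in `integrate` (and the names `integrate`, `motion_map`, and `alpha`) are hypothetical stand-ins, not the paper's learned model. It shows that clips of different lengths reduce to a map of the same size.

```python
from typing import List, Sequence

Frame = List[List[float]]  # a grayscale frame as rows of pixel values


def integrate(prev_map: Frame, frame: Frame, alpha: float = 0.5) -> Frame:
    """Hypothetical stand-in for the learned generation network:
    blend the current frame into the previous motion map."""
    return [
        [(1.0 - alpha) * m + alpha * f for m, f in zip(map_row, frame_row)]
        for map_row, frame_row in zip(prev_map, frame)
    ]


def motion_map(frames: Sequence[Frame], alpha: float = 0.5) -> Frame:
    """Fold a clip of any length into one map with the shape of a frame."""
    if not frames:
        raise ValueError("clip must contain at least one frame")
    current = [row[:] for row in frames[0]]  # initialise from the first frame
    for frame in frames[1:]:
        current = integrate(current, frame, alpha)
    return current


# Two toy clips of different lengths (1x2-pixel frames, assumed values).
short_clip = [[[0.0, 1.0]], [[1.0, 1.0]]]
long_clip = short_clip + [[[1.0, 0.0]]] * 5
short_map = motion_map(short_clip)
long_map = motion_map(long_clip)
```

Both `short_map` and `long_map` have the shape of a single frame, which is the property that lets a fixed-input classifier, such as the discrimination network mentioned in the abstract, consume videos of arbitrary length.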

Original language	English
Pages (from-to)	33-39
Number of pages	7
Journal	Neurocomputing
Volume	297
DOIs	https://doi.org/10.1016/j.neucom.2018.02.028
Publication status	Published - 5 Jul 2018

Keywords

3D-CNN
Action recognition
Discriminative information
Video analysis

Cite this

Sun, Y., Wu, X., Yu, W., & Yu, F. (2018). Action recognition with motion map 3D network. Neurocomputing, 297, 33-39. https://doi.org/10.1016/j.neucom.2018.02.028

@article{177e9fd621f74aa0af2eb02c1f54a668,

title = "Action recognition with motion map 3D network",

abstract = "Recently, deep neural networks have demonstrated remarkable progress in human action recognition in videos. However, most existing deep frameworks cannot handle variable-length videos properly, which degrades classification performance. In this paper, we propose a Motion Map 3D ConvNet (MM3D), which represents the content of a video of arbitrary length by a motion map. In our MM3D model, a novel generation network learns a motion map that represents a video clip by iteratively integrating the current video frame into the previous motion map. A discrimination network is also introduced to classify actions based on the learned motion map. Experiments on the UCF101 and HMDB51 datasets demonstrate the effectiveness of our method for human action recognition.",

keywords = "3D-CNN, Action recognition, Discriminative information, Video analysis",

author = "Yuchao Sun and Xinxiao Wu and Wennan Yu and Feiwu Yu",

note = "Publisher Copyright: {\textcopyright} 2018 Elsevier B.V.",

year = "2018",

month = jul,

day = "5",

doi = "10.1016/j.neucom.2018.02.028",

language = "English",

volume = "297",

pages = "33--39",

journal = "Neurocomputing",

issn = "0925-2312",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Action recognition with motion map 3D network

AU - Sun, Yuchao

AU - Wu, Xinxiao

AU - Yu, Wennan

AU - Yu, Feiwu

PY - 2018/7/5

Y1 - 2018/7/5

AB - Recently, deep neural networks have demonstrated remarkable progress in human action recognition in videos. However, most existing deep frameworks cannot handle variable-length videos properly, which degrades classification performance. In this paper, we propose a Motion Map 3D ConvNet (MM3D), which represents the content of a video of arbitrary length by a motion map. In our MM3D model, a novel generation network learns a motion map that represents a video clip by iteratively integrating the current video frame into the previous motion map. A discrimination network is also introduced to classify actions based on the learned motion map. Experiments on the UCF101 and HMDB51 datasets demonstrate the effectiveness of our method for human action recognition.

KW - 3D-CNN

KW - Action recognition

KW - Discriminative information

KW - Video analysis

UR - http://www.scopus.com/inward/record.url?scp=85044099682&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2018.02.028

DO - 10.1016/j.neucom.2018.02.028

M3 - Article

AN - SCOPUS:85044099682

SN - 0925-2312

VL - 297

SP - 33

EP - 39

JO - Neurocomputing

JF - Neurocomputing

ER -
