Recognizing actions in images by fusing multiple body structure cues

Yang Li, Kan Li*, Xinxin Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

17 citations (Scopus)

Abstract

Although Convolutional Neural Networks (CNNs) have brought substantial improvements to many computer vision tasks, there remains room for improvement in image-based action recognition because of their limited capability to exploit body structure information. In this work, we propose a unified deep model that explicitly explores body structure information and fuses multiple body structure cues for robust action recognition in images. To fully explore the body structure information, we design the Body Structure Exploration sub-network. It generates two novel body structure cues, Structural Body Parts and Limb Angle Descriptor, which capture the structure information of human bodies from the global and local perspectives, respectively. We then design the Action Classification sub-network to fuse the predictions from the multiple body structure cues and obtain precise results. Moreover, we integrate the two sub-networks into a unified model by sharing the bottom convolutional layers, which improves computational efficiency in both the training and testing stages. We comprehensively evaluate our network on two challenging image-based human action datasets, Pascal VOC 2012 Action and Stanford40. Our approach achieves 93.5% and 93.8% mAP on them, respectively, outperforming all recent approaches in this field.
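The abstract does not specify how the Limb Angle Descriptor is computed or how the Action Classification sub-network fuses its predictions. As a rough illustration only, the sketch below assumes that a limb angle is the angle at a joint between two adjacent 2D keypoints, and that fusion is a weighted average of per-cue softmax probabilities. The function names `limb_angle`, `softmax`, and `fuse_predictions` are hypothetical and are not taken from the paper.

```python
import math


def limb_angle(joint_a, joint_b, joint_c):
    """Angle in degrees at joint_b between segments b->a and b->c.

    Assumption: joints are 2D (x, y) keypoints; the paper's actual
    Limb Angle Descriptor may be defined differently.
    """
    v1 = (joint_a[0] - joint_b[0], joint_a[1] - joint_b[1])
    v2 = (joint_c[0] - joint_b[0], joint_c[1] - joint_b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # degenerate limb: two keypoints coincide
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    cos = max(-1.0, min(1.0, cos))  # guard against rounding error
    return math.degrees(math.acos(cos))


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(cue_scores, weights=None):
    """Weighted average of per-cue class probabilities.

    Assumption: simple late fusion; the paper's fusion scheme is not
    described in the abstract and may be learned end to end.
    """
    if weights is None:
        weights = [1.0] * len(cue_scores)
    if len(weights) != len(cue_scores):
        raise ValueError("one weight per cue is required")
    total_weight = sum(weights)
    probs = [softmax(scores) for scores in cue_scores]
    num_classes = len(probs[0])
    return [
        sum(w * p[k] for w, p in zip(weights, probs)) / total_weight
        for k in range(num_classes)
    ]
```

For example, `limb_angle(shoulder, elbow, wrist)` would give the elbow angle, and `fuse_predictions([scores_parts, scores_angles])` would combine the class scores from two cues into a single probability vector whose largest entry is the predicted action.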

Original language: English
Article number: 107341
Journal: Pattern Recognition
Volume: 104
DOI
Publication status: Published - Aug 2020
