A two-level attention-based interaction model for multi-person activity recognition

Lihua Lu; Huijun Di; Yao Lu; Lin Zhang; Shunzhou Wang

doi:10.1016/j.neucom.2018.09.060

Corresponding author: Yao Lu

School of Computer Science

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)

Abstract

Multi-person activity recognition is a challenging task due to its elusive interactions in activities. We take into account these interactions at two levels. At the individual level, each person behaves depending on both its spatio-temporal features and interactions propagated from others in the scene. At the scene level, the multi-person activity is characterized by interactions between individuals’ actions and the high-level activity. It is worth noting that interactions contribute unequally at both levels. To jointly explore these colorful interactions, we propose a two-level attention-based interaction model relying on two time-varying attention mechanisms. The individual-level attention mechanism conditioned on pose features, exploits various degrees of interactions among individuals in a scene while updating their states at each time step. The scene-level attention mechanism proposes an attention-based pooling strategy to explore various levels of interactions between individuals’ actions and the high-level activity. We ground our model by a modified two-stage Gated Recurrent Units (GRUs) network to handle the long-range temporal variability and consistency. Our end-to-end trainable model takes as inputs a set of person detections in videos or image sequences and predicts labels of multi-person activities. Experimental results demonstrate comparable performance of our model and show the effectiveness of our attention mechanisms.
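The abstract mentions an attention-based pooling strategy at the scene level but gives no formula. As a rough illustration only, the following is a generic softmax attention pooling over per-person state vectors. It is not the authors' formulation: the function name `attention_pool`, the dot-product scoring, and the `score_vector` parameter are all assumptions made for this sketch.

```python
import math


def attention_pool(person_states, score_vector):
    """Pool per-person state vectors into one scene vector.

    Hypothetical sketch: scores are plain dot products with `score_vector`,
    which stands in for whatever learned scoring the real model uses.
    """
    # Unnormalized relevance score for each person.
    scores = [
        sum(w * x for w, x in zip(score_vector, state))
        for state in person_states
    ]
    # Numerically stable softmax over people, so weights sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the state vectors.
    dim = len(person_states[0])
    pooled = [
        sum(wt * state[d] for wt, state in zip(weights, person_states))
        for d in range(dim)
    ]
    return pooled, weights


# Three people with 2-dimensional states; a zero score vector gives
# every person the same weight, which reduces to average pooling.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(states, [0.0, 0.0])
```

With a non-zero `score_vector`, people whose states align with it receive larger weights, which is one simple way to let individuals contribute unequally to the scene-level representation.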

Original language	English
Pages (from-to)	195-205
Number of pages	11
Journal	Neurocomputing
Volume	322
DOI	https://doi.org/10.1016/j.neucom.2018.09.060
Publication status	Published - 17 Dec 2018

Cite this

@article{509e8ef860684856a7d62c2c5dcdd4a0,
  title = "A two-level attention-based interaction model for multi-person activity recognition",
  abstract = "Multi-person activity recognition is a challenging task due to its elusive interactions in activities. We take into account these interactions at two levels. At the individual level, each person behaves depending on both its spatio-temporal features and interactions propagated from others in the scene. At the scene level, the multi-person activity is characterized by interactions between individuals{\textquoteright} actions and the high-level activity. It is worth noting that interactions contribute unequally at both levels. To jointly explore these colorful interactions, we propose a two-level attention-based interaction model relying on two time-varying attention mechanisms. The individual-level attention mechanism conditioned on pose features, exploits various degrees of interactions among individuals in a scene while updating their states at each time step. The scene-level attention mechanism proposes an attention-based pooling strategy to explore various levels of interactions between individuals{\textquoteright} actions and the high-level activity. We ground our model by a modified two-stage Gated Recurrent Units (GRUs) network to handle the long-range temporal variability and consistency. Our end-to-end trainable model takes as inputs a set of person detections in videos or image sequences and predicts labels of multi-person activities. Experimental results demonstrate comparable performance of our model and show the effectiveness of our attention mechanisms.",
  keywords = "Attention mechanism, Individual level, Multi-person activity recognition, Scene level",
  author = "Lihua Lu and Huijun Di and Yao Lu and Lin Zhang and Shunzhou Wang",
  note = "Publisher Copyright: {\textcopyright} 2018 Elsevier B.V.",
  year = "2018",
  month = dec,
  day = "17",
  doi = "10.1016/j.neucom.2018.09.060",
  language = "English",
  volume = "322",
  pages = "195--205",
  journal = "Neurocomputing",
  issn = "0925-2312",
  publisher = "Elsevier B.V.",
}

TY - JOUR

T1 - A two-level attention-based interaction model for multi-person activity recognition

AU - Lu, Lihua

AU - Di, Huijun

AU - Lu, Yao

AU - Zhang, Lin

AU - Wang, Shunzhou

PY - 2018/12/17

Y1 - 2018/12/17

N2 - Multi-person activity recognition is a challenging task due to its elusive interactions in activities. We take into account these interactions at two levels. At the individual level, each person behaves depending on both its spatio-temporal features and interactions propagated from others in the scene. At the scene level, the multi-person activity is characterized by interactions between individuals’ actions and the high-level activity. It is worth noting that interactions contribute unequally at both levels. To jointly explore these colorful interactions, we propose a two-level attention-based interaction model relying on two time-varying attention mechanisms. The individual-level attention mechanism conditioned on pose features, exploits various degrees of interactions among individuals in a scene while updating their states at each time step. The scene-level attention mechanism proposes an attention-based pooling strategy to explore various levels of interactions between individuals’ actions and the high-level activity. We ground our model by a modified two-stage Gated Recurrent Units (GRUs) network to handle the long-range temporal variability and consistency. Our end-to-end trainable model takes as inputs a set of person detections in videos or image sequences and predicts labels of multi-person activities. Experimental results demonstrate comparable performance of our model and show the effectiveness of our attention mechanisms.

AB - Multi-person activity recognition is a challenging task due to its elusive interactions in activities. We take into account these interactions at two levels. At the individual level, each person behaves depending on both its spatio-temporal features and interactions propagated from others in the scene. At the scene level, the multi-person activity is characterized by interactions between individuals’ actions and the high-level activity. It is worth noting that interactions contribute unequally at both levels. To jointly explore these colorful interactions, we propose a two-level attention-based interaction model relying on two time-varying attention mechanisms. The individual-level attention mechanism conditioned on pose features, exploits various degrees of interactions among individuals in a scene while updating their states at each time step. The scene-level attention mechanism proposes an attention-based pooling strategy to explore various levels of interactions between individuals’ actions and the high-level activity. We ground our model by a modified two-stage Gated Recurrent Units (GRUs) network to handle the long-range temporal variability and consistency. Our end-to-end trainable model takes as inputs a set of person detections in videos or image sequences and predicts labels of multi-person activities. Experimental results demonstrate comparable performance of our model and show the effectiveness of our attention mechanisms.

KW - Attention mechanism

KW - Individual level

KW - Multi-person activity recognition

KW - Scene level

UR - http://www.scopus.com/inward/record.url?scp=85054601020&partnerID=8YFLogxK

U2 - 10.1016/j.neucom.2018.09.060

DO - 10.1016/j.neucom.2018.09.060

M3 - Article

AN - SCOPUS:85054601020

SN - 0925-2312

VL - 322

SP - 195

EP - 205

JO - Neurocomputing

JF - Neurocomputing

ER -
