Video person re-identification with global statistic pooling and self-attention distillation

Gaojie Lin, Sanyuan Zhao*, Jianbing Shen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

Most existing methods for video person re-identification apply spatial-temporal global average or attention pooling to aggregate frame-level features into a video-level feature. The resulting video-level feature models only the first-order statistics of the appearance features of the whole video, which limits the representation capability of the feature network. In this paper, we propose a novel Global Statistic Pooling network (GSPnet) that takes full advantage of second-order information to enhance modeling capability. First, a novel global statistic pooling module is proposed to summarize both the first- and second-order statistics across frame-level features and transform them into a compact and robust video-level feature embedding. Second, a statistic-based attention block is incorporated into multiple stages of the convolutional network to fully explore second-order representations from low- to high-level features. To enhance representation learning and further boost re-identification (re-ID) performance, we also propose a multi-level self-attention distillation training scheme, which distills the knowledge learned in the deeper portions of the network into the shallower ones. Extensive experimental results demonstrate the effectiveness and superiority of our approach on four popular video person re-ID datasets.
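The core idea of statistic pooling — summarizing a set of frame-level features with both their mean (first-order) and covariance (second-order) statistics — can be illustrated with a minimal numpy sketch. This is an assumption-laden simplification: the paper's actual module includes learned components not reproduced here, and the function name and compaction scheme (upper-triangular covariance) are illustrative choices, not the authors' implementation.

```python
import numpy as np

def global_statistic_pooling(frame_feats):
    """Aggregate frame-level features of shape (T, D) into a single
    video-level vector by concatenating first-order (mean) and
    second-order (covariance) statistics. Simplified sketch only."""
    mu = frame_feats.mean(axis=0)                        # first-order statistic: (D,)
    centered = frame_feats - mu
    cov = centered.T @ centered / frame_feats.shape[0]   # second-order statistic: (D, D)
    # The covariance matrix is symmetric, so keeping its upper triangle
    # yields a compact D*(D+1)/2 vector without losing information.
    iu = np.triu_indices(cov.shape[0])
    return np.concatenate([mu, cov[iu]])

# Example: 8 frames with 4-dim features -> 4 + 10 = 14-dim embedding
video_feat = global_statistic_pooling(np.random.randn(8, 4))
print(video_feat.shape)  # (14,)
```

In practice the frame features would come from a CNN backbone, and the pooled statistics would feed a projection layer to produce the final embedding.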

Original language: English
Pages (from-to): 777-789
Number of pages: 13
Journal: Neurocomputing
Volume: 453
DOI
Publication status: Published - 17 Sep 2021
