TY - JOUR
T1 - AA-RGTCN: reciprocal global temporal convolution network with adaptive alignment for video-based person re-identification
AU - Zhang, Yanjun
AU - Lin, Yanru
AU - Yang, Xu
N1 - Publisher Copyright:
Copyright © 2024 Zhang, Lin and Yang.
PY - 2024
Y1 - 2024
AB - Person re-identification (Re-ID) aims to retrieve the same pedestrian across different cameras. Compared with image-based Re-ID, video-based Re-ID extracts features from video sequences, which contain both spatial and temporal information. Existing methods usually focus on the most salient image regions, which leads to redundant spatial descriptions and insufficient temporal descriptions. Other methods that take temporal cues into account usually ignore misalignment between frames and consider only a fixed length of a given sequence. In this study, we propose a Reciprocal Global Temporal Convolution Network with Adaptive Alignment (AA-RGTCN). The structure addresses inter-frame misalignment and models discriminative temporal representations. Specifically, the Adaptive Alignment block shifts each frame adaptively to its best position for temporal modeling. We then propose the Reciprocal Global Temporal Convolution Network to model robust temporal features across different time intervals along both normal and inverted time order. Experimental results show that AA-RGTCN achieves 85.9% mAP and 91.0% Rank-1 on MARS, 90.6% Rank-1 on iLIDS-VID, and 96.6% Rank-1 on PRID-2011, outperforming other state-of-the-art approaches.
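N1 - Method sketch: a minimal PyTorch sketch of the reciprocal temporal convolution idea described in the abstract, convolving frame-level features along both normal and inverted time order and fusing the results. The module name, channel sizes, and average fusion are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ReciprocalTemporalConv(nn.Module):
    # Sketch of the reciprocal idea: convolve frame features along the
    # normal time order and along the inverted order, then fuse.
    # Kernel size and fusion weights are assumptions, not the paper's settings.
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        self.fwd = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.bwd = nn.Conv1d(channels, channels, kernel_size, padding=pad)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) frame-level features
        y_fwd = self.fwd(x)
        # Flip the time axis, convolve, flip back so outputs stay time-aligned.
        y_bwd = self.bwd(x.flip(-1)).flip(-1)
        return 0.5 * (y_fwd + y_bwd)

# Example: 8 frames of 2048-dim features for a batch of 4 tracklets.
feats = torch.randn(4, 2048, 8)
out = ReciprocalTemporalConv(2048)(feats)  # shape preserved: (4, 2048, 8)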
KW - convolutional neural network
KW - frame alignment
KW - image recognition
KW - temporal modeling
KW - video person re-identification
UR - http://www.scopus.com/inward/record.url?scp=85189492043&partnerID=8YFLogxK
DO - 10.3389/fnins.2024.1329884
M3 - Article
AN - SCOPUS:85189492043
SN - 1662-4548
VL - 18
JO - Frontiers in Neuroscience
JF - Frontiers in Neuroscience
M1 - 1329884
ER -