TY - JOUR
T1 - Dual-Semantic Consistency Learning for Visible-Infrared Person Re-Identification
AU - Zhang, Yiyuan
AU - Kang, Yuhao
AU - Zhao, Sanyuan
AU - Shen, Jianbing
N1 - Publisher Copyright: © 2005-2012 IEEE.
PY - 2023
Y1 - 2023
AB - Visible-Infrared person Re-Identification (VI-ReID) conducts comprehensive identity analysis across non-overlapping visible and infrared camera sets for intelligent surveillance systems, which face large instance variations caused by modality discrepancy. Existing methods employ various network structures to extract modality-invariant features. In contrast, we propose a novel framework, the Dual-Semantic Consistency Learning Network (DSCNet), which attributes modality discrepancy to channel-level semantic inconsistency. DSCNet optimizes channel consistency from two aspects: fine-grained inter-channel semantics and comprehensive inter-modality semantics. Furthermore, we propose Joint Semantics Metric Learning to simultaneously optimize the distribution of the channel-and-modality feature embeddings; it jointly exploits the correlation between channel-specific and modality-specific semantics in a fine-grained manner. Experiments on the SYSU-MM01 and RegDB datasets show that DSCNet outperforms current state-of-the-art methods. On the more challenging SYSU-MM01 dataset, our network achieves 73.89% Rank-1 accuracy and 69.47% mAP. Our code is available at https://github.com/bitreidgroup/DSCNet.
KW - Visible-infrared person re-identification
KW - person re-identification
KW - semantic consistency
UR - http://www.scopus.com/inward/record.url?scp=85144047797&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2022.3224853
DO - 10.1109/TIFS.2022.3224853
M3 - Article
AN - SCOPUS:85144047797
SN - 1556-6013
VL - 18
SP - 1554
EP - 1565
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -