Dual-Semantic Consistency Learning for Visible-Infrared Person Re-Identification

Yiyuan Zhang; Yuhao Kang; Sanyuan Zhao; Jianbing Shen

doi:10.1109/TIFS.2022.3224853

Yiyuan Zhang, Yuhao Kang, Sanyuan Zhao^*, Jianbing Shen

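^*Corresponding author for this work

School of Computer Science

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract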

Visible-Infrared person Re-Identification (VI-ReID) conducts comprehensive identity analysis on non-overlapping visible and infrared camera sets for intelligent surveillance systems, which face huge instance variations derived from modality discrepancy. Existing methods employ different kinds of network structure to extract modality-invariant features. Differently, we propose a novel framework, named Dual-Semantic Consistency Learning Network (DSCNet), which attributes modality discrepancy to channel-level semantic inconsistency. DSCNet optimizes channel consistency from two aspects, fine-grained inter-channel semantics, and comprehensive inter-modality semantics. Furthermore, we propose Joint Semantics Metric Learning to simultaneously optimize the distribution of the channel-and-modality feature embeddings. It jointly exploits the correlation between channel-specific and modality-specific semantics in a fine-grained manner. We conduct a series of experiments on the SYSU-MM01 and RegDB datasets, which validates that DSCNet delivers superiority compared with current state-of-the-art methods. On the more challenging SYSU-MM01 dataset, our network can achieve 73.89% Rank-1 accuracy and 69.47% mAP value. Our code is available at https://github.com/bitreidgroup/DSCNet.

Original language	English
Pages (from-to)	1554-1565
Number of pages	12
Journal	IEEE Transactions on Information Forensics and Security
Volume	18
DOI	https://doi.org/10.1109/TIFS.2022.3224853
Publication status	Published - 2023

Cite this

@article{4d6bac8d7eca4777a4dd38add8a9fd2e,

title = "Dual-Semantic Consistency Learning for Visible-Infrared Person Re-Identification",

abstract = "Visible-Infrared person Re-Identification (VI-ReID) conducts comprehensive identity analysis on non-overlapping visible and infrared camera sets for intelligent surveillance systems, which face huge instance variations derived from modality discrepancy. Existing methods employ different kinds of network structure to extract modality-invariant features. Differently, we propose a novel framework, named Dual-Semantic Consistency Learning Network (DSCNet), which attributes modality discrepancy to channel-level semantic inconsistency. DSCNet optimizes channel consistency from two aspects, fine-grained inter-channel semantics, and comprehensive inter-modality semantics. Furthermore, we propose Joint Semantics Metric Learning to simultaneously optimize the distribution of the channel-and-modality feature embeddings. It jointly exploits the correlation between channel-specific and modality-specific semantics in a fine-grained manner. We conduct a series of experiments on the SYSU-MM01 and RegDB datasets, which validates that DSCNet delivers superiority compared with current state-of-the-art methods. On the more challenging SYSU-MM01 dataset, our network can achieve 73.89\% Rank-1 accuracy and 69.47\% mAP value. Our code is available at https://github.com/bitreidgroup/DSCNet.",

keywords = "Visible-infrared person re-identification, person re-identification, semantic consistency",

author = "Yiyuan Zhang and Yuhao Kang and Sanyuan Zhao and Jianbing Shen",

note = "Publisher Copyright: {\textcopyright} 2005-2012 IEEE.",

year = "2023",

doi = "10.1109/TIFS.2022.3224853",

language = "English",

volume = "18",

pages = "1554--1565",

journal = "IEEE Transactions on Information Forensics and Security",

issn = "1556-6013",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Dual-Semantic Consistency Learning for Visible-Infrared Person Re-Identification

AU - Zhang, Yiyuan

AU - Kang, Yuhao

AU - Zhao, Sanyuan

AU - Shen, Jianbing

PY - 2023

Y1 - 2023

AB - Visible-Infrared person Re-Identification (VI-ReID) conducts comprehensive identity analysis on non-overlapping visible and infrared camera sets for intelligent surveillance systems, which face huge instance variations derived from modality discrepancy. Existing methods employ different kinds of network structure to extract modality-invariant features. Differently, we propose a novel framework, named Dual-Semantic Consistency Learning Network (DSCNet), which attributes modality discrepancy to channel-level semantic inconsistency. DSCNet optimizes channel consistency from two aspects, fine-grained inter-channel semantics, and comprehensive inter-modality semantics. Furthermore, we propose Joint Semantics Metric Learning to simultaneously optimize the distribution of the channel-and-modality feature embeddings. It jointly exploits the correlation between channel-specific and modality-specific semantics in a fine-grained manner. We conduct a series of experiments on the SYSU-MM01 and RegDB datasets, which validates that DSCNet delivers superiority compared with current state-of-the-art methods. On the more challenging SYSU-MM01 dataset, our network can achieve 73.89% Rank-1 accuracy and 69.47% mAP value. Our code is available at https://github.com/bitreidgroup/DSCNet.

KW - Visible-infrared person re-identification

KW - person re-identification

KW - semantic consistency

UR - http://www.scopus.com/inward/record.url?scp=85144047797&partnerID=8YFLogxK

U2 - 10.1109/TIFS.2022.3224853

DO - 10.1109/TIFS.2022.3224853

M3 - Article

AN - SCOPUS:85144047797

SN - 1556-6013

VL - 18

SP - 1554

EP - 1565

JO - IEEE Transactions on Information Forensics and Security

JF - IEEE Transactions on Information Forensics and Security

ER -
