TY - JOUR
T1 - Alleviating Modality Bias Training for Infrared-Visible Person Re-Identification
AU - Huang, Yan
AU - Wu, Qiang
AU - Xu, Jingsong
AU - Zhong, Yi
AU - Zhang, Peng
AU - Zhang, Zhaoxiang
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2022
Y1 - 2022
AB - The task of infrared-visible person re-identification (IV-reID) is to recognize people across two modalities (i.e., RGB and IR). Existing cutting-edge approaches typically feed pairs of images sharing the same identity (i.e., ID-tied cross-modality image pairs) into an ImageNet-pretrained ResNet50 backbone, which learns shared features that tolerate the modality discrepancy between RGB and IR. This work unveils a Modality Bias Training (MBT) problem that has received little attention in IV-reID and demonstrates that MBT significantly compromises IV-reID performance. Because the ResNet50 backbone is pretrained on a large number of RGB images from ImageNet, IR information can be overwhelmed by RGB information during training. The trained models are therefore biased toward RGB information, which in turn compromises their cross-modality generalization ability. To tackle this issue, we present a Dual-level Learning Strategy (DLS) that 1) enforces the focus of the network on ID-exclusive (rather than ID-tied) labels of cross-modality image pairs to mitigate MBT and 2) introduces third-modality data that contain both RGB and IR information to further prevent IR information from being overwhelmed during training. The third-modality images are generated by a generative adversarial network, and a dynamic ID-exclusive Smooth (dIDeS) label is proposed for them. Comprehensive experiments demonstrate the success of DLS in tackling the MBT issue exposed in IV-reID.
KW - Cross modality
KW - modality bias training
KW - person re-identification
UR - http://www.scopus.com/inward/record.url?scp=85103277513&partnerID=8YFLogxK
U2 - 10.1109/TMM.2021.3067760
DO - 10.1109/TMM.2021.3067760
M3 - Article
AN - SCOPUS:85103277513
SN - 1520-9210
VL - 24
SP - 1570
EP - 1582
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -