TY - JOUR
T1 - Transferring Prior Thermal Knowledge for Snowy Urban Scene Semantic Segmentation
AU - Guo, Xiaodong
AU - Liu, Tong
AU - Mou, Yefeng
AU - Chai, Siyuan
AU - Ren, Bohan
AU - Wang, Yijin
AU - Shi, Wei
AU - Liu, Siyuan
AU - Zhou, Wujie
N1 - Publisher Copyright: © 2000-2011 IEEE.
PY - 2025
Y1 - 2025
AB - RGB-thermal (RGB-T) semantic segmentation enables intelligent vehicles to understand their environment while operating in urban scenes. However, research in this area faces two main challenges: 1) the scarcity of training samples under snowy conditions and 2) the difficulty of applying the model in practice. To address the first challenge, we present a publicly accessible RGB-T semantic segmentation dataset of snowy urban scenes (the SUS dataset). The SUS dataset comprises 1035 pairs of precisely registered RGB-T images, with pixel-level semantic annotations covering five categories for every image. To tackle the second challenge, we introduce MCNet-S∗, a novel semantic segmentation model that leverages knowledge distillation (KD). The KD structure consists of an RGB-T teacher model, named MCNet-T, and an RGB student model, named MCNet-S. Within MCNet-T, we propose a cross-modal dual association (CDA) module to enhance the utilization of RGB-T information in snowy urban scenes. Within MCNet-S, we propose a depth-wise separable pyramid (DSP) module to improve the efficiency of RGB information utilization and to align the feature dimensions with those of MCNet-T. Between MCNet-S and MCNet-T, we propose memory-based contrastive learning distillation (MCLD) to transfer the prior thermal knowledge, which improves the segmentation accuracy of MCNet-S and yields the optimized MCNet-S∗. Extensive experiments on the SUS and MFNet datasets show that the proposed models outperform state-of-the-art models. The SUS dataset and code are available at https://github.com/xiaodonguo/SUS_dataset.
KW - RGB-thermal
KW - knowledge distillation
KW - semantic segmentation
KW - snowy urban scenes
UR - http://www.scopus.com/inward/record.url?scp=105002685437&partnerID=8YFLogxK
U2 - 10.1109/TITS.2025.3555617
DO - 10.1109/TITS.2025.3555617
M3 - Article
AN - SCOPUS:105002685437
SN - 1524-9050
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
ER -