TY - JOUR
T1 - Multilevel attention imitation knowledge distillation for RGB-thermal transmission line detection
AU - Guo, Xiaodong
AU - Zhou, Wujie
AU - Liu, Tong
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2025/1/15
Y1 - 2025/1/15
N2 - Transmission line detection (TLD) plays a crucial role in ensuring the safety and stability of electricity supply. Applying RGB-thermal convolutional neural networks (CNNs) to unmanned aerial vehicle (UAV) photography is a valuable alternative for diagnosing transmission line faults. However, existing CNNs struggle to generalize to TLD due to cluttered backgrounds and variable weather conditions. In addition, the limited computational resources and storage space of UAVs pose challenges for the lightweight design of models. In the present study, we developed a novel multilevel attention imitation knowledge distillation (KD) structure comprising a high-performing teacher model called MAINet-T and a compact student model called MAINet-S. We aimed to (1) improve the accuracy and robustness of TLD and (2) optimize the performance and capacity of the model for deployment on UAVs. MAINet-T has a three-stage feature aggregation module and a detailed enhancement module to facilitate the processes of multi-modal and multilevel feature complement and interaction. To balance model performance and capacity for deployment, we proposed a novel KD strategy, including response distillation and feature distillation, to obtain an optimized model called MAINet-S*. Within feature distillation, we proposed a multilevel attention imitation module to integrate the advantages of the attention maps in different stages of the encoder. In experiments based on the VITLD dataset, MAINet-S* outperformed 15 state-of-the-art methods, with a 66.2% reduction in the number of weight parameters (Params) and a 69.9% reduction in floating-point operations (FLOPs) compared with MAINet-T.
AB - Transmission line detection (TLD) plays a crucial role in ensuring the safety and stability of electricity supply. Applying RGB-thermal convolutional neural networks (CNNs) to unmanned aerial vehicle (UAV) photography is a valuable alternative for diagnosing transmission line faults. However, existing CNNs struggle to generalize to TLD due to cluttered backgrounds and variable weather conditions. In addition, the limited computational resources and storage space of UAVs pose challenges for the lightweight design of models. In the present study, we developed a novel multilevel attention imitation knowledge distillation (KD) structure comprising a high-performing teacher model called MAINet-T and a compact student model called MAINet-S. We aimed to (1) improve the accuracy and robustness of TLD and (2) optimize the performance and capacity of the model for deployment on UAVs. MAINet-T has a three-stage feature aggregation module and a detailed enhancement module to facilitate the processes of multi-modal and multilevel feature complement and interaction. To balance model performance and capacity for deployment, we proposed a novel KD strategy, including response distillation and feature distillation, to obtain an optimized model called MAINet-S*. Within feature distillation, we proposed a multilevel attention imitation module to integrate the advantages of the attention maps in different stages of the encoder. In experiments based on the VITLD dataset, MAINet-S* outperformed 15 state-of-the-art methods, with a 66.2% reduction in the number of weight parameters (Params) and a 69.9% reduction in floating-point operations (FLOPs) compared with MAINet-T.
KW - Convolutional neural networks
KW - Knowledge distillation
KW - Multi-modal
KW - Transmission line detection
UR - http://www.scopus.com/inward/record.url?scp=85204954861&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.125406
DO - 10.1016/j.eswa.2024.125406
M3 - Article
AN - SCOPUS:85204954861
SN - 0957-4174
VL - 260
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 125406
ER -