TY - JOUR
T1 - ETDNet
T2 - Efficient Transformer-Based Detection Network for Surface Defect Detection
AU - Zhou, Hantao
AU - Yang, Rui
AU - Hu, Runze
AU - Shu, Chang
AU - Tang, Xiaochu
AU - Li, Xiu
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep learning (DL)-based surface defect detectors play a crucial role in ensuring product quality during inspection processes. However, accurately and efficiently detecting defects remains challenging due to characteristics inherent in defective images, including a high degree of foreground-background similarity, scale variation, and shape variation. To address this challenge, we propose an efficient transformer-based detection network, ETDNet, consisting of three novel designs to achieve superior performance. First, ETDNet adopts a lightweight vision transformer (ViT) to extract representative global features. This ensures accurate feature characterization of defects even against similar backgrounds. Second, a channel-modulated feature pyramid network (CM-FPN) is devised to fuse multilevel features and retain critical information from the corresponding levels. Finally, a novel task-oriented decoupled (TOD) head is introduced to tackle the inconsistent representations of the classification and regression tasks. The TOD head employs a local feature representation (LFR) module to learn object-aware local features and introduces a global feature representation (GFR) module, based on the attention mechanism, to learn content-aware global features. By integrating these two modules into the head, ETDNet can effectively classify and localize defects with varying shapes and scales. Extensive experiments on various defect detection datasets demonstrate the effectiveness of the proposed ETDNet. For instance, it achieves an AP of 46.7% (versus 45.9%) and an AP50 of 80.2% (versus 79.1%) at 49 frames/s on NEU-DET. The code is available at https://github.com/zht8506/ETDNet.
AB - Deep learning (DL)-based surface defect detectors play a crucial role in ensuring product quality during inspection processes. However, accurately and efficiently detecting defects remains challenging due to characteristics inherent in defective images, including a high degree of foreground-background similarity, scale variation, and shape variation. To address this challenge, we propose an efficient transformer-based detection network, ETDNet, consisting of three novel designs to achieve superior performance. First, ETDNet adopts a lightweight vision transformer (ViT) to extract representative global features. This ensures accurate feature characterization of defects even against similar backgrounds. Second, a channel-modulated feature pyramid network (CM-FPN) is devised to fuse multilevel features and retain critical information from the corresponding levels. Finally, a novel task-oriented decoupled (TOD) head is introduced to tackle the inconsistent representations of the classification and regression tasks. The TOD head employs a local feature representation (LFR) module to learn object-aware local features and introduces a global feature representation (GFR) module, based on the attention mechanism, to learn content-aware global features. By integrating these two modules into the head, ETDNet can effectively classify and localize defects with varying shapes and scales. Extensive experiments on various defect detection datasets demonstrate the effectiveness of the proposed ETDNet. For instance, it achieves an AP of 46.7% (versus 45.9%) and an AP50 of 80.2% (versus 79.1%) at 49 frames/s on NEU-DET. The code is available at https://github.com/zht8506/ETDNet.
KW - Attention mechanism
KW - feature fusion
KW - surface defect detection
KW - task-oriented decoupled (TOD) head
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85168729597&partnerID=8YFLogxK
U2 - 10.1109/TIM.2023.3307753
DO - 10.1109/TIM.2023.3307753
M3 - Article
AN - SCOPUS:85168729597
SN - 0018-9456
VL - 72
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 2525014
ER -