NRT-YOLO: Improved YOLOv5 Based on Nested Residual Transformer for Tiny Remote Sensing Object Detection

Yukuan Liu; Guanglin He; Zehu Wang; Weizhe Li; Hongfei Huang

doi:10.3390/s22134953

Yukuan Liu, Guanglin He^*, Zehu Wang, Weizhe Li, Hongfei Huang

^*Corresponding author for this work

School of Mechatronical Engineering

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

32 Citations (Scopus)

Abstract

To address the problems of tiny objects and high resolution of object detection in remote sensing imagery, the methods with coarse-grained image cropping have been widely studied. However, these methods are always inefficient and complex due to the two-stage architecture and the huge computation for split images. For these reasons, this article employs YOLO and presents an improved architecture, NRT-YOLO. Specifically, the improvements can be summarized as: extra prediction head and related feature fusion layers; novel nested residual Transformer module, C3NRT; nested residual attention module, C3NRA; and multi-scale testing. The C3NRT module presented in this paper could boost accuracy and reduce complexity of the network at the same time. Moreover, the effectiveness of the proposed method is demonstrated by three kinds of experiments. NRT-YOLO achieves 56.9% mAP_0.5 with only 38.1 M parameters in the DOTA dataset, exceeding YOLOv5l by 4.5%. Also, the results of different classifications show its excellent ability to detect small sample objects. As for the C3NRT module, the ablation study and comparison experiment verified that it has the largest contribution to accuracy increment (2.7% in mAP_0.5) among the improvements. In conclusion, NRT-YOLO has excellent performance in accuracy improvement and parameter reduction, which is suitable for tiny remote sensing object detection.

Original language	English
Article number	4953
Journal	Sensors
Volume	22
Issue number	13
DOIs	https://doi.org/10.3390/s22134953
Publication status	Published - 1 Jul 2022

Keywords

YOLOv5
nested residual transformer
remote sensing imagery
tiny object detection

Cite this

@article{e5c709f9547149cfb2f10eb48e23f829,
  title = "NRT-YOLO: Improved YOLOv5 Based on Nested Residual Transformer for Tiny Remote Sensing Object Detection",
  abstract = "To address the problems of tiny objects and high resolution of object detection in remote sensing imagery, the methods with coarse-grained image cropping have been widely studied. However, these methods are always inefficient and complex due to the two-stage architecture and the huge computation for split images. For these reasons, this article employs YOLO and presents an improved architecture, NRT-YOLO. Specifically, the improvements can be summarized as: extra prediction head and related feature fusion layers; novel nested residual Transformer module, C3NRT; nested residual attention module, C3NRA; and multi-scale testing. The C3NRT module presented in this paper could boost accuracy and reduce complexity of the network at the same time. Moreover, the effectiveness of the proposed method is demonstrated by three kinds of experiments. NRT-YOLO achieves 56.9\% mAP$_{0.5}$ with only 38.1 M parameters in the DOTA dataset, exceeding YOLOv5l by 4.5\%. Also, the results of different classifications show its excellent ability to detect small sample objects. As for the C3NRT module, the ablation study and comparison experiment verified that it has the largest contribution to accuracy increment (2.7\% in mAP$_{0.5}$) among the improvements. In conclusion, NRT-YOLO has excellent performance in accuracy improvement and parameter reduction, which is suitable for tiny remote sensing object detection.",
  keywords = "YOLOv5, nested residual transformer, remote sensing imagery, tiny object detection",
  author = "Yukuan Liu and Guanglin He and Zehu Wang and Weizhe Li and Hongfei Huang",
  note = "Publisher Copyright: {\textcopyright} 2022 by the authors. Licensee MDPI, Basel, Switzerland.",
  year = "2022",
  month = jul,
  day = "1",
  doi = "10.3390/s22134953",
  language = "English",
  volume = "22",
  journal = "Sensors",
  issn = "1424-8220",
  publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
  number = "13",
}

TY  - JOUR
T1  - NRT-YOLO
T2  - Improved YOLOv5 Based on Nested Residual Transformer for Tiny Remote Sensing Object Detection
AU  - Liu, Yukuan
AU  - He, Guanglin
AU  - Wang, Zehu
AU  - Li, Weizhe
AU  - Huang, Hongfei
PY  - 2022/7/1
Y1  - 2022/7/1
AB  - To address the problems of tiny objects and high resolution of object detection in remote sensing imagery, the methods with coarse-grained image cropping have been widely studied. However, these methods are always inefficient and complex due to the two-stage architecture and the huge computation for split images. For these reasons, this article employs YOLO and presents an improved architecture, NRT-YOLO. Specifically, the improvements can be summarized as: extra prediction head and related feature fusion layers; novel nested residual Transformer module, C3NRT; nested residual attention module, C3NRA; and multi-scale testing. The C3NRT module presented in this paper could boost accuracy and reduce complexity of the network at the same time. Moreover, the effectiveness of the proposed method is demonstrated by three kinds of experiments. NRT-YOLO achieves 56.9% mAP_0.5 with only 38.1 M parameters in the DOTA dataset, exceeding YOLOv5l by 4.5%. Also, the results of different classifications show its excellent ability to detect small sample objects. As for the C3NRT module, the ablation study and comparison experiment verified that it has the largest contribution to accuracy increment (2.7% in mAP_0.5) among the improvements. In conclusion, NRT-YOLO has excellent performance in accuracy improvement and parameter reduction, which is suitable for tiny remote sensing object detection.
KW  - YOLOv5
KW  - nested residual transformer
KW  - remote sensing imagery
KW  - tiny object detection
UR  - http://www.scopus.com/inward/record.url?scp=85133137102&partnerID=8YFLogxK
U2  - 10.3390/s22134953
DO  - 10.3390/s22134953
M3  - Article
C2  - 35808445
AN  - SCOPUS:85133137102
SN  - 1424-8220
VL  - 22
JO  - Sensors
JF  - Sensors
IS  - 13
M1  - 4953
ER  -
