RMT-YOLOv9s: An Infrared Small Target Detection Method Based on UAV Remote Sensing Images

Keyu Xu; Chengtian Song; Yue Xie; Lizhi Pan; Xiaozheng Gan; Gao Huang

doi:10.1109/LGRS.2024.3484748

Keyu Xu, Chengtian Song^*, Yue Xie, Lizhi Pan, Xiaozheng Gan, Gao Huang^*

^*Corresponding author for this work

School of Mechatronical Engineering

Research output: Contribution to journal › Article › peer-review

Abstract

Unmanned aerial vehicles (UAVs) and infrared imaging technology have numerous applications in civilian fields. To address the low accuracy caused by complex ground backgrounds, small target size, and limited target features in target detection on UAV remote sensing infrared images, we combine the YOLOv9s model with the recent retentive networks meet vision transformers (RMT) architecture and propose the RMT-YOLOv9s model for infrared small target detection. First, a convolutional neural network (CNN)-RMT-based backbone is proposed by incorporating the RMT model into the backbone network of YOLOv9s, which extracts both local and global features for small target detection. Then, an improved neck multiscale feature-fusion network, RMTELAN-PANet, is designed using the novel convolutional RMTELAN module proposed in this letter, which can better capture and use semantic information from feature maps. Finally, an efficient multiscale attention (EMA) module and a Dysample upsampling module are integrated into RMTELAN-PANet to further enrich the feature information of small targets. Experiments on the HIT-UAV dataset show that RMT-YOLOv9s outperforms other popular methods in infrared small target detection.

Original language	English
Article number	7002205
Journal	IEEE Geoscience and Remote Sensing Letters
Volume	21
DOIs	https://doi.org/10.1109/LGRS.2024.3484748
Publication status	Published - 2024

Keywords

Dysample
YOLOv9
efficient multiscale attention (EMA)
retentive networks meet vision transformer (RMT) transformer
unmanned aerial vehicle (UAV) infrared target detection

Cite this

@article{a092a50d556e4cb09ab15951a9c0763c,

title = "RMT-YOLOv9s: An Infrared Small Target Detection Method Based on UAV Remote Sensing Images",

abstract = "Unmanned aerial vehicles (UAVs) and infrared imaging technology have numerous applications in civilian fields. To address the issues of low accuracy resulting from complex ground backgrounds, small target size, and limited target features in UAV remote sensing infrared image target detection, we use the YOLOv9s model and the latest retentive networks meet vision transformers (RMTs) technology and propose the RMT-YOLOv9s model for infrared small target detection. First, a convolutional neural network (CNN)-RMT-based backbone is proposed by incorporating the RMT model into the backbone network of YOLOv9s, which extracts both local and global features for small target detection. Then, an improved neck multiscale feature-fusion network RMTELAN-PANet is designed using the novel convolutional RMTELAN module proposed in this letter, which can better capture and use semantic information from feature maps. Finally, efficient multiscale attention (EMA) attention module and upsampling Dysample module are integrated into RMTELAN-PANet to further improve the feature information of small targets. Experiments on the HIT-UAV dataset show that RMT-YOLOv9s outperforms other popular methods in infrared small target detection.",

keywords = "Dysample, YOLOv9, efficient multiscale attention (EMA), retentive networks meet vision transformer (RMT) transformer, unmanned aerial vehicle (UAV) infrared target detection",

author = "Keyu Xu and Chengtian Song and Yue Xie and Lizhi Pan and Xiaozheng Gan and Gao Huang",

note = "Publisher Copyright: {\textcopyright} 2004-2012 IEEE.",

year = "2024",

doi = "10.1109/LGRS.2024.3484748",

language = "English",

volume = "21",

journal = "IEEE Geoscience and Remote Sensing Letters",

issn = "1545-598X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - RMT-YOLOv9s

T2 - An Infrared Small Target Detection Method Based on UAV Remote Sensing Images

AU - Xu, Keyu

AU - Song, Chengtian

AU - Xie, Yue

AU - Pan, Lizhi

AU - Gan, Xiaozheng

AU - Huang, Gao

PY - 2024

Y1 - 2024

AB - Unmanned aerial vehicles (UAVs) and infrared imaging technology have numerous applications in civilian fields. To address the issues of low accuracy resulting from complex ground backgrounds, small target size, and limited target features in UAV remote sensing infrared image target detection, we use the YOLOv9s model and the latest retentive networks meet vision transformers (RMTs) technology and propose the RMT-YOLOv9s model for infrared small target detection. First, a convolutional neural network (CNN)-RMT-based backbone is proposed by incorporating the RMT model into the backbone network of YOLOv9s, which extracts both local and global features for small target detection. Then, an improved neck multiscale feature-fusion network RMTELAN-PANet is designed using the novel convolutional RMTELAN module proposed in this letter, which can better capture and use semantic information from feature maps. Finally, efficient multiscale attention (EMA) attention module and upsampling Dysample module are integrated into RMTELAN-PANet to further improve the feature information of small targets. Experiments on the HIT-UAV dataset show that RMT-YOLOv9s outperforms other popular methods in infrared small target detection.

KW - Dysample

KW - YOLOv9

KW - efficient multiscale attention (EMA)

KW - retentive networks meet vision transformer (RMT) transformer

KW - unmanned aerial vehicle (UAV) infrared target detection

UR - http://www.scopus.com/inward/record.url?scp=85207444213&partnerID=8YFLogxK

U2 - 10.1109/LGRS.2024.3484748

DO - 10.1109/LGRS.2024.3484748

M3 - Article

AN - SCOPUS:85207444213

SN - 1545-598X

VL - 21

JO - IEEE Geoscience and Remote Sensing Letters

JF - IEEE Geoscience and Remote Sensing Letters

M1 - 7002205

ER -
