DTSSNet: Dynamic Training Sample Selection Network for UAV Object Detection

Li Chen; Chaoyang Liu; Wei Li; Qizhi Xu; Hongbin Deng

doi:10.1109/TGRS.2023.3348555

Corresponding author: Hongbin Deng

School of Mechatronical Engineering

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

Object detectors often struggle with accuracy and generalization when applied to aerial imagery, primarily due to the following challenges: 1) great scale variation of objects in aerial images: both extremely small and large objects are visible in the same image; and 2) an extreme imbalance of the training sample between positive and negative anchors: there are several positive ground truth (GT) anchors and an abundance of negative anchors. In this article, we propose a dynamic training sample selection network (DTSSNet) to solve the above-mentioned problems in two dimensions. An attention-enhanced feature module (AEFM) is proposed to enhance the basic features by focusing on both channel and semantic information related to targets. This module provides more valuable information for accurately classifying objects of different scales. To tackle the imbalance in training samples, this article implements a dynamic training sample selection (DTSS) module that divides the training samples based on GT information. This module dynamically selects samples, ensuring a more balanced representation of positive and negative anchors, leading to improved learning. Importantly, the combination of AEFM and DTSS does not introduce any additional computational costs. Experimental evaluations on the VisDrone2019-DET dataset demonstrate that DTSSNet outperforms base detectors and generic approaches. Furthermore, the effectiveness of DTSSNet is validated on the UAVDT benchmark dataset, where it achieves state-of-the-art performance.
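To make the second challenge concrete, the following minimal sketch labels a dense grid of anchors against a single small ground-truth box using a fixed IoU threshold. It is a generic illustration of positive/negative anchor imbalance, not the DTSS module described in the article. The image size, stride, anchor size, GT box, 0.5 threshold, and all function names are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grid_anchors(img_size: int, stride: int, anchor_size: int) -> List[Box]:
    """Square anchors centred on a regular grid (hypothetical layout)."""
    half = anchor_size / 2
    centres = range(stride // 2, img_size, stride)
    return [(cx - half, cy - half, cx + half, cy + half)
            for cy in centres for cx in centres]


def count_labels(anchors: List[Box], gt_boxes: List[Box],
                 pos_thresh: float = 0.5) -> Tuple[int, int]:
    """Count anchors whose best IoU with any GT box reaches pos_thresh."""
    pos = sum(1 for a in anchors
              if max(iou(a, g) for g in gt_boxes) >= pos_thresh)
    return pos, len(anchors) - pos


# Assumed toy setting: a 256x256 image, one 16x16 object, stride-8 anchors.
anchors = grid_anchors(img_size=256, stride=8, anchor_size=16)
gt = [(100.0, 100.0, 116.0, 116.0)]
pos, neg = count_labels(anchors, gt)
print(f"positive anchors: {pos}, negative anchors: {neg}")
```

Under these assumed settings, only anchors closely aligned with the GT box count as positive and the rest of the grid is negative, which is the kind of imbalance the abstract refers to.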

Original language	English
Article number	5902516
Pages (from-to)	1-16
Number of pages	16
Journal	IEEE Transactions on Geoscience and Remote Sensing
Volume	62
DOIs	https://doi.org/10.1109/TGRS.2023.3348555
Publication status	Published - 2024

Keywords

Attention enhanced feature
dynamic training sample selection (DTSS)
object detection
unmanned aerial vehicle (UAV) aerial imagery

Cite this

Chen, L., Liu, C., Li, W., Xu, Q., & Deng, H. (2024). DTSSNet: Dynamic Training Sample Selection Network for UAV Object Detection. IEEE Transactions on Geoscience and Remote Sensing, 62, 1-16. Article 5902516. https://doi.org/10.1109/TGRS.2023.3348555

@article{6f9c0c34751046eba46db0aa1999d309,

title = "DTSSNet: Dynamic Training Sample Selection Network for UAV Object Detection",

abstract = "Object detectors often struggle with accuracy and generalization when applied to aerial imagery, primarily due to the following challenges: 1) great scale variation of objects in aerial images: both extremely small and large objects are visible in the same image; and 2) an extreme imbalance of the training sample between positive and negative anchors: there are several positive ground truth (GT) anchors and an abundance of negative anchors. In this article, we propose a dynamic training sample selection network (DTSSNet) to solve the above-mentioned problems in two dimensions. An attention-enhanced feature module (AEFM) is proposed to enhance the basic features by focusing on both channel and semantic information related to targets. This module provides more valuable information for accurately classifying objects of different scales. To tackle the imbalance in training samples, this article implements a dynamic training sample selection (DTSS) module that divides the training samples based on GT information. This module dynamically selects samples, ensuring a more balanced representation of positive and negative anchors, leading to improved learning. Importantly, the combination of AEFM and DTSS does not introduce any additional computational costs. Experimental evaluations on the VisDrone2019-DET dataset demonstrate that DTSSNet outperforms base detectors and generic approaches. Furthermore, the effectiveness of DTSSNet is validated on the UAVDT benchmark dataset, where it achieves state-of-the-art performance.",

keywords = "Attention enhanced feature, dynamic training sample selection (DTSS), object detection, unmanned aerial vehicle (UAV) aerial imagery",

author = "Li Chen and Chaoyang Liu and Wei Li and Qizhi Xu and Hongbin Deng",

note = "Publisher Copyright: {\textcopyright} 1980-2012 IEEE.",

year = "2024",

doi = "10.1109/TGRS.2023.3348555",

language = "English",

volume = "62",

pages = "1--16",

journal = "IEEE Transactions on Geoscience and Remote Sensing",

issn = "0196-2892",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - DTSSNet: Dynamic Training Sample Selection Network for UAV Object Detection

AU - Chen, Li

AU - Liu, Chaoyang

AU - Li, Wei

AU - Xu, Qizhi

AU - Deng, Hongbin

PY - 2024

Y1 - 2024

AB - Object detectors often struggle with accuracy and generalization when applied to aerial imagery, primarily due to the following challenges: 1) great scale variation of objects in aerial images: both extremely small and large objects are visible in the same image; and 2) an extreme imbalance of the training sample between positive and negative anchors: there are several positive ground truth (GT) anchors and an abundance of negative anchors. In this article, we propose a dynamic training sample selection network (DTSSNet) to solve the above-mentioned problems in two dimensions. An attention-enhanced feature module (AEFM) is proposed to enhance the basic features by focusing on both channel and semantic information related to targets. This module provides more valuable information for accurately classifying objects of different scales. To tackle the imbalance in training samples, this article implements a dynamic training sample selection (DTSS) module that divides the training samples based on GT information. This module dynamically selects samples, ensuring a more balanced representation of positive and negative anchors, leading to improved learning. Importantly, the combination of AEFM and DTSS does not introduce any additional computational costs. Experimental evaluations on the VisDrone2019-DET dataset demonstrate that DTSSNet outperforms base detectors and generic approaches. Furthermore, the effectiveness of DTSSNet is validated on the UAVDT benchmark dataset, where it achieves state-of-the-art performance.

KW - Attention enhanced feature

KW - dynamic training sample selection (DTSS)

KW - object detection

KW - unmanned aerial vehicle (UAV) aerial imagery

UR - http://www.scopus.com/inward/record.url?scp=85181556132&partnerID=8YFLogxK

U2 - 10.1109/TGRS.2023.3348555

DO - 10.1109/TGRS.2023.3348555

M3 - Article

AN - SCOPUS:85181556132

SN - 0196-2892

VL - 62

SP - 1

EP - 16

JO - IEEE Transactions on Geoscience and Remote Sensing

JF - IEEE Transactions on Geoscience and Remote Sensing

M1 - 5902516

ER -
