TY - JOUR
T1 - EMO2-DETR
T2 - Efficient-Matching Oriented Object Detection With Transformers
AU - Hu, Zibo
AU - Gao, Kun
AU - Zhang, Xiaodian
AU - Wang, Junwei
AU - Wang, Hong
AU - Yang, Zhijia
AU - Li, Chenrui
AU - Li, Wei
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Object detection in remote sensing is a challenging task due to the arbitrary orientations of objects and the vast variation in the number of objects within a single image. For instance, one image may contain hundreds of small vehicles, while another may only have a single football field. Recently, DEtection TRansformer (DETR) and its variants have achieved great success in object detection by setting a fixed number of object queries and using bipartite graph matching for one-to-one label assignment. However, we have observed that bipartite graph matching can result in relative redundancy of object queries when the number of objects changes dramatically in an image. This relative redundancy can cause two problems: slower convergence during training and redundant bounding boxes during inference. To analyze the aforementioned problems, we proposed a metric, redundancy of object query (ROQ), to quantitatively analyze the redundancy. Through experiments, we discovered that the reason for the two issues is the difficulty in distinguishing between high-quality negative samples and positive samples. In this article, we proposed efficient-matching oriented object detection with transformers (EMO2-DETR) consisting of three dedicated components to address the aforementioned issues. Specifically, reassign bipartite graph matching (RBGM) is proposed to extract high-quality negative samples from the negative samples. And ignored sample predicted head (ISPH) is proposed to predict high-quality negative samples. Then, reassigned Hungarian loss is used to better involve high-quality negative samples in the update of model parameters. Extensive experiments on DOTAv1 and DOTAv1.5 datasets demonstrated that our proposed method achieves the competitive results.
AB - Object detection in remote sensing is a challenging task due to the arbitrary orientations of objects and the vast variation in the number of objects within a single image. For instance, one image may contain hundreds of small vehicles, while another may only have a single football field. Recently, DEtection TRansformer (DETR) and its variants have achieved great success in object detection by setting a fixed number of object queries and using bipartite graph matching for one-to-one label assignment. However, we have observed that bipartite graph matching can result in relative redundancy of object queries when the number of objects changes dramatically in an image. This relative redundancy can cause two problems: slower convergence during training and redundant bounding boxes during inference. To analyze the aforementioned problems, we proposed a metric, redundancy of object query (ROQ), to quantitatively analyze the redundancy. Through experiments, we discovered that the reason for the two issues is the difficulty in distinguishing between high-quality negative samples and positive samples. In this article, we proposed efficient-matching oriented object detection with transformers (EMO2-DETR) consisting of three dedicated components to address the aforementioned issues. Specifically, reassign bipartite graph matching (RBGM) is proposed to extract high-quality negative samples from the negative samples. And ignored sample predicted head (ISPH) is proposed to predict high-quality negative samples. Then, reassigned Hungarian loss is used to better involve high-quality negative samples in the update of model parameters. Extensive experiments on DOTAv1 and DOTAv1.5 datasets demonstrated that our proposed method achieves the competitive results.
KW - Detection transformer (DETR)
KW - high-quality negative samples
KW - object detection
KW - relative redundancy of object query
KW - remote sensing
UR - http://www.scopus.com/inward/record.url?scp=85166761099&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3300154
DO - 10.1109/TGRS.2023.3300154
M3 - Article
AN - SCOPUS:85166761099
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5616814
ER -