TY - JOUR
T1 - A Scale-Aware and Discriminative Feature Learning Network for Fine-Grained Rigid Object Recognition
AU - Gao, Yangte
AU - Deng, Chenwei
AU - Chen, Liang
AU - Zhu, Zicong
N1 - Publisher Copyright: © 2008-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - With the rapid development of remote sensing imaging and deep learning technology, fine-grained recognition of rigid targets has gradually emerged as a research topic. Rigid targets in remote sensing scenes usually retain relatively stable scale information and apparent structure, providing an adequate basis for their discrimination. However, existing methods do not make adequate use of this scale information and apparent structure, which results in scale neglect and insufficient discriminative feature extraction (DFE). To address these challenges, we propose SD-Net, a training framework for fine-grained recognition of rigid objects in remote sensing scenes. It consists of a fused label learning process based on a probability distribution function (PDF) and a DFE branch. The PDF process collects the objects' scale statistics by category, builds a probability model based on the sample distribution, and finally converts it into soft labels to guide model learning. The DFE branch extracts discriminative features along the channel and spatial dimensions of the feature map through deep and wide-ranging feature mining. Finally, we propose the FAIR1M-OR dataset, containing 37 fine-grained categories and about 600 000 instances, to verify the method's effectiveness. The experimental results show that SD-Net, while introducing only a small number of parameters during training, improves the performance of ResNet- and ViT-based models by about 4.6 points. The code and dataset will be open-sourced in the future.
AB - With the rapid development of remote sensing imaging and deep learning technology, fine-grained recognition of rigid targets has gradually emerged as a research topic. Rigid targets in remote sensing scenes usually retain relatively stable scale information and apparent structure, providing an adequate basis for their discrimination. However, existing methods do not make adequate use of this scale information and apparent structure, which results in scale neglect and insufficient discriminative feature extraction (DFE). To address these challenges, we propose SD-Net, a training framework for fine-grained recognition of rigid objects in remote sensing scenes. It consists of a fused label learning process based on a probability distribution function (PDF) and a DFE branch. The PDF process collects the objects' scale statistics by category, builds a probability model based on the sample distribution, and finally converts it into soft labels to guide model learning. The DFE branch extracts discriminative features along the channel and spatial dimensions of the feature map through deep and wide-ranging feature mining. Finally, we propose the FAIR1M-OR dataset, containing 37 fine-grained categories and about 600 000 instances, to verify the method's effectiveness. The experimental results show that SD-Net, while introducing only a small number of parameters during training, improves the performance of ResNet- and ViT-based models by about 4.6 points. The code and dataset will be open-sourced in the future.
KW - Discriminative feature (DF)
KW - fine-grained category
KW - object recognition (OR)
KW - remote sensing
KW - rigid body
KW - scale awareness
UR - http://www.scopus.com/inward/record.url?scp=86000377182&partnerID=8YFLogxK
U2 - 10.1109/JSTARS.2024.3484411
DO - 10.1109/JSTARS.2024.3484411
M3 - Article
AN - SCOPUS:86000377182
SN - 1939-1404
VL - 18
SP - 1695
EP - 1705
JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ER -