Abstract
Convolutional neural networks (CNNs) have been widely used in remote sensing (RS) scene classification because of their strong feature representation and inference capability. The complexity of RS images brings not only the challenges of high inter-class similarity and large intra-class diversity, but also the problem that category-relevant regions are insufficiently prominent during feature extraction. Siamese CNNs with feature similarity measurement have been adopted in some applications to address the former issue, but most ignore the randomness of input sample pairs. As a result, these Siamese CNNs do not focus enough on challenging samples, which limits training efficiency. We propose the focal cosine metric (FCM) block, which combines the cosine similarity metric with threshold control to perform sample selection and thereby makes network learning more efficient. FCM permits only the misclassified focal samples to participate in the similarity measurement of the Siamese CNN, which flexibly mitigates the misclassification caused by high inter-class similarity and large intra-class diversity. Moreover, an adaptive attention (AA) module is designed to emphasize the pivotal target regions and assist the similarity measurement of the Siamese CNN. It does so by adaptively assigning high weights to key targets using learnable guided vectors. This enables the model to focus on the details of intra-class similarities or inter-class differences in sample pairs, and thus reduces the difficulty of model optimization. Experimental results on three public data sets demonstrate the effectiveness of the proposed Siamese CNN-based method with FCM and AA and show its superiority over other state-of-the-art scene classification methods.
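The sample-selection idea behind FCM can be illustrated with a minimal sketch. This is not the formulation from the paper, which the abstract does not give. It assumes that each pair carries a hypothetical `is_focal` flag marking that it contains a misclassified sample, and it borrows the per-pair form of the common cosine embedding loss, with a hypothetical `margin` threshold for different-class pairs. The function names are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def focal_cosine_loss(pairs, margin=0.5):
    """Average a cosine contrastive loss over focal pairs only.

    Each pair is (feat_a, feat_b, same_class, is_focal). The is_focal flag
    and the margin value are assumptions made for this sketch.
    """
    total, count = 0.0, 0
    for feat_a, feat_b, same_class, is_focal in pairs:
        if not is_focal:
            continue  # pairs without a misclassified sample are skipped
        sim = cosine_similarity(feat_a, feat_b)
        if same_class:
            total += 1.0 - sim  # pull same-class features together
        else:
            total += max(0.0, sim - margin)  # penalize only above the margin
        count += 1
    return total / count if count else 0.0
```

Under these assumptions, a batch in which every pair is already classified correctly contributes zero loss, so the similarity term acts only on the challenging samples.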
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 84212-84226 |
Number of pages | 15 |
Journal | IEEE Access |
Volume | 10 |
DOIs | |
Publication status | Published - 2022 |
Keywords
- Remote sensing scene classification
- Siamese convolutional neural network
- Adaptive attention
- Focal cosine metric