TY - JOUR
T1 - Large Kernel Sparse ConvNet Weighted by Multi-Frequency Attention for Remote Sensing Scene Understanding
AU - Wang, Junjie
AU - Li, Wei
AU - Zhang, Mengmeng
AU - Chanussot, Jocelyn
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2023
Y1 - 2023
AB - Remote sensing scene understanding is a highly challenging task, and has gradually emerged as a research hotspot in the field of intelligent interpretation of remote sensing data. Recently, the use of convolutional neural networks (CNNs) has been proven to be a fruitful advancement. However, with the emergence of visual transformers (ViTs), the limitations of traditional small convolutional kernels in directly capturing a large receptive field have posed significant challenges to their dominant role. Additionally, the fixed neuron connections between different convolutional layers have weakened the practicality and adaptability of the models. Furthermore, the global average pooling (GAP) also leads to the loss of effective information in the acquired features. In this work, a large kernel sparse ConvNet (LSCNet) weighted by multi-frequency attention (MFA) is proposed. First, unlike traditional CNNs, it utilizes two parallel rectangular convolutional kernels to approximate a large kernel, achieving comparable or even better results than ViTs-based methods. Second, an adaptive sparse optimization strategy is employed to dynamically optimize the fixed neuron connections between different convolutional layers, achieving a favorable connectivity pattern for capturing abstract features more accurately. Finally, a novel MFA module is used to replace GAP, so as to preserve more useful information while weighting the recognition features, thereby enhancing the discriminative and learning abilities of the model. In the conducted experiments, LSCNet achieves the best recognition results on three well-known remote sensing aerial datasets when compared to the state-of-the-art methods (including ViTs-based methods).
KW - Adaptive sparse optimization
KW - large kernel convolution
KW - multi-frequency attention (MFA)
KW - remote sensing
KW - scene understanding
UR - http://www.scopus.com/inward/record.url?scp=85178000051&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2023.3333401
DO - 10.1109/TGRS.2023.3333401
M3 - Article
AN - SCOPUS:85178000051
SN - 0196-2892
VL - 61
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5626112
ER -