Abstract
Scene classification has become an active research area in remote sensing (RS) image interpretation. Recently, Transformer-based methods have shown great potential for modeling global semantic information and have been applied to RS scene classification. In this letter, we propose a multi-level fusion Swin Transformer (MFST), which integrates a multi-level feature merging (MFM) module and an adaptive feature compression (AFC) module to further improve RS scene classification performance. The MFM module narrows the semantic gaps among multi-level features via patch merging in lower-level feature maps and lateral connections in the top-down pathway. The AFC module reduces the dimensionality of the multi-level features and makes their semantic information more coherent through adaptive channel reduction. We evaluate the proposed network on the aerial image dataset (AID) and the NWPU-RESISC45 (NWPU) dataset, and the classification results show that the proposed network outperforms several state-of-the-art (SOTA) methods.
Original language | English |
---|---|
Article number | 6516005 |
Journal | IEEE Geoscience and Remote Sensing Letters |
Volume | 19 |
DOI | |
Publication status | Published - 2022 |