TY - JOUR
T1 - FRF-Net
T2 - Land Cover Classification from Large-Scale VHR Optical Remote Sensing Images
AU - Sang, Qianbo
AU - Zhuang, Yin
AU - Dong, Shan
AU - Wang, Guanqun
AU - Chen, He
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2020/6
Y1 - 2020/6
N2 - Deep learning (DL) techniques are widely applied in remote sensing (RS) applications because of their outstanding nonlinear feature extraction ability. However, in large-scale, very high-resolution (VHR) land cover classification, multi-object distributions and clear appearance with large intraclass differences make refined pixelwise land cover mapping challenging. Focusing on these problems, the letter proposes a novel encoding-to-decoding method called the full receptive field (RF) network (FRF-Net), based on two types of attention mechanism. In the FRF-Net, ResNet-101 is used as the basic backbone. Then, the ensemble feature is generated by encoding the high-level features with the self-attention mechanism, which can achieve a full RF to capture long-range semantics. Next, the encoding result is decoded by the fusion attention mechanism combined with the low-level feature to produce a fusion feature that contains a refined semantic description for accurate land cover mapping. Extensive experiments on the GID and ISPRS data sets showed that the proposed network outperforms the state-of-the-art methods. The FRF-Net achieved a mean classwise Intersection over Union (mIOU) of 66.71% and 64.17% on ISPRS and GID, respectively, with smaller computation cost.
AB - Deep learning (DL) techniques are widely applied in remote sensing (RS) applications because of their outstanding nonlinear feature extraction ability. However, in large-scale, very high-resolution (VHR) land cover classification, multi-object distributions and clear appearance with large intraclass differences make refined pixelwise land cover mapping challenging. Focusing on these problems, the letter proposes a novel encoding-to-decoding method called the full receptive field (RF) network (FRF-Net), based on two types of attention mechanism. In the FRF-Net, ResNet-101 is used as the basic backbone. Then, the ensemble feature is generated by encoding the high-level features with the self-attention mechanism, which can achieve a full RF to capture long-range semantics. Next, the encoding result is decoded by the fusion attention mechanism combined with the low-level feature to produce a fusion feature that contains a refined semantic description for accurate land cover mapping. Extensive experiments on the GID and ISPRS data sets showed that the proposed network outperforms the state-of-the-art methods. The FRF-Net achieved a mean classwise Intersection over Union (mIOU) of 66.71% and 64.17% on ISPRS and GID, respectively, with smaller computation cost.
KW - Deep learning (DL)
KW - land cover classification
KW - remote sensing (RS)
KW - semantic segmentation
KW - very high resolution (VHR)
UR - http://www.scopus.com/inward/record.url?scp=85085512991&partnerID=8YFLogxK
U2 - 10.1109/LGRS.2019.2938555
DO - 10.1109/LGRS.2019.2938555
M3 - Article
AN - SCOPUS:85085512991
SN - 1545-598X
VL - 17
SP - 1057
EP - 1061
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
IS - 6
M1 - 8848484
ER -