TY - JOUR
T1 - Variational Self-Distillation for Remote Sensing Scene Classification
AU - Hu, Yutao
AU - Huang, Xin
AU - Luo, Xiaoyan
AU - Han, Jungong
AU - Cao, Xianbin
AU - Zhang, Jun
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2022
Y1 - 2022
N2 - Supported by deep learning techniques, remote sensing scene classification, a fundamental task in remote sensing image analysis, has recently made remarkable progress. However, due to the severe uncertainty and perturbation within an image, it remains a challenging task with many unsolved problems. In this article, we note that regular one-hot labels cannot precisely describe remote sensing images, and they fail to provide enough supervisory information, limiting the discriminative feature learning of the network. To solve this problem, we propose a variational self-distillation network (VSDNet), in which the class entanglement information from the prediction vector acts as a supplement to the category information. Then, the exploited information is hierarchically distilled from the deep layers into the shallow parts via a variational knowledge transfer (VKT) module. Notably, the VKT module performs knowledge distillation in a probabilistic way through variational estimation, which enables end-to-end optimization of mutual information and promotes robustness to uncertainty within the image. Extensive experiments on four challenging remote sensing datasets demonstrate that, with a negligible parameter increase, the proposed VSDNet brings a significant performance improvement over different backbone networks and delivers state-of-the-art results.
AB - Supported by deep learning techniques, remote sensing scene classification, a fundamental task in remote sensing image analysis, has recently made remarkable progress. However, due to the severe uncertainty and perturbation within an image, it remains a challenging task with many unsolved problems. In this article, we note that regular one-hot labels cannot precisely describe remote sensing images, and they fail to provide enough supervisory information, limiting the discriminative feature learning of the network. To solve this problem, we propose a variational self-distillation network (VSDNet), in which the class entanglement information from the prediction vector acts as a supplement to the category information. Then, the exploited information is hierarchically distilled from the deep layers into the shallow parts via a variational knowledge transfer (VKT) module. Notably, the VKT module performs knowledge distillation in a probabilistic way through variational estimation, which enables end-to-end optimization of mutual information and promotes robustness to uncertainty within the image. Extensive experiments on four challenging remote sensing datasets demonstrate that, with a negligible parameter increase, the proposed VSDNet brings a significant performance improvement over different backbone networks and delivers state-of-the-art results.
KW - Class entanglement information
KW - hierarchical knowledge transfer
KW - remote sensing scene classification
KW - self-distillation
UR - http://www.scopus.com/inward/record.url?scp=85135752676&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2022.3194549
DO - 10.1109/TGRS.2022.3194549
M3 - Article
AN - SCOPUS:85135752676
SN - 0196-2892
VL - 60
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 5627313
ER -