Hierarchical Self-Distilled Feature Learning for Fine-Grained Visual Categorization

Yutao Hu, Xiaolong Jiang, Xuhui Liu, Xiaoyan Luo, Yao Hu, Xianbin Cao, Baochang Zhang, Jun Zhang

Research output: Contribution to journalArticlepeer-review

3 Citations (Scopus)

Abstract

Fine-grained visual categorization (FGVC) relies on hierarchical features extracted by deep convolutional neural networks (CNNs) to recognize closely alike objects. Particularly, shallow layer features containing rich spatial details are vital for specifying subtle differences between objects but are usually inadequately optimized due to gradient vanishing during backpropagation. In this article, hierarchical self-distillation (HSD) is introduced to generate well-optimized CNNs features for accurate fine-grained categorization. HSD inherits from the widely applied deep supervision and implements multiple intermediate losses for reinforced gradients. Besides that, we observe that the hard (one-hot) labels adopted for intermediate supervision hurt the performance of FGVC by enforcing overstrict supervision. As a solution, HSD seeks self-distillation where soft predictions generated by deeper layers of the network are hierarchically exploited to supervise shallow parts. Moreover, self-information entropy loss (SIELoss) is designed in HSD to adaptively soften intermediate predictions and facilitate better convergence. In addition, the gradient detached fusion (GDF) module is incorporated to produce an ensemble result with multiscale features via effective feature fusion. Extensive experiments on four challenging fine-grained datasets show that, with neglectable parameter increase, the proposed HSD framework and the GDF module both bring significant performance gains over different backbones, which also achieves state-of-the-art classification performance.

Original languageEnglish
Pages (from-to)1-14
Number of pages14
JournalIEEE Transactions on Neural Networks and Learning Systems
DOIs
Publication statusAccepted/In press - 2024
Externally publishedYes

Keywords

  • Annotations
  • Convolutional neural networks
  • Feature extraction
  • Feature fusion
  • Optimization
  • Task analysis
  • Training
  • Visualization
  • fine-grained visual categorization (FGVC)
  • hierarchical feature learning
  • knowledge distillation.

Fingerprint

Dive into the research topics of 'Hierarchical Self-Distilled Feature Learning for Fine-Grained Visual Categorization'. Together they form a unique fingerprint.

Cite this