FPD: Feature Pyramid Knowledge Distillation

Qi Wang, Lu Liu, Wenxin Yu*, Zhiqiang Zhang, Yuxin Liu, Shiyu Cheng, Xuewen Zhang, Jun Gong

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Knowledge distillation is a commonly used method for model compression. It aims to compress a powerful yet cumbersome model into a lightweight model without much sacrifice of performance, so that the accuracy of the lightweight model approaches that of the cumbersome one. Conventionally, the powerful but bulky model is called the teacher model and the lightweight model is called the student model. Various approaches have been proposed for this purpose over the past few years. Some classical distillation methods are mainly based on distilling deep features from the intermediate layers or the logits layer, and some methods combine knowledge distillation with contrastive learning. However, classical distillation methods leave a significant gap in feature representation between teacher and student, and contrastive-learning distillation methods require massive, diversified data for training. To address these issues, our study aims to narrow the gap in feature representation between teacher and student and to extract richer feature representations from images in limited datasets, thereby achieving better performance. The superiority of our method is validated on a general-purpose dataset (CIFAR-100) and a small-scale dataset (CIFAR-10). On CIFAR-100, we achieve 19.21% and 20.01% top-1 error with ResNet50 and ResNet18, respectively. In particular, ResNet50 and ResNet18 as student models achieve better performance than the pre-trained ResNet152 and ResNet34 teacher models. On CIFAR-10, we achieve 4.22% top-1 error with ResNet18. On both CIFAR-10 and CIFAR-100 we achieve better performance, and the student model even outperforms the teacher.
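For context on the classical logit-based distillation the abstract contrasts with, the sketch below shows a generic Hinton-style knowledge distillation loss (temperature-softened KL divergence plus hard-label cross-entropy). This is only an illustration of the baseline technique, not the paper's FPD method, whose details are not given here; the temperature T and weight alpha are assumed, illustrative hyperparameters.

```python
# Minimal sketch of classical logit-based knowledge distillation (Hinton-style).
# NOT the FPD method from this paper; T and alpha are illustrative values.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Temperature-softened KL term (teacher -> student) plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```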

Original language: English
Title of host publication: Neural Information Processing - 29th International Conference, ICONIP 2022, Proceedings
Editors: Mohammad Tanveer, Sonali Agarwal, Seiichi Ozawa, Asif Ekbal, Adam Jatowt
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 100-111
Number of pages: 12
ISBN (Print): 9783031301049
DOIs
Publication status: Published - 2023
Externally published: Yes
Event: 29th International Conference on Neural Information Processing, ICONIP 2022 - Virtual, Online
Duration: 22 Nov 2022 - 26 Nov 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13623 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Neural Information Processing, ICONIP 2022
City: Virtual, Online
Period: 22/11/22 - 26/11/22

Keywords

  • Feature pyramid distillation
  • Feature pyramid network
  • Knowledge distillation
