TY - JOUR
T1 - CDFKD-MFS: Collaborative Data-Free Knowledge Distillation via Multi-Level Feature Sharing
T2 - IEEE Transactions on Multimedia
AU - Hao, Zhiwei
AU - Luo, Yong
AU - Wang, Zhi
AU - Hu, Han
AU - An, Jianping
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Recently, the compression and deployment of powerful deep neural networks (DNNs) on resource-limited edge devices to provide intelligent services have become attractive tasks. Although knowledge distillation (KD) is a feasible solution for compression, its requirement for the original dataset raises privacy concerns. In addition, it is common to integrate multiple pretrained models to achieve satisfactory performance. Compressing multiple such models into a single tiny model is challenging, especially when the original data are unavailable. To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module. In this framework, the student model, equipped with a multi-level feature-sharing structure, learns from multiple teacher models and is trained together with a generator in an asymmetric adversarial manner. When some real samples are available, the attention module adaptively aggregates the predictions of the student headers, which can further improve performance. We conduct extensive experiments on three popular computer vision datasets. In particular, compared with the most competitive alternative, the accuracy of the proposed framework is 1.18% higher on the CIFAR-100 dataset, 1.67% higher on the Caltech-101 dataset, and 2.99% higher on the mini-ImageNet dataset.
AB - Recently, the compression and deployment of powerful deep neural networks (DNNs) on resource-limited edge devices to provide intelligent services have become attractive tasks. Although knowledge distillation (KD) is a feasible solution for compression, its requirement for the original dataset raises privacy concerns. In addition, it is common to integrate multiple pretrained models to achieve satisfactory performance. Compressing multiple such models into a single tiny model is challenging, especially when the original data are unavailable. To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module. In this framework, the student model, equipped with a multi-level feature-sharing structure, learns from multiple teacher models and is trained together with a generator in an asymmetric adversarial manner. When some real samples are available, the attention module adaptively aggregates the predictions of the student headers, which can further improve performance. We conduct extensive experiments on three popular computer vision datasets. In particular, compared with the most competitive alternative, the accuracy of the proposed framework is 1.18% higher on the CIFAR-100 dataset, 1.67% higher on the Caltech-101 dataset, and 2.99% higher on the mini-ImageNet dataset.
KW - Attention
KW - Data-Free Distillation
KW - Knowledge Distillation
KW - Model Compression
KW - Multi-Teacher Distillation
UR - http://www.scopus.com/inward/record.url?scp=85135213314&partnerID=8YFLogxK
U2 - 10.1109/TMM.2022.3192663
DO - 10.1109/TMM.2022.3192663
M3 - Article
AN - SCOPUS:85135213314
SN - 1520-9210
VL - 24
SP - 4262
EP - 4274
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -