Cross-Dataset Distillation with Multi-tokens for Image Quality Assessment

Timin Gao; Weixuan Jin; Bokai Lai; Zhen Chen; Runze Hu; Yan Zhang; Pingyang Dai

doi:10.1007/978-981-99-8537-1_31

Timin Gao, Weixuan Jin, Bokai Lai, Zhen Chen, Runze Hu, Yan Zhang^*, Pingyang Dai

^*Corresponding author for this work

School of Information and Electronics

Xiamen University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

No-Reference Image Quality Assessment (NR-IQA) aims to evaluate image distortion accurately by simulating human assessment. The task is challenging because distortion types are diverse and labeled data are scarce. To address these issues, we propose a novel attention distillation-based method for NR-IQA. Our approach integrates knowledge from different datasets to enhance the representation of image quality and improve prediction accuracy. Specifically, we introduce a distillation token in the Transformer encoder, enabling the student model to learn from the teacher across different datasets. By leveraging knowledge from diverse sources, the model captures essential features related to image distortion and generalizes better. Furthermore, to refine perceptual information from various perspectives, we introduce multiple class tokens that simulate multiple reviewers. This not only improves the interpretability of the model but also reduces prediction uncertainty. Additionally, we introduce a mechanism called Attention Scoring, which combines the attention-scoring matrix from the encoder with the MLP head that follows the decoder to refine the final quality score. In extensive evaluations on six standard NR-IQA datasets, our method achieves performance comparable to state-of-the-art NR-IQA approaches. Notably, it achieves SRCC values of 0.932 on TID2013 (compared to 0.892) and 0.964 on CSIQ (compared to 0.946).
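
The SRCC figures quoted in the abstract are Spearman rank-order correlation coefficients between a model's predicted quality scores and human subjective scores. The following is a minimal sketch of how such a value can be computed, not the authors' evaluation code. The function names `average_ranks` and `srcc` and the sample scores are hypothetical and chosen only for illustration; tied values are assumed to share the mean of their ranks.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def srcc(predicted: Sequence[float], subjective: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rs = average_ranks(predicted), average_ranks(subjective)
    mp, ms = sum(rp) / len(rp), sum(rs) / len(rs)
    cov = sum((a - mp) * (b - ms) for a, b in zip(rp, rs))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_s = sum((b - ms) ** 2 for b in rs)
    if var_p == 0 or var_s == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_p * var_s) ** 0.5


# Hypothetical scores for five images (illustrative values, not from the paper).
model_scores = [0.91, 0.35, 0.62, 0.48, 0.80]
mos = [4.6, 1.9, 3.8, 2.5, 3.1]
print(round(srcc(model_scores, mos), 3))
```

With these sample values the two rankings differ only in the order of two images, so the script prints 0.9. A value of 1.0 would mean the model ranks every image exactly as the human raters do, which is why SRCC is read as a measure of ranking agreement rather than of absolute score error.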

Original language	English
Title of host publication	Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
Editors	Qingshan Liu, Hanzi Wang, Rongrong Ji, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	384-395
Number of pages	12
ISBN (Print)	9789819985364
DOIs	https://doi.org/10.1007/978-981-99-8537-1_31
Publication status	Published - 2024
Event	6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 - Xiamen, China; Duration: 13 Oct 2023 → 15 Oct 2023

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	14430 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Country/Territory	China
City	Xiamen
Period	13/10/23 → 15/10/23

Keywords

Distillation
Image quality assessment
Transformer

Cite this

Gao, T., Jin, W., Lai, B., Chen, Z., Hu, R., Zhang, Y., & Dai, P. (2024). Cross-Dataset Distillation with Multi-tokens for Image Quality Assessment. In Q. Liu, H. Wang, R. Ji, Z. Ma, W. Zheng, H. Zha, X. Chen, & L. Wang (Eds.), Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings (pp. 384-395). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14430 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-99-8537-1_31

@inproceedings{4f47600c53d54cf99c5c10f16a4b6a60,

title = "Cross-Dataset Distillation with Multi-tokens for Image Quality Assessment",

keywords = "Distillation, Image quality assessment, Transformer",

author = "Timin Gao and Weixuan Jin and Bokai Lai and Zhen Chen and Runze Hu and Yan Zhang and Pingyang Dai",

note = "Publisher Copyright: {\textcopyright} 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.; 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 ; Conference date: 13-10-2023 Through 15-10-2023",

year = "2024",

doi = "10.1007/978-981-99-8537-1_31",

language = "English",

isbn = "9789819985364",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "384--395",

editor = "Qingshan Liu and Hanzi Wang and Rongrong Ji and Zhanyu Ma and Weishi Zheng and Hongbin Zha and Xilin Chen and Liang Wang",

booktitle = "Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings",

address = "Germany",

}

TY - GEN

T1 - Cross-Dataset Distillation with Multi-tokens for Image Quality Assessment

AU - Gao, Timin

AU - Jin, Weixuan

AU - Lai, Bokai

AU - Chen, Zhen

AU - Hu, Runze

AU - Zhang, Yan

AU - Dai, Pingyang

PY - 2024

Y1 - 2024

KW - Distillation

KW - Image quality assessment

KW - Transformer

UR - http://www.scopus.com/inward/record.url?scp=85181763175&partnerID=8YFLogxK

U2 - 10.1007/978-981-99-8537-1_31

DO - 10.1007/978-981-99-8537-1_31

M3 - Conference contribution

AN - SCOPUS:85181763175

SN - 9789819985364

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 384

EP - 395

BT - Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings

A2 - Liu, Qingshan

A2 - Wang, Hanzi

A2 - Ji, Rongrong

A2 - Ma, Zhanyu

A2 - Zheng, Weishi

A2 - Zha, Hongbin

A2 - Chen, Xilin

A2 - Wang, Liang

PB - Springer Science and Business Media Deutschland GmbH

T2 - 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023

Y2 - 13 October 2023 through 15 October 2023

ER -