TY - GEN
T1 - Cross-Dataset Distillation with Multi-tokens for Image Quality Assessment
AU - Gao, Timin
AU - Jin, Weixuan
AU - Lai, Bokai
AU - Chen, Zhen
AU - Hu, Runze
AU - Zhang, Yan
AU - Dai, Pingyang
N1 - Publisher Copyright:
© 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2024
Y1 - 2024
AB - No-Reference Image Quality Assessment (NR-IQA) aims to accurately evaluate image distortion by simulating human assessment. This task is challenging due to the diversity of distortion types and the scarcity of labeled data. To address these issues, we propose a novel attention-distillation-based method for NR-IQA. Our approach integrates knowledge from different datasets to enrich the representation of image quality and improve prediction accuracy. Specifically, we introduce a distillation token in the Transformer encoder, enabling the student model to learn from the teacher across different datasets. By leveraging knowledge from diverse sources, our model captures essential features related to image distortion and generalizes better. Furthermore, to refine perceptual information from multiple perspectives, we introduce multiple class tokens that simulate multiple reviewers; this improves the interpretability of the model and reduces prediction uncertainty. Additionally, we introduce a mechanism called Attention Scoring, which combines the attention-scoring matrix from the encoder with the MLP head behind the decoder to refine the final quality score. In extensive evaluations on six standard NR-IQA datasets, our method achieves performance comparable to state-of-the-art NR-IQA approaches; notably, it attains SRCC values of 0.932 on TID2013 (compared to 0.892) and 0.964 on CSIQ (compared to 0.946).
KW - Distillation
KW - Image quality assessment
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85181763175&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-8537-1_31
DO - 10.1007/978-981-99-8537-1_31
M3 - Conference contribution
AN - SCOPUS:85181763175
SN - 9789819985364
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 384
EP - 395
BT - Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
A2 - Liu, Qingshan
A2 - Wang, Hanzi
A2 - Ji, Rongrong
A2 - Ma, Zhanyu
A2 - Zheng, Weishi
A2 - Zha, Hongbin
A2 - Chen, Xilin
A2 - Wang, Liang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Y2 - 13 October 2023 through 15 October 2023
ER -