TY - JOUR
T1 - MoNET: No-reference image quality assessment based on a multi-depth output network
T2 - Journal of Electronic Imaging
AU - Sang, Qingbing
AU - Su, Chenfei
AU - Zhu, Lingying
AU - Liu, Lixiong
AU - Wu, Xiaojun
AU - Bovik, Alan C.
N1 - Publisher Copyright:
© 2021 SPIE and IS&T.
PY - 2021/7/1
Y1 - 2021/7/1
N2 - When deep convolutional neural networks perform feature extraction, the features computed at each layer express different abstractions of visual information. The earlier layers extract highly compact low-level features such as bandpass and directional primitives, whereas deeper layers extract structural features of increasing abstraction, such as contours, shapes, and edges, becoming less effable as the depth increases. We propose a different kind of end-to-end no-reference (NR) image quality assessment (IQA) model, which we call a multi-depth output convolutional neural network (MoNET). It maps both shallow and deep features to perceived quality. MoNET delivers three outputs that express shallow (low-level) and deep (high-level) features and maps them to subjective quality scores. The multiple outputs are then combined into a single, final quality score. Because MoNET combines the responses of three learning machines, it may be viewed as a form of ensemble learning. Experimental results on three public image quality databases show that the proposed model achieves better performance than other state-of-the-art NR IQA algorithms.
AB - When deep convolutional neural networks perform feature extraction, the features computed at each layer express different abstractions of visual information. The earlier layers extract highly compact low-level features such as bandpass and directional primitives, whereas deeper layers extract structural features of increasing abstraction, such as contours, shapes, and edges, becoming less effable as the depth increases. We propose a different kind of end-to-end no-reference (NR) image quality assessment (IQA) model, which we call a multi-depth output convolutional neural network (MoNET). It maps both shallow and deep features to perceived quality. MoNET delivers three outputs that express shallow (low-level) and deep (high-level) features and maps them to subjective quality scores. The multiple outputs are then combined into a single, final quality score. Because MoNET combines the responses of three learning machines, it may be viewed as a form of ensemble learning. Experimental results on three public image quality databases show that the proposed model achieves better performance than other state-of-the-art NR IQA algorithms.
KW - ensemble learning
KW - image quality assessment
KW - multi-depth output convolutional neural network
KW - no-reference
UR - http://www.scopus.com/inward/record.url?scp=85114433922&partnerID=8YFLogxK
U2 - 10.1117/1.JEI.30.4.043007
DO - 10.1117/1.JEI.30.4.043007
M3 - Article
AN - SCOPUS:85114433922
SN - 1017-9909
VL - 30
JO - Journal of Electronic Imaging
JF - Journal of Electronic Imaging
IS - 4
M1 - 043007
ER -
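
For orientation, below is a minimal sketch (not the authors' published implementation) of the multi-depth output idea described in the abstract: quality predictions taken from features at three increasing network depths are fused into a single score, in the spirit of ensemble learning. The class name MoNETSketch, the layer sizes, the number of stages, and the simple averaging rule are assumptions for illustration only.

# Minimal sketch of a multi-depth output CNN for no-reference IQA.
# Architectural details (channel widths, depths, fusion rule) are assumed,
# not taken from the paper.
import torch
import torch.nn as nn

class MoNETSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shallow, intermediate, and deep convolutional stages (depths assumed).
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.stage3 = nn.Sequential(
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        # One regression head per depth, each predicting a quality score.
        self.head1 = self._make_head(32)
        self.head2 = self._make_head(64)
        self.head3 = self._make_head(128)

    @staticmethod
    def _make_head(channels):
        return nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1))

    def forward(self, x):
        f1 = self.stage1(x)   # shallow (low-level) features
        f2 = self.stage2(f1)  # intermediate features
        f3 = self.stage3(f2)  # deep (high-level) features
        q1, q2, q3 = self.head1(f1), self.head2(f2), self.head3(f3)
        # Ensemble-style fusion of the three per-depth scores; plain averaging
        # is an assumption here, the paper may use a learned combination.
        return (q1 + q2 + q3) / 3.0

# Usage: predict a quality score for one RGB image patch.
model = MoNETSketch()
score = model(torch.randn(1, 3, 224, 224))
print(score.item())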