TY - JOUR
T1 - CCIR
T2 - high fidelity face super-resolution with controllable conditions in diffusion models
AU - Chen, Yaxin
AU - Du, Huiqian
AU - Xie, Min
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
PY - 2024/12
Y1 - 2024/12
N2 - Diffusion probabilistic models have demonstrated great potential in producing realistic-looking super-resolution (SR) images. However, realism does not guarantee that the SR images are faithful to the ground-truth high-resolution images. This paper develops a novel training-free framework, Iterative Refinement with Controllable Condition (CCIR), for face SR based on controllable prior conditions in a diffusion model. The goal is to generate SR images that are both realistic and faithful to the ground truth by controlling the prior conditions. Our framework consists of a pre-trained SR network (Local Implicit Image Function, LIIF) and a pre-trained diffusion model. The LIIF enhances the conditions provided by low-resolution images, while the diffusion model recovers fine details in the SR images. Notably, for the diffusion model, we propose a non-uniform low-pass filtering sampling strategy that dynamically adds controllable conditions to latent features during the sampling process. This strategy provides a flexible balance between fidelity and realism in SR images, enabling the restoration of highly similar SR images from the same low-resolution input with different noise samples. Extensive experiments on a facial SR benchmark demonstrate that CCIR outperforms state-of-the-art single image super-resolution (SISR) methods in both qualitative and quantitative assessments, particularly when magnifying very-low-resolution images or applying high magnification factors.
AB - Diffusion probabilistic models have demonstrated great potential in producing realistic-looking super-resolution (SR) images. However, realism does not guarantee that the SR images are faithful to the ground-truth high-resolution images. This paper develops a novel training-free framework, Iterative Refinement with Controllable Condition (CCIR), for face SR based on controllable prior conditions in a diffusion model. The goal is to generate SR images that are both realistic and faithful to the ground truth by controlling the prior conditions. Our framework consists of a pre-trained SR network (Local Implicit Image Function, LIIF) and a pre-trained diffusion model. The LIIF enhances the conditions provided by low-resolution images, while the diffusion model recovers fine details in the SR images. Notably, for the diffusion model, we propose a non-uniform low-pass filtering sampling strategy that dynamically adds controllable conditions to latent features during the sampling process. This strategy provides a flexible balance between fidelity and realism in SR images, enabling the restoration of highly similar SR images from the same low-resolution input with different noise samples. Extensive experiments on a facial SR benchmark demonstrate that CCIR outperforms state-of-the-art single image super-resolution (SISR) methods in both qualitative and quantitative assessments, particularly when magnifying very-low-resolution images or applying high magnification factors.
KW - Controllable condition
KW - Diffusion probabilistic model
KW - Sampling strategy
KW - Single image super-resolution
UR - http://www.scopus.com/inward/record.url?scp=85202960023&partnerID=8YFLogxK
U2 - 10.1007/s11760-024-03502-9
DO - 10.1007/s11760-024-03502-9
M3 - Article
AN - SCOPUS:85202960023
SN - 1863-1703
VL - 18
SP - 8707
EP - 8721
JO - Signal, Image and Video Processing
JF - Signal, Image and Video Processing
IS - 12
ER -