Structure-Guided Image Inpainting Based on Multi-Scale Attention Pyramid Network

Jun Gong; Senlin Luo; Wenxin Yu; Liang Nie

doi:10.3390/app14188325

Corresponding author: Liang Nie

Southwest University of Science and Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Current single-view image inpainting methods often suffer from low image information utilization and suboptimal repair outcomes. To address these challenges, this paper introduces a novel image inpainting framework that leverages a structure-guided multi-scale attention pyramid network. This network consists of a structural repair network and a multi-scale attention pyramid semantic repair network. The structural repair component utilizes a dual-branch U-Net network for robust structure prediction under strong constraints. The predicted structural view then serves as auxiliary information for the semantic repair network. This latter network exploits the pyramid structure to extract multi-scale features of the image, which are further refined through an attention feature fusion module. Additionally, a separable gated convolution strategy is employed during feature extraction to minimize the impact of invalid information from missing areas, thereby enhancing the restoration quality. Experiments conducted on standard datasets such as Paris Street View and CelebA demonstrate the superiority of our approach over existing methods through quantitative and qualitative comparisons. Further ablation studies, by incrementally integrating proposed mechanisms into a baseline model, substantiate the effectiveness of our multi-view restoration strategy, separable gated convolution, and multi-scale attention feature fusion.
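The gating idea mentioned in the abstract can be illustrated with a minimal sketch of a plain gated convolution: one convolution produces features, a second produces a per-pixel soft gate in (0, 1), and the two are multiplied so that responses from missing regions can be suppressed. This is not the authors' implementation. The abstract does not specify how the "separable" variant is factorized, so that part is not reproduced here. The function names, the ReLU activation, and the hand-set kernels and bias are all assumptions chosen for illustration; in a trained network the weights would be learned.

```python
import math


def conv2d_same(channels, kernels, bias=0.0):
    """Zero-padded 'same' cross-correlation, summed over input channels."""
    h, w = len(channels[0]), len(channels[0][0])
    out = [[bias] * w for _ in range(h)]
    for image, kernel in zip(channels, kernels):
        kh, kw = len(kernel), len(kernel[0])
        for i in range(h):
            for j in range(w):
                for di in range(kh):
                    for dj in range(kw):
                        y, x = i + di - kh // 2, j + dj - kw // 2
                        if 0 <= y < h and 0 <= x < w:
                            out[i][j] += image[y][x] * kernel[di][dj]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_conv2d(channels, feature_kernels, gate_kernels, gate_bias=0.0):
    """Return relu(feature branch) * sigmoid(gate branch), element-wise."""
    features = conv2d_same(channels, feature_kernels)
    gates = conv2d_same(channels, gate_kernels, gate_bias)
    return [
        [max(f, 0.0) * sigmoid(g) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gates)
    ]


# Hypothetical hand-set weights (assumption): the feature branch copies the
# image channel, the gate branch looks only at the validity mask channel.
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
GATE_ON_MASK = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]

# 3x3 image whose centre pixel is missing and holds an invalid value (9.0);
# mask is 1 for valid pixels and 0 for the hole.
image = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
mask = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]

out = gated_conv2d(
    [image, mask],
    feature_kernels=[IDENTITY, ZERO],
    gate_kernels=[ZERO, GATE_ON_MASK],
    gate_bias=-5.0,
)
```

With these illustrative weights the gate is close to 1 at valid pixels and close to 0 at the hole, so the invalid value at the centre is strongly attenuated while valid pixels pass through almost unchanged.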

Original language	English
Article number	8325
Journal	Applied Sciences (Switzerland)
Volume	14
Issue number	18
DOIs	https://doi.org/10.3390/app14188325
Publication status	Published - Sept 2024

Keywords

image inpainting
image processing
multi-scale attention pyramid
multi-view inpainting

Cite this

@article{94a0d24fa192488884a130020c189641,

title = "Structure-Guided Image Inpainting Based on Multi-Scale Attention Pyramid Network",

abstract = "Current single-view image inpainting methods often suffer from low image information utilization and suboptimal repair outcomes. To address these challenges, this paper introduces a novel image inpainting framework that leverages a structure-guided multi-scale attention pyramid network. This network consists of a structural repair network and a multi-scale attention pyramid semantic repair network. The structural repair component utilizes a dual-branch U-Net network for robust structure prediction under strong constraints. The predicted structural view then serves as auxiliary information for the semantic repair network. This latter network exploits the pyramid structure to extract multi-scale features of the image, which are further refined through an attention feature fusion module. Additionally, a separable gated convolution strategy is employed during feature extraction to minimize the impact of invalid information from missing areas, thereby enhancing the restoration quality. Experiments conducted on standard datasets such as Paris Street View and CelebA demonstrate the superiority of our approach over existing methods through quantitative and qualitative comparisons. Further ablation studies, by incrementally integrating proposed mechanisms into a baseline model, substantiate the effectiveness of our multi-view restoration strategy, separable gated convolution, and multi-scale attention feature fusion.",

keywords = "image inpainting, image processing, multi-scale attention pyramid, multi-view inpainting",

author = "Jun Gong and Senlin Luo and Wenxin Yu and Liang Nie",

note = "Publisher Copyright: {\textcopyright} 2024 by the authors.",

year = "2024",

month = sep,

doi = "10.3390/app14188325",

language = "English",

volume = "14",

journal = "Applied Sciences (Switzerland)",

issn = "2076-3417",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "18",

}

TY - JOUR

T1 - Structure-Guided Image Inpainting Based on Multi-Scale Attention Pyramid Network

AU - Gong, Jun

AU - Luo, Senlin

AU - Yu, Wenxin

AU - Nie, Liang

PY - 2024/9

Y1 - 2024/9

AB - Current single-view image inpainting methods often suffer from low image information utilization and suboptimal repair outcomes. To address these challenges, this paper introduces a novel image inpainting framework that leverages a structure-guided multi-scale attention pyramid network. This network consists of a structural repair network and a multi-scale attention pyramid semantic repair network. The structural repair component utilizes a dual-branch U-Net network for robust structure prediction under strong constraints. The predicted structural view then serves as auxiliary information for the semantic repair network. This latter network exploits the pyramid structure to extract multi-scale features of the image, which are further refined through an attention feature fusion module. Additionally, a separable gated convolution strategy is employed during feature extraction to minimize the impact of invalid information from missing areas, thereby enhancing the restoration quality. Experiments conducted on standard datasets such as Paris Street View and CelebA demonstrate the superiority of our approach over existing methods through quantitative and qualitative comparisons. Further ablation studies, by incrementally integrating proposed mechanisms into a baseline model, substantiate the effectiveness of our multi-view restoration strategy, separable gated convolution, and multi-scale attention feature fusion.

KW - image inpainting

KW - image processing

KW - multi-scale attention pyramid

KW - multi-view inpainting

UR - http://www.scopus.com/inward/record.url?scp=85205301098&partnerID=8YFLogxK

U2 - 10.3390/app14188325

DO - 10.3390/app14188325

M3 - Article

AN - SCOPUS:85205301098

SN - 2076-3417

VL - 14

JO - Applied Sciences (Switzerland)

JF - Applied Sciences (Switzerland)

IS - 18

M1 - 8325

ER -
