Structure-Guided Image Inpainting Based on Multi-Scale Attention Pyramid Network

Jun Gong, Senlin Luo, Wenxin Yu, Liang Nie*

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

Current single-view image inpainting methods often make poor use of the available image information and produce suboptimal repairs. To address these challenges, this paper introduces an image inpainting framework built on a structure-guided multi-scale attention pyramid network. The network consists of a structural repair network and a multi-scale attention pyramid semantic repair network. The structural repair network uses a dual-branch U-Net for robust structure prediction under strong constraints. The predicted structural view then serves as auxiliary information for the semantic repair network, which exploits the pyramid structure to extract multi-scale image features and refines them through an attention feature fusion module. In addition, a separable gated convolution strategy is employed during feature extraction to minimize the impact of invalid information from missing areas, thereby improving restoration quality. Experiments on standard datasets such as Paris Street View and CelebA demonstrate the superiority of the approach over existing methods in both quantitative and qualitative comparisons. Ablation studies that incrementally integrate the proposed mechanisms into a baseline model substantiate the effectiveness of the multi-view restoration strategy, separable gated convolution, and multi-scale attention feature fusion.
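
The abstract does not spell out how the separable gated convolution is defined, so the following is only a minimal sketch of the general idea, under stated assumptions: a feature branch and a gating branch are each computed with a depthwise 3x3 convolution followed by a pointwise 1x1 convolution, and the output is the activated feature multiplied element-wise by the sigmoid of the gate. The function names, the zero padding, and the choice of ReLU as activation are illustrative assumptions, not the authors' implementation.

```python
import math

def depthwise_conv3x3(x, kernels):
    """Apply one 3x3 kernel per channel (stride 1, zero padding). x is [C][H][W]."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for c in range(channels):
        for i in range(height):
            for j in range(width):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < height and 0 <= jj < width:
                            acc += kernels[c][di + 1][dj + 1] * x[c][ii][jj]
                out[c][i][j] = acc
    return out

def pointwise_conv(x, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output channel o."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [[sum(w_o[c] * x[c][i][j] for c in range(len(x))) for j in range(width)]
         for i in range(height)]
        for w_o in weights
    ]

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def separable_gated_conv(x, feat_dw, feat_pw, gate_dw, gate_pw):
    """Hypothetical separable gated convolution: ReLU(feature) * sigmoid(gate)."""
    feature = pointwise_conv(depthwise_conv3x3(x, feat_dw), feat_pw)
    gate = pointwise_conv(depthwise_conv3x3(x, gate_dw), gate_pw)
    return [
        [[max(f, 0.0) * sigmoid(g) for f, g in zip(f_row, g_row)]
         for f_row, g_row in zip(f_ch, g_ch)]
        for f_ch, g_ch in zip(feature, gate)
    ]

# Toy usage: two 3x3 input channels of ones, one output channel.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
zeros = [[0.0] * 3 for _ in range(3)]
x = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
out = separable_gated_conv(
    x,
    feat_dw=[identity, identity], feat_pw=[[1.0, 1.0]],
    gate_dw=[zeros, zeros], gate_pw=[[1.0, 1.0]],
)
print(out[0][0])  # [1.0, 1.0, 1.0]: feature 2.0 scaled by a gate of 0.5
```

In a trained network the gate weights are learned, so the gate can move toward 0 over missing regions and toward 1 over valid ones, which is how gating can limit the influence of invalid pixels. The depthwise-plus-pointwise factorization is a common way to reduce the parameter count compared with a full convolution in each branch.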

Original language: English
Article number: 8325
Journal: Applied Sciences (Switzerland)
Volume: 14
Issue number: 18
Publication status: Published - Sept 2024

Keywords

  • image inpainting
  • image processing
  • multi-scale attention pyramid
  • multi-view inpainting
