@inproceedings{5a4c0dc15fbd45819f0d3940a8da177a,
title = "Image Inpainting with Semantic-Aware Transformer",
abstract = "Image inpainting has made huge strides benefiting from the advantages of convolutional neural networks (CNNs) in understanding high-level semantics. Recently, some studies have applied transformers to the visual field to solve the problem that the convolution kernel cannot attend to longdistance information. However, unlike other vision tasks, there is much interference from damaged information in image inpainting tasks. We propose a new Semantic-Aware Transformer, which in addition to including a self-attention block like previous vision transformers, also has a block for learning semantics from QSVM. Specifically, to provide more valid information, we design a Quantized Semantic Vector Memory (QSVM) that encodes and saves semantic features in images as quantized vectors in latent space. Experiments on different datasets demonstrate the effectiveness and superiority of our method compared with the existing state-of-the-art.",
keywords = "Computer Vision, Image Inpainting, VQ-VAE, Vision Transformer",
author = "Shiyu Chen and Wenxin Yu and Qi Wang and Jun Gong and Peng Chen",
note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",
year = "2023",
doi = "10.1109/ICASSP49357.2023.10095496",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings",
address = "United States",
}
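
A minimal sketch of the idea behind the Quantized Semantic Vector Memory (QSVM) named in the abstract: a VQ-VAE-style codebook that snaps encoder features to their nearest learned semantic vectors in latent space. The class name, tensor shapes, and hyperparameters below are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class QuantizedSemanticMemory(nn.Module):
    """Hypothetical QSVM-like module: a learnable codebook of quantized semantic vectors."""

    def __init__(self, num_codes: int = 512, code_dim: int = 256):
        super().__init__()
        # Learnable codebook holding the quantized semantic vectors.
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: encoder features of shape (batch, tokens, code_dim).
        flat = z.reshape(-1, z.size(-1))                    # (B*T, code_dim)
        dists = torch.cdist(flat, self.codebook.weight)     # (B*T, num_codes)
        indices = dists.argmin(dim=-1)                      # nearest codebook entry per token
        z_q = self.codebook(indices).view_as(z)             # quantized features
        # Straight-through estimator so gradients still reach the encoder.
        return z + (z_q - z).detach()

# Usage sketch: the quantized features could feed the transformer block that
# "learns semantics from QSVM", alongside the ordinary self-attention block.
qsvm = QuantizedSemanticMemory()
features = torch.randn(2, 196, 256)
quantized = qsvm(features)   # same shape, snapped to the nearest semantic vectors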