Detail-Preserving Transformer for Light Field Image Super-resolution

Shunzhou Wang; Tianfei Zhou; Yao Lu; Huijun Di

doi:10.1609/aaai.v36i3.20153

Shunzhou Wang, Tianfei Zhou, Yao Lu^*, Huijun Di

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

97 Citations (Scopus)

Abstract

Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in the global relation modeling of sub-aperture images that is necessary to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards the sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which also maintains the locality of each sub-aperture image. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed DPT) that leverages gradient maps of the light field to guide the sequence learning. DPT consists of two branches, each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance compared with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.
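
The sequence construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' DPT code (that is in the repository linked above); it is only one reading of the abstract, under stated assumptions: the light field is a grid `lf[u][v]` of sub-aperture images, a "horizontal" sequence fixes the angular row `u`, a "vertical" sequence fixes the angular column `v`, and the gradient map is a forward-difference gradient magnitude. The function names, the toy data, and the choice of gradient operator are all hypothetical, since the abstract does not specify them.

```python
# Illustrative sketch only; not the authors' DPT implementation.
# Assumption: a light field is lf[u][v], a grid of sub-aperture images,
# and each image is a list of rows of floats.


def horizontal_sequences(lf):
    """One sequence per angular row u: the views lf[u][0], lf[u][1], ..."""
    return [list(row) for row in lf]


def vertical_sequences(lf):
    """One sequence per angular column v: the views lf[0][v], lf[1][v], ..."""
    n_rows, n_cols = len(lf), len(lf[0])
    return [[lf[u][v] for u in range(n_rows)] for v in range(n_cols)]


def gradient_map(img):
    """Gradient magnitude from forward differences.

    Assumption: the abstract does not name a gradient operator, so forward
    differences are used here, with a zero difference at the far border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            gy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def make_toy_light_field(n_ang=3, size=4):
    """Hypothetical toy data: view (u, v) is an intensity ramp offset by u + v."""
    return [
        [
            [[float(x + y + u + v) for x in range(size)] for y in range(size)]
            for v in range(n_ang)
        ]
        for u in range(n_ang)
    ]


lf = make_toy_light_field()
h_seqs = horizontal_sequences(lf)  # 3 sequences of 3 views each
v_seqs = vertical_sequences(lf)    # 3 sequences of 3 views each

# The gradient sequences mirror the image sequences, one gradient map per
# view; these would feed the second (gradient) branch of a two-branch model.
h_grad_seqs = [[gradient_map(view) for view in seq] for seq in h_seqs]
```

In a two-branch design such as the one the abstract outlines, the image sequences and the gradient sequences would each be processed by their own Transformer before fusion. The attention layers and the fusion step are beyond this sketch.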

Original language	English
Title of host publication	AAAI-22 Technical Tracks 3
Publisher	Association for the Advancement of Artificial Intelligence
Pages	2522-2530
Number of pages	9
ISBN (Electronic)	1577358767, 9781577358763
DOIs	https://doi.org/10.1609/aaai.v36i3.20153
Publication status	Published - 30 Jun 2022
Event	36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration	22 Feb 2022 → 1 Mar 2022

Publication series

Name	Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume	36

Conference

Conference	36th AAAI Conference on Artificial Intelligence, AAAI 2022
City	Virtual, Online
Period	22/02/22 → 1/03/22

Access to Document

10.1609/aaai.v36i3.20153

Cite this

Wang, S., Zhou, T., Lu, Y., & Di, H. (2022). Detail-Preserving Transformer for Light Field Image Super-resolution. In AAAI-22 Technical Tracks 3 (pp. 2522-2530). (Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022; Vol. 36). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i3.20153

@inproceedings{4f7d722094194fe4a76b5c6bd1fbd561,

title = "Detail-Preserving Transformer for Light Field Image Super-resolution",

abstract = "Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in global relation modeling of sub-aperture images necessarily to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed as DPT), by leveraging gradient maps of light field to guide the sequence learning. DPT consists of two branches, with each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance comparing with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.",

author = "Shunzhou Wang and Tianfei Zhou and Yao Lu and Huijun Di",

note = "Publisher Copyright: Copyright {\textcopyright} 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 36th AAAI Conference on Artificial Intelligence, AAAI 2022 ; Conference date: 22-02-2022 Through 01-03-2022",

year = "2022",

month = jun,

day = "30",

doi = "10.1609/aaai.v36i3.20153",

language = "English",

series = "Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022",

publisher = "Association for the Advancement of Artificial Intelligence",

pages = "2522--2530",

booktitle = "AAAI-22 Technical Tracks 3",

}

Wang, S, Zhou, T, Lu, Y & Di, H 2022, Detail-Preserving Transformer for Light Field Image Super-resolution. in AAAI-22 Technical Tracks 3. Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022, vol. 36, Association for the Advancement of Artificial Intelligence, pp. 2522-2530, 36th AAAI Conference on Artificial Intelligence, AAAI 2022, Virtual, Online, 22/02/22. https://doi.org/10.1609/aaai.v36i3.20153

Detail-Preserving Transformer for Light Field Image Super-resolution. / Wang, Shunzhou; Zhou, Tianfei; Lu, Yao et al.
AAAI-22 Technical Tracks 3. Association for the Advancement of Artificial Intelligence, 2022. p. 2522-2530 (Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022; Vol. 36).

TY - GEN

T1 - Detail-Preserving Transformer for Light Field Image Super-resolution

AU - Wang, Shunzhou

AU - Zhou, Tianfei

AU - Lu, Yao

AU - Di, Huijun

PY - 2022/6/30

Y1 - 2022/6/30

N2 - Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in global relation modeling of sub-aperture images necessarily to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed as DPT), by leveraging gradient maps of light field to guide the sequence learning. DPT consists of two branches, with each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance comparing with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.

UR - http://www.scopus.com/inward/record.url?scp=85124232816&partnerID=8YFLogxK

U2 - 10.1609/aaai.v36i3.20153

DO - 10.1609/aaai.v36i3.20153

M3 - Conference contribution

AN - SCOPUS:85124232816

T3 - Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022

SP - 2522

EP - 2530

BT - AAAI-22 Technical Tracks 3

PB - Association for the Advancement of Artificial Intelligence

T2 - 36th AAAI Conference on Artificial Intelligence, AAAI 2022

Y2 - 22 February 2022 through 1 March 2022

ER -
