Detail-Preserving Transformer for Light Field Image Super-resolution

Shunzhou Wang, Tianfei Zhou, Yao Lu^*, Huijun Di

^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

84 Citations (Scopus)

Abstract

Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in the global relation modeling of sub-aperture images that is necessary to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed DPT), which leverages gradient maps of the light field to guide the sequence learning. DPT consists of two branches, each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance compared with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.
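The sequence formulation in the abstract can be illustrated with a small sketch. This is not the authors' implementation (their code is at the repository linked above); it only shows, under stated assumptions, how a grid of sub-aperture images might be arranged into horizontal and vertical sequences and how a gradient-map sequence might be derived for a second branch. The function names are hypothetical, the light field is assumed to be a U x V grid of grayscale images stored as nested lists, and the forward-difference gradient is an assumed choice, since the abstract does not say how the gradient maps are computed.

```python
from typing import List

Image = List[List[float]]        # H x W grid of intensities
LightField = List[List[Image]]   # U x V grid of sub-aperture images


def horizontal_sequences(lf: LightField) -> List[List[Image]]:
    """One sequence per angular row u: views (u, 0), (u, 1), ..., (u, V-1)."""
    return [list(row) for row in lf]


def vertical_sequences(lf: LightField) -> List[List[Image]]:
    """One sequence per angular column v: views (0, v), (1, v), ..., (U-1, v)."""
    return [list(col) for col in zip(*lf)]


def gradient_map(img: Image) -> Image:
    """Gradient magnitude from forward differences (assumed operator).

    The difference is taken as zero at the last column and last row.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            row.append((dx * dx + dy * dy) ** 0.5)
        out.append(row)
    return out


def gradient_sequences(seqs: List[List[Image]]) -> List[List[Image]]:
    """Apply gradient_map to every view of every sequence."""
    return [[gradient_map(view) for view in seq] for seq in seqs]
```

In this sketch, the image sequences and the gradient sequences would each feed one of the two branches that the abstract describes; the Transformer layers and the fusion step are omitted.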

Original language	English
Title of host publication	AAAI-22 Technical Tracks 3
Publisher	Association for the Advancement of Artificial Intelligence
Pages	2522-2530
Number of pages	9
ISBN (Electronic)	1577358767, 9781577358763
Publication status	Published - 30 Jun 2022
Event	36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online Duration: 22 Feb 2022 → 1 Mar 2022

Publication series

Name	Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume	36

Conference

Conference	36th AAAI Conference on Artificial Intelligence, AAAI 2022
City	Virtual, Online
Period	22/02/22 → 1/03/22

Cite this

@inproceedings{4f7d722094194fe4a76b5c6bd1fbd561,

title = "Detail-Preserving Transformer for Light Field Image Super-resolution",

abstract = "Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in global relation modeling of sub-aperture images necessarily to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed as DPT), by leveraging gradient maps of light field to guide the sequence learning. DPT consists of two branches, with each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance comparing with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.",

author = "Shunzhou Wang and Tianfei Zhou and Yao Lu and Huijun Di",

note = "Publisher Copyright: Copyright {\textcopyright} 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 36th AAAI Conference on Artificial Intelligence, AAAI 2022 ; Conference date: 22-02-2022 Through 01-03-2022",

year = "2022",

month = jun,

day = "30",

language = "English",

series = "Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022",

publisher = "Association for the Advancement of Artificial Intelligence",

pages = "2522--2530",

booktitle = "AAAI-22 Technical Tracks 3",

}

Wang, S, Zhou, T, Lu, Y & Di, H 2022, Detail-Preserving Transformer for Light Field Image Super-resolution. in AAAI-22 Technical Tracks 3. Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022, vol. 36, Association for the Advancement of Artificial Intelligence, pp. 2522-2530, 36th AAAI Conference on Artificial Intelligence, AAAI 2022, Virtual, Online, 22/02/22.

Detail-Preserving Transformer for Light Field Image Super-resolution. / Wang, Shunzhou; Zhou, Tianfei; Lu, Yao et al.
AAAI-22 Technical Tracks 3. Association for the Advancement of Artificial Intelligence, 2022. p. 2522-2530 (Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022; Vol. 36).

TY - GEN

T1 - Detail-Preserving Transformer for Light Field Image Super-resolution

AU - Wang, Shunzhou

AU - Zhou, Tianfei

AU - Lu, Yao

AU - Di, Huijun

PY - 2022/6/30

Y1 - 2022/6/30

N2 - Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in global relation modeling of sub-aperture images necessarily to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed as DPT), by leveraging gradient maps of light field to guide the sequence learning. DPT consists of two branches, with each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance comparing with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.

AB - Recently, numerous algorithms have been developed to tackle the problem of light field super-resolution (LFSR), i.e., super-resolving low-resolution light fields to gain high-resolution views. Despite delivering encouraging results, these approaches are all convolution-based, and are naturally weak in global relation modeling of sub-aperture images necessarily to characterize the inherent structure of light fields. In this paper, we put forth a novel formulation built upon Transformers, by treating LFSR as a sequence-to-sequence reconstruction task. In particular, our model regards sub-aperture images of each vertical or horizontal angular view as a sequence, and establishes long-range geometric dependencies within each sequence via a spatial-angular locally-enhanced self-attention layer, which maintains the locality of each sub-aperture image as well. Additionally, to better recover image details, we propose a detail-preserving Transformer (termed as DPT), by leveraging gradient maps of light field to guide the sequence learning. DPT consists of two branches, with each associated with a Transformer for learning from an original or gradient image sequence. The two branches are finally fused to obtain comprehensive feature representations for reconstruction. Evaluations are conducted on a number of light field datasets, including real-world scenes and synthetic data. The proposed method achieves superior performance comparing with other state-of-the-art schemes. Our code is publicly available at: https://github.com/BITszwang/DPT.

UR - http://www.scopus.com/inward/record.url?scp=85124232816&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85124232816

T3 - Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022

SP - 2522

EP - 2530

BT - AAAI-22 Technical Tracks 3

PB - Association for the Advancement of Artificial Intelligence

T2 - 36th AAAI Conference on Artificial Intelligence, AAAI 2022

Y2 - 22 February 2022 through 1 March 2022

ER -
