Hybrid Spectral Denoising Transformer with Guided Attention

Zeqiang Lai; Chenggang Yan; Ying Fu

doi:10.1109/ICCV51070.2023.01201

Zeqiang Lai, Chenggang Yan, Ying Fu^*

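^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract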

In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising. Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility. To address these issues, we introduce a hybrid approach that combines the advantages of both models with a Spatial-Spectral Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight alternative to 3D convolution, which extracts more spatial-spectral correlated features while keeping the flexibility to tackle HSIs with an arbitrary number of bands. These features are then adaptively processed by GSSA which performs 3D self-attention across the spectral bands, guided by a set of learnable queries that encode the spectral signatures. This not only enriches our model with powerful capabilities for identifying global spectral correlations but also maintains linear complexity. Moreover, our SM-FFN proposes the self-modulation that intensifies the activations of more informative regions, which further strengthens the aggregated features. Extensive experiments are conducted on various datasets under both simulated and real-world noise, and it shows that our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead. Code is at https://github.com/Zeqiang-Lai/HSDT.

Original language	English
Title of host publication	Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	13019-13029
Number of pages	11
ISBN (Electronic)	9798350307184
DOI	https://doi.org/10.1109/ICCV51070.2023.01201
Publication status	Published - 2023
Event	2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 - Paris, France; Duration: 2 Oct 2023 → 6 Oct 2023

Publication series

Name	Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print)	1550-5499

Conference

Conference	2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Country/Territory	France
City	Paris
Period	2 Oct 2023 → 6 Oct 2023

Cite this

Lai, Z., Yan, C., & Fu, Y. (2023). Hybrid Spectral Denoising Transformer with Guided Attention. In Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 (pp. 13019-13029). (Proceedings of the IEEE International Conference on Computer Vision). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV51070.2023.01201

@inproceedings{3f61e77032694c249f9e39eb7d92c5b9,
  title = "Hybrid Spectral Denoising Transformer with Guided Attention",
  abstract = "In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising. Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility. To address these issues, we introduce a hybrid approach that combines the advantages of both models with a Spatial-Spectral Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight alternative to 3D convolution, which extracts more spatial-spectral correlated features while keeping the flexibility to tackle HSIs with an arbitrary number of bands. These features are then adaptively processed by GSSA which performs 3D self-attention across the spectral bands, guided by a set of learnable queries that encode the spectral signatures. This not only enriches our model with powerful capabilities for identifying global spectral correlations but also maintains linear complexity. Moreover, our SM-FFN proposes the self-modulation that intensifies the activations of more informative regions, which further strengthens the aggregated features. Extensive experiments are conducted on various datasets under both simulated and real-world noise, and it shows that our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead. Code is at https://github.com/Zeqiang-Lai/HSDT.",
  author = "Zeqiang Lai and Chenggang Yan and Ying Fu",
  note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 ; Conference date: 02-10-2023 Through 06-10-2023",
  year = "2023",
  doi = "10.1109/ICCV51070.2023.01201",
  language = "English",
  series = "Proceedings of the IEEE International Conference on Computer Vision",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  pages = "13019--13029",
  booktitle = "Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023",
  address = "United States",
}

Lai, Z, Yan, C & Fu, Y 2023, Hybrid Spectral Denoising Transformer with Guided Attention. in Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023. Proceedings of the IEEE International Conference on Computer Vision, Institute of Electrical and Electronics Engineers Inc., pp. 13019-13029, 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, 2/10/23. https://doi.org/10.1109/ICCV51070.2023.01201

Hybrid Spectral Denoising Transformer with Guided Attention. / Lai, Zeqiang; Yan, Chenggang; Fu, Ying.
Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023. Institute of Electrical and Electronics Engineers Inc., 2023. pp. 13019-13029 (Proceedings of the IEEE International Conference on Computer Vision).

TY - GEN

T1 - Hybrid Spectral Denoising Transformer with Guided Attention

AU - Lai, Zeqiang

AU - Yan, Chenggang

AU - Fu, Ying

PY - 2023

Y1 - 2023

AB - In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising. Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility. To address these issues, we introduce a hybrid approach that combines the advantages of both models with a Spatial-Spectral Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight alternative to 3D convolution, which extracts more spatial-spectral correlated features while keeping the flexibility to tackle HSIs with an arbitrary number of bands. These features are then adaptively processed by GSSA which performs 3D self-attention across the spectral bands, guided by a set of learnable queries that encode the spectral signatures. This not only enriches our model with powerful capabilities for identifying global spectral correlations but also maintains linear complexity. Moreover, our SM-FFN proposes the self-modulation that intensifies the activations of more informative regions, which further strengthens the aggregated features. Extensive experiments are conducted on various datasets under both simulated and real-world noise, and it shows that our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead. Code is at https://github.com/Zeqiang-Lai/HSDT.

UR - http://www.scopus.com/inward/record.url?scp=85188243443&partnerID=8YFLogxK

U2 - 10.1109/ICCV51070.2023.01201

DO - 10.1109/ICCV51070.2023.01201

M3 - Conference contribution

AN - SCOPUS:85188243443

T3 - Proceedings of the IEEE International Conference on Computer Vision

SP - 13019

EP - 13029

BT - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023

Y2 - 2 October 2023 through 6 October 2023

ER -
