TY - JOUR
T1 - Alternating attention Transformer for single image deraining
AU - Yang, Dawei
AU - He, Xin
AU - Zhang, Ruiheng
N1 - Publisher Copyright:
© 2023 Elsevier Inc.
PY - 2023/9
Y1 - 2023/9
N2 - Recently, Transformer-based network architectures have achieved significant improvements over convolutional neural networks (CNNs) in single image deraining, owing to their powerful ability to model non-local information. However, these approaches aggregate global features from all token similarities between queries and keys through a dense self-attention mechanism, which may fail to focus on the most relevant information and introduce blurring from irrelevant representations. To alleviate these issues, we propose an effective alternating attention Transformer (AAT) to boost image deraining performance. Specifically, we select only the most useful similarity values via a top-k approximation to achieve sparse self-attention. In our framework, the representational capability of the Transformer is significantly improved by alternately applying dense and sparse self-attention blocks. In addition, we replace the native MLP in our proposed AAT with a multi-dilconv feed-forward network to better characterize the multi-scale distribution of rain streaks. To compensate for the Transformer backbone's limited modeling of local features, we introduce a local feature refinement block to achieve high-quality derained results. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method. The source code will be released.
AB - Recently, Transformer-based network architectures have achieved significant improvements over convolutional neural networks (CNNs) in single image deraining, owing to their powerful ability to model non-local information. However, these approaches aggregate global features from all token similarities between queries and keys through a dense self-attention mechanism, which may fail to focus on the most relevant information and introduce blurring from irrelevant representations. To alleviate these issues, we propose an effective alternating attention Transformer (AAT) to boost image deraining performance. Specifically, we select only the most useful similarity values via a top-k approximation to achieve sparse self-attention. In our framework, the representational capability of the Transformer is significantly improved by alternately applying dense and sparse self-attention blocks. In addition, we replace the native MLP in our proposed AAT with a multi-dilconv feed-forward network to better characterize the multi-scale distribution of rain streaks. To compensate for the Transformer backbone's limited modeling of local features, we introduce a local feature refinement block to achieve high-quality derained results. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method. The source code will be released.
KW - Dense self-attention
KW - Image restoration
KW - Rain removal
KW - Single image deraining
KW - Sparse self-attention
KW - Vision Transformers
UR - http://www.scopus.com/inward/record.url?scp=85165938044&partnerID=8YFLogxK
U2 - 10.1016/j.dsp.2023.104144
DO - 10.1016/j.dsp.2023.104144
M3 - Article
AN - SCOPUS:85165938044
SN - 1051-2004
VL - 141
JO - Digital Signal Processing: A Review Journal
JF - Digital Signal Processing: A Review Journal
M1 - 104144
ER -