TY - JOUR
T1 - MAS-PD
T2 - Transferable Adversarial Attack against Vision-Transformers-Based SAR Image Classification Task
AU - Zheng, Boshi
AU - Liu, Jiabin
AU - Li, Yunjie
AU - Li, Yan
AU - Qin, Zhen
N1 - Publisher Copyright: © 2008-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Synthetic aperture radar (SAR) is widely used in civil and military fields. With advancements in vision transformer (ViT) research, these models have become increasingly important in SAR image classification due to their remarkable performance. Therefore, effectively interfering with the classification results of enemy radar systems has become a crucial factor in ensuring battlefield security. Adversarial attacks offer a potential solution, as they can significantly mislead models and cause incorrect predictions. However, recent research on adversarial examples focuses on the vulnerability of Convolutional Neural Network (CNN) models, while attacks on transformer models have not been extensively studied. Considering that ViTs differ from CNNs due to their unique multi-head self-attention (MSA) mechanism and their approach of segmenting images into patches for input, this paper proposes a "MAS-PD" black-box adversarial attack method targeting these two mechanisms in ViTs. First, to target the MSA mechanism, we propose the Momentum Attention Skipping (MAS) attack. By skipping the attention gradient during backpropagation and using momentum to avoid local maxima during gradient ascent, our method enhances the transferability of adversarial attacks across different models. Second, we apply dropout on input patches in each iteration, achieving higher attack success rates than when using all patches. We compare our method with four traditional adversarial attack techniques across different model architectures, including CNNs and ViTs, using the publicly available MSTAR SAR dataset. The experimental results show that our method achieves an average Attack Success Rate (ASR) of 68.82% across ViTs, while other methods achieve no more than 50% ASR on average. When applied to CNNs, our method also achieves an average ASR of 67.14%, compared to less than 40% ASR for other methods. The experimental results demonstrate that our algorithm significantly enhances transferability between ViTs and from ViTs to CNNs in SAR image classification tasks.
AB - Synthetic aperture radar (SAR) is widely used in civil and military fields. With advancements in vision transformer (ViT) research, these models have become increasingly important in SAR image classification due to their remarkable performance. Therefore, effectively interfering with the classification results of enemy radar systems has become a crucial factor in ensuring battlefield security. Adversarial attacks offer a potential solution, as they can significantly mislead models and cause incorrect predictions. However, recent research on adversarial examples focuses on the vulnerability of Convolutional Neural Network (CNN) models, while attacks on transformer models have not been extensively studied. Considering that ViTs differ from CNNs due to their unique multi-head self-attention (MSA) mechanism and their approach of segmenting images into patches for input, this paper proposes a "MAS-PD" black-box adversarial attack method targeting these two mechanisms in ViTs. First, to target the MSA mechanism, we propose the Momentum Attention Skipping (MAS) attack. By skipping the attention gradient during backpropagation and using momentum to avoid local maxima during gradient ascent, our method enhances the transferability of adversarial attacks across different models. Second, we apply dropout on input patches in each iteration, achieving higher attack success rates than when using all patches. We compare our method with four traditional adversarial attack techniques across different model architectures, including CNNs and ViTs, using the publicly available MSTAR SAR dataset. The experimental results show that our method achieves an average Attack Success Rate (ASR) of 68.82% across ViTs, while other methods achieve no more than 50% ASR on average. When applied to CNNs, our method also achieves an average ASR of 67.14%, compared to less than 40% ASR for other methods. The experimental results demonstrate that our algorithm significantly enhances transferability between ViTs and from ViTs to CNNs in SAR image classification tasks.
KW - Adversarial attack
KW - black-box attack
KW - synthetic aperture radar (SAR)
KW - vision transformers (ViTs)
UR - http://www.scopus.com/inward/record.url?scp=85219274728&partnerID=8YFLogxK
U2 - 10.1109/JSTARS.2025.3546271
DO - 10.1109/JSTARS.2025.3546271
M3 - Article
AN - SCOPUS:85219274728
SN - 1939-1404
JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ER -