TY - JOUR
T1 - Black-Box Targeted Adversarial Attack on Segment Anything (SAM)
AU - Zheng, Sheng
AU - Zhang, Chaoning
AU - Hao, Xinhong
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Deep recognition models are widely vulnerable to adversarial examples, which change the model output by adding a quasi-imperceptible perturbation to the input image. Recently, the Segment Anything Model (SAM) has emerged as a popular foundation model in computer vision due to its impressive generalization to unseen data and tasks. Realizing flexible attacks on SAM is beneficial for understanding its robustness in the adversarial context. To this end, this work aims to achieve a targeted adversarial attack (TAA) on SAM. Specifically, for a given prompt, the goal is to make the predicted mask of an adversarial example resemble that of a given target image. TAA on SAM has previously been realized only in the white-box setup, which assumes access to both the prompt and the model and is thus less practical. To remove this prompt dependence, we propose a simple yet effective approach that attacks only the image encoder. Moreover, we propose a novel regularization loss that enhances cross-model transferability by increasing the feature dominance of adversarial images over random natural images. Extensive experiments verify the effectiveness of the proposed method in conducting a successful black-box TAA on SAM.
AB - Deep recognition models are widely vulnerable to adversarial examples, which change the model output by adding a quasi-imperceptible perturbation to the input image. Recently, the Segment Anything Model (SAM) has emerged as a popular foundation model in computer vision due to its impressive generalization to unseen data and tasks. Realizing flexible attacks on SAM is beneficial for understanding its robustness in the adversarial context. To this end, this work aims to achieve a targeted adversarial attack (TAA) on SAM. Specifically, for a given prompt, the goal is to make the predicted mask of an adversarial example resemble that of a given target image. TAA on SAM has previously been realized only in the white-box setup, which assumes access to both the prompt and the model and is thus less practical. To remove this prompt dependence, we propose a simple yet effective approach that attacks only the image encoder. Moreover, we propose a novel regularization loss that enhances cross-model transferability by increasing the feature dominance of adversarial images over random natural images. Extensive experiments verify the effectiveness of the proposed method in conducting a successful black-box TAA on SAM.
KW - black-box
KW - practical
KW - robustness
KW - segment anything model (SAM)
KW - targeted adversarial attack
UR - http://www.scopus.com/inward/record.url?scp=105002264877&partnerID=8YFLogxK
U2 - 10.1109/TMM.2024.3521769
DO - 10.1109/TMM.2024.3521769
M3 - Article
AN - SCOPUS:105002264877
SN - 1520-9210
VL - 27
SP - 1901
EP - 1913
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -