Unsupervised Adversarial Example Detection of Vision Transformers for Trustworthy Edge Computing

Jiaxing Li*, Tan Yu'An, Jie Yang, Zhengdao Li, Heng Ye, Chenxiao Xia, Yuanzhang Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Many edge computing applications based on computer vision have harnessed the power of deep learning. As an emerging deep learning model for vision, the Vision Transformer has recently achieved record-breaking performance in various vision tasks. However, many recent studies on the robustness of Vision Transformers have shown that they remain vulnerable to adversarial attacks, which cause the model to misclassify its input. In this work, we ask an intriguing question: "Can adversarial perturbations against Vision Transformers be detected with model explanations?" Driven by this question, we observe that benign samples and adversarial examples yield different attribution maps when the Grad-CAM interpretability method is applied to a Vision Transformer model. We demonstrate that an adversarial example is a Feature Shift of the input data, which leads to an Attention Deviation of the vision model. We propose a framework that captures this Attention Deviation of vision models to defend against adversarial attacks. Experiments show that our model achieves the expected results.
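To illustrate the idea described in the abstract, the following is a minimal sketch (not the authors' implementation) of computing Grad-CAM attribution maps on a Vision Transformer and flagging an input whose map deviates strongly from a reference map. The model name, the pytorch-grad-cam library, the choice of a denoised copy as reference, the cosine-distance metric, and the threshold value are all illustrative assumptions; the paper's actual detection statistic and pipeline may differ.

```python
# Illustrative sketch only: Grad-CAM attribution maps on a ViT and a simple
# "attention deviation" score. Library choices and the threshold are assumptions.
import numpy as np
import timm
import torch
from pytorch_grad_cam import GradCAM

model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()

def reshape_transform(tokens, height=14, width=14):
    # Drop the [CLS] token and reshape the patch tokens into a 2-D feature map
    # so Grad-CAM can treat them like CNN activations.
    result = tokens[:, 1:, :].reshape(tokens.size(0), height, width, tokens.size(2))
    return result.permute(0, 3, 1, 2)  # (B, C, H, W)

# A common target layer for ViT Grad-CAM: the first norm of the last block.
cam = GradCAM(model=model,
              target_layers=[model.blocks[-1].norm1],
              reshape_transform=reshape_transform)

def attribution_map(x: torch.Tensor) -> np.ndarray:
    """Grad-CAM map for one preprocessed image tensor of shape (1, 3, 224, 224)."""
    return cam(input_tensor=x)[0]  # (224, 224), values in [0, 1]

def attention_deviation(map_a: np.ndarray, map_b: np.ndarray) -> float:
    """Cosine distance between two flattened attribution maps (illustrative metric)."""
    a, b = map_a.ravel(), map_b.ravel()
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

THRESHOLD = 0.3  # hypothetical value; would need calibration on benign data

def is_adversarial(x: torch.Tensor, x_reference: torch.Tensor) -> bool:
    # Compare the input's attribution map against a reference map, e.g. from a
    # denoised or augmented copy of the same input (an assumption of this sketch).
    return attention_deviation(attribution_map(x), attribution_map(x_reference)) > THRESHOLD
```

In this sketch the detector is unsupervised in the sense that it only needs benign data to calibrate the deviation threshold, mirroring the unsupervised setting named in the title.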

Original language: English
Article number: 220
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
Volume: 21
Issue number: 8
DOIs
Publication status: Published - 13 Aug 2025

Keywords

  • adversarial defense
  • adversarial attack
  • model explanation
  • trustworthy edge computing
  • vision transformer
