Toward feature space adversarial attack in the frequency domain

Yajie Wang; Yu an Tan; Haoran Lyu; Shangbo Wu; Yuhang Zhao; Yuanzhang Li

doi:10.1002/int.23031

Corresponding author: Yuanzhang Li

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Recent research has shown that deep neural networks (DNNs) are vulnerable to adversarial examples, making them unsuitable for security-critical applications. The transferability of adversarial examples is crucial for attacking black-box models, since it enables adversarial attacks in more practical scenarios. We propose a novel adversarial attack with high transferability. Unlike existing attacks that directly modify the input pixels, our attack is executed in the feature space. More specifically, we corrupt the abstract features by maximizing the feature distance between the adversarial example and the clean image with a perceptual similarity network, inducing model misclassification. In addition, we apply a spectral transformation to the input, thus narrowing the search space in the frequency domain to enhance the transferability of adversarial examples. Disrupting crucial features in a specific frequency component achieves greater transferability. Extensive evaluations illustrate that our approach is easily compatible with many existing frameworks for transfer attacks and can significantly improve the baseline performance of black-box attacks. Moreover, we obtain a higher fooling rate even when the model has a defense technique, achieving a maximum black-box fooling rate of 61.70% on the defense model. Our work indicates that existing pixel-space defense techniques struggle to guarantee the robustness of the feature space, and that the feature space viewed from a frequency perspective is promising for developing more robust models.
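The abstract names two ingredients: a feature distance that the attack maximizes, and a spectral transformation that restricts the search to part of the frequency domain. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes a discrete cosine transform (DCT) as the spectral transformation and a squared L2 distance between feature vectors; the abstract specifies neither. A 1-D list stands in for an image, and all function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence (assumed spectral transform)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * total)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def low_pass(signal, keep):
    """Keep only the `keep` lowest-frequency DCT coefficients of `signal`.

    Projecting a perturbation this way narrows the search space to one
    frequency band; which band matters most is an empirical question.
    """
    coeffs = dct(signal)
    masked = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    return idct(masked)


def feature_distance(f_adv, f_clean):
    """Squared L2 distance between two feature vectors.

    In a real attack the vectors would come from a feature extractor
    (for example a perceptual similarity network); here they are plain lists.
    """
    return sum((a - b) ** 2 for a, b in zip(f_adv, f_clean))


# Hypothetical usage: restrict a perturbation to its two lowest frequencies.
perturbation = [0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.3]
band_limited = low_pass(perturbation, keep=2)
```

An attack built on this sketch would repeatedly update the perturbation to increase `feature_distance`, applying `low_pass` after each step so the perturbation stays within the chosen band.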

Original language	English
Pages (from-to)	11019-11036
Number of pages	18
Journal	International Journal of Intelligent Systems
Volume	37
Issue number	12
DOIs	https://doi.org/10.1002/int.23031
Publication status	Published - Dec 2022

Keywords

adversarial examples
black-box attack
computer
deep neural networks
transfer attack

Cite this

Wang, Y., Tan, Y. A., Lyu, H., Wu, S., Zhao, Y., & Li, Y. (2022). Toward feature space adversarial attack in the frequency domain. International Journal of Intelligent Systems, 37(12), 11019-11036. https://doi.org/10.1002/int.23031

@article{c489df2189204c80a0df4df58a8aa1fc,
title = "Toward feature space adversarial attack in the frequency domain",
abstract = "Recent research has shown that deep neural networks (DNNs) are vulnerable to adversarial examples, making them unsuitable for security-critical applications. The transferability of adversarial examples is crucial for attacking black-box models, since it enables adversarial attacks in more practical scenarios. We propose a novel adversarial attack with high transferability. Unlike existing attacks that directly modify the input pixels, our attack is executed in the feature space. More specifically, we corrupt the abstract features by maximizing the feature distance between the adversarial example and the clean image with a perceptual similarity network, inducing model misclassification. In addition, we apply a spectral transformation to the input, thus narrowing the search space in the frequency domain to enhance the transferability of adversarial examples. Disrupting crucial features in a specific frequency component achieves greater transferability. Extensive evaluations illustrate that our approach is easily compatible with many existing frameworks for transfer attacks and can significantly improve the baseline performance of black-box attacks. Moreover, we obtain a higher fooling rate even when the model has a defense technique, achieving a maximum black-box fooling rate of 61.70\% on the defense model. Our work indicates that existing pixel-space defense techniques struggle to guarantee the robustness of the feature space, and that the feature space viewed from a frequency perspective is promising for developing more robust models.",
keywords = "adversarial examples, black-box attack, computer, deep neural networks, transfer attack",
author = "Yajie Wang and Tan, {Yu an} and Haoran Lyu and Shangbo Wu and Yuhang Zhao and Yuanzhang Li",
note = "Publisher Copyright: {\textcopyright} 2022 Wiley Periodicals LLC.",
year = "2022",
month = dec,
doi = "10.1002/int.23031",
language = "English",
volume = "37",
pages = "11019--11036",
journal = "International Journal of Intelligent Systems",
issn = "0884-8173",
publisher = "John Wiley and Sons Inc.",
number = "12",
}

TY  - JOUR
T1  - Toward feature space adversarial attack in the frequency domain
AU  - Wang, Yajie
AU  - Tan, Yu an
AU  - Lyu, Haoran
AU  - Wu, Shangbo
AU  - Zhao, Yuhang
AU  - Li, Yuanzhang
PY  - 2022/12
Y1  - 2022/12
N2  - Recent research has shown that deep neural networks (DNNs) are vulnerable to adversarial examples, making them unsuitable for security-critical applications. The transferability of adversarial examples is crucial for attacking black-box models, since it enables adversarial attacks in more practical scenarios. We propose a novel adversarial attack with high transferability. Unlike existing attacks that directly modify the input pixels, our attack is executed in the feature space. More specifically, we corrupt the abstract features by maximizing the feature distance between the adversarial example and the clean image with a perceptual similarity network, inducing model misclassification. In addition, we apply a spectral transformation to the input, thus narrowing the search space in the frequency domain to enhance the transferability of adversarial examples. Disrupting crucial features in a specific frequency component achieves greater transferability. Extensive evaluations illustrate that our approach is easily compatible with many existing frameworks for transfer attacks and can significantly improve the baseline performance of black-box attacks. Moreover, we obtain a higher fooling rate even when the model has a defense technique, achieving a maximum black-box fooling rate of 61.70% on the defense model. Our work indicates that existing pixel-space defense techniques struggle to guarantee the robustness of the feature space, and that the feature space viewed from a frequency perspective is promising for developing more robust models.
KW  - adversarial examples
KW  - black-box attack
KW  - computer
KW  - deep neural networks
KW  - transfer attack
UR  - http://www.scopus.com/inward/record.url?scp=85136619206&partnerID=8YFLogxK
U2  - 10.1002/int.23031
DO  - 10.1002/int.23031
M3  - Article
AN  - SCOPUS:85136619206
SN  - 0884-8173
VL  - 37
SP  - 11019
EP  - 11036
JO  - International Journal of Intelligent Systems
JF  - International Journal of Intelligent Systems
IS  - 12
ER  - 
