Two-way feature-aligned and attention-rectified adversarial training

Haitao Zhang; Fan Jia; Quanxin Zhang; Yahong Han; Xiaohui Kuang; Yu An Tan

doi:10.1109/ICME46284.2020.9102777

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

Adversarial training increases robustness by augmenting training data with adversarial examples. However, vanilla adversarial training may overfit to certain adversarial attacks. Small perturbations in images introduce errors that are gradually amplified as they are forwarded through the model, eventually leading to wrong classification. Small perturbations also distract the classifier's attention from significant features that are relevant to the true label. In this paper, we propose a novel two-way feature-aligned and attention-rectified adversarial training (FAAR) to improve adversarial training (AT). FAAR uses two-way feature alignment and attention rectification to mitigate the problems mentioned above. FAAR effectively suppresses perturbations in low-level, high-level and global features by moving features of perturbed images towards those of clean images with two-way feature alignment. It also leads the model to focus more on useful features that are correlated with the true label by rectifying gradient-weighted attention. In addition, feature alignment activates attention rectification by reducing perturbations in high-level features. Our proposed method FAAR surpasses other existing AT methods in three aspects. First, it pushes the model to stay invariant when dealing with different adversarial attacks and different magnitudes of perturbation. Second, it can be applied to any convolutional neural network. Third, the training process is end-to-end. In experiments, FAAR shows promising defense performance on CIFAR-10 and ImageNet.
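
The abstract does not give the loss functions, so the following is only a minimal sketch of the general idea of moving the features of a perturbed input towards those of the clean input. Everything in it is an assumption made for illustration: the tiny two-layer network, its fixed weights `W1` and `W2`, the helper names, the squared-distance penalty, and the fixed sign pattern that stands in for a real attack. It shows a single generic alignment term and does not attempt to reproduce the paper's two-way alignment or its attention rectification.

```python
import math

# Hypothetical fixed weights for a tiny two-layer network (illustration only).
W1 = [[0.5, -0.2], [0.1, 0.8]]
W2 = [[1.0, -1.0]]


def linear(weights, x):
    """Multiply a weight matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def features(x):
    """Return the activations of each layer for input x."""
    low = [math.tanh(v) for v in linear(W1, x)]  # stand-in for low-level features
    high = linear(W2, low)                       # stand-in for high-level features
    return [low, high]


def alignment_loss(clean_feats, adv_feats):
    """Sum over layers of the mean squared difference between feature vectors.

    The squared distance is an assumed choice, not taken from the paper.
    """
    total = 0.0
    for fc, fa in zip(clean_feats, adv_feats):
        total += sum((a - b) ** 2 for a, b in zip(fc, fa)) / len(fc)
    return total


clean = [1.0, -1.0]
eps = 0.1
# Fixed sign pattern standing in for an attack direction;
# a real attack would derive the direction from gradients.
perturbed = [c + eps * s for c, s in zip(clean, [1.0, -1.0])]

loss = alignment_loss(features(clean), features(perturbed))
```

In a training loop, a term of this kind would typically be added to the classification loss so that minimizing it pulls the perturbed features back towards the clean ones. How FAAR weights or combines its terms is not stated in the abstract.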

Original language	English
Title of host publication	2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Publisher	IEEE Computer Society
ISBN (Electronic)	9781728113319
DOIs	https://doi.org/10.1109/ICME46284.2020.9102777
Publication status	Published - Jul 2020
Event	2020 IEEE International Conference on Multimedia and Expo, ICME 2020 - London, United Kingdom
Duration	6 Jul 2020 → 10 Jul 2020

Publication series

Name	Proceedings - IEEE International Conference on Multimedia and Expo
Volume	2020-July
ISSN (Print)	1945-7871
ISSN (Electronic)	1945-788X

Conference

Conference	2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Country/Territory	United Kingdom
City	London
Period	6/07/20 → 10/07/20

Keywords

Adversarial training
Attention rectification
Feature alignment

Access to Document

10.1109/ICME46284.2020.9102777

Cite this

Zhang, H., Jia, F., Zhang, Q., Han, Y., Kuang, X., & Tan, Y. A. (2020). Two-way feature-aligned and attention-rectified adversarial training. In 2020 IEEE International Conference on Multimedia and Expo, ICME 2020 Article 9102777 (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2020-July). IEEE Computer Society. https://doi.org/10.1109/ICME46284.2020.9102777

@inproceedings{6eabbaa0750c426193f8703b83b9403c,
title = "Two-way feature-aligned and attention-rectified adversarial training",
abstract = "Adversarial training increases robustness by augmenting training data with adversarial examples. However, vanilla adversarial training may overfit to certain adversarial attacks. Small perturbations in images introduce errors that are gradually amplified as they are forwarded through the model, eventually leading to wrong classification. Small perturbations also distract the classifier's attention from significant features that are relevant to the true label. In this paper, we propose a novel two-way feature-aligned and attention-rectified adversarial training (FAAR) to improve adversarial training (AT). FAAR uses two-way feature alignment and attention rectification to mitigate the problems mentioned above. FAAR effectively suppresses perturbations in low-level, high-level and global features by moving features of perturbed images towards those of clean images with two-way feature alignment. It also leads the model to focus more on useful features that are correlated with the true label by rectifying gradient-weighted attention. In addition, feature alignment activates attention rectification by reducing perturbations in high-level features. Our proposed method FAAR surpasses other existing AT methods in three aspects. First, it pushes the model to stay invariant when dealing with different adversarial attacks and different magnitudes of perturbation. Second, it can be applied to any convolutional neural network. Third, the training process is end-to-end. In experiments, FAAR shows promising defense performance on CIFAR-10 and ImageNet.",
keywords = "Adversarial training, Attention rectification, Feature alignment",
author = "Haitao Zhang and Fan Jia and Quanxin Zhang and Yahong Han and Xiaohui Kuang and Tan, {Yu An}",
note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 2020 IEEE International Conference on Multimedia and Expo, ICME 2020 ; Conference date: 06-07-2020 Through 10-07-2020",
year = "2020",
month = jul,
doi = "10.1109/ICME46284.2020.9102777",
language = "English",
series = "Proceedings - IEEE International Conference on Multimedia and Expo",
publisher = "IEEE Computer Society",
booktitle = "2020 IEEE International Conference on Multimedia and Expo, ICME 2020",
address = "United States",
}

Zhang, H, Jia, F, Zhang, Q, Han, Y, Kuang, X & Tan, YA 2020, Two-way feature-aligned and attention-rectified adversarial training. in 2020 IEEE International Conference on Multimedia and Expo, ICME 2020., 9102777, Proceedings - IEEE International Conference on Multimedia and Expo, vol. 2020-July, IEEE Computer Society, 2020 IEEE International Conference on Multimedia and Expo, ICME 2020, London, United Kingdom, 6/07/20. https://doi.org/10.1109/ICME46284.2020.9102777

Two-way feature-aligned and attention-rectified adversarial training. / Zhang, Haitao; Jia, Fan; Zhang, Quanxin et al.
2020 IEEE International Conference on Multimedia and Expo, ICME 2020. IEEE Computer Society, 2020. 9102777 (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2020-July).

TY - GEN
T1 - Two-way feature-aligned and attention-rectified adversarial training
AU - Zhang, Haitao
AU - Jia, Fan
AU - Zhang, Quanxin
AU - Han, Yahong
AU - Kuang, Xiaohui
AU - Tan, Yu An
PY - 2020/7
Y1 - 2020/7
AB - Adversarial training increases robustness by augmenting training data with adversarial examples. However, vanilla adversarial training may overfit to certain adversarial attacks. Small perturbations in images introduce errors that are gradually amplified as they are forwarded through the model, eventually leading to wrong classification. Small perturbations also distract the classifier's attention from significant features that are relevant to the true label. In this paper, we propose a novel two-way feature-aligned and attention-rectified adversarial training (FAAR) to improve adversarial training (AT). FAAR uses two-way feature alignment and attention rectification to mitigate the problems mentioned above. FAAR effectively suppresses perturbations in low-level, high-level and global features by moving features of perturbed images towards those of clean images with two-way feature alignment. It also leads the model to focus more on useful features that are correlated with the true label by rectifying gradient-weighted attention. In addition, feature alignment activates attention rectification by reducing perturbations in high-level features. Our proposed method FAAR surpasses other existing AT methods in three aspects. First, it pushes the model to stay invariant when dealing with different adversarial attacks and different magnitudes of perturbation. Second, it can be applied to any convolutional neural network. Third, the training process is end-to-end. In experiments, FAAR shows promising defense performance on CIFAR-10 and ImageNet.
KW - Adversarial training
KW - Attention rectification
KW - Feature alignment
UR - http://www.scopus.com/inward/record.url?scp=85090381059&partnerID=8YFLogxK
U2 - 10.1109/ICME46284.2020.9102777
DO - 10.1109/ICME46284.2020.9102777
M3 - Conference contribution
AN - SCOPUS:85090381059
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
PB - IEEE Computer Society
T2 - 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Y2 - 6 July 2020 through 10 July 2020
ER -
