TY - GEN
T1 - Two-way feature-aligned and attention-rectified adversarial training
AU - Zhang, Haitao
AU - Jia, Fan
AU - Zhang, Quanxin
AU - Han, Yahong
AU - Kuang, Xiaohui
AU - Tan, Yu An
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Adversarial training increases robustness by augmenting training data with adversarial examples. However, vanilla adversarial training may overfit to certain adversarial attacks. Small perturbations in images introduce errors that are gradually amplified as they are forwarded through the model, ultimately leading to wrong classifications. Moreover, small perturbations also distract the classifier's attention away from significant features that are relevant to the true label. In this paper, we propose a novel two-way feature-aligned and attention-rectified adversarial training (FAAR) to improve adversarial training (AT). FAAR utilizes two-way feature alignment and attention rectification to mitigate the problems mentioned above. FAAR effectively suppresses perturbations in low-level, high-level, and global features by moving the features of perturbed images towards those of clean images with two-way feature alignment. It also leads the model to focus more on useful features that are correlated with the true label by rectifying gradient-weighted attention. In addition, feature alignment activates attention rectification by reducing perturbations in high-level features. Our proposed method FAAR surpasses other existing AT methods in three aspects. First, it pushes the model to remain invariant when dealing with different adversarial attacks and different magnitudes of perturbation. Second, it can be applied to any convolutional neural network. Third, the training process is end-to-end. In experiments, FAAR shows promising defense performance on CIFAR-10 and ImageNet.
AB - Adversarial training increases robustness by augmenting training data with adversarial examples. However, vanilla adversarial training may overfit to certain adversarial attacks. Small perturbations in images introduce errors that are gradually amplified as they are forwarded through the model, ultimately leading to wrong classifications. Moreover, small perturbations also distract the classifier's attention away from significant features that are relevant to the true label. In this paper, we propose a novel two-way feature-aligned and attention-rectified adversarial training (FAAR) to improve adversarial training (AT). FAAR utilizes two-way feature alignment and attention rectification to mitigate the problems mentioned above. FAAR effectively suppresses perturbations in low-level, high-level, and global features by moving the features of perturbed images towards those of clean images with two-way feature alignment. It also leads the model to focus more on useful features that are correlated with the true label by rectifying gradient-weighted attention. In addition, feature alignment activates attention rectification by reducing perturbations in high-level features. Our proposed method FAAR surpasses other existing AT methods in three aspects. First, it pushes the model to remain invariant when dealing with different adversarial attacks and different magnitudes of perturbation. Second, it can be applied to any convolutional neural network. Third, the training process is end-to-end. In experiments, FAAR shows promising defense performance on CIFAR-10 and ImageNet.
KW - Adversarial training
KW - Attention rectification
KW - Feature alignment
UR - http://www.scopus.com/inward/record.url?scp=85090381059&partnerID=8YFLogxK
U2 - 10.1109/ICME46284.2020.9102777
DO - 10.1109/ICME46284.2020.9102777
M3 - Conference contribution
AN - SCOPUS:85090381059
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
PB - IEEE Computer Society
T2 - 2020 IEEE International Conference on Multimedia and Expo, ICME 2020
Y2 - 6 July 2020 through 10 July 2020
ER -