A Certified Radius-Guided Attack Framework to Image Segmentation Models

Wenjie Qu; Youqi Li; Binghui Wang

doi:10.1109/EuroSP57164.2023.00021

Wenjie Qu^*, Youqi Li, Binghui Wang

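^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

1 Citation (Scopus)

Abstract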

Image segmentation is an important problem in many safety-critical applications such as medical imaging and autonomous driving. Recent studies show that modern image segmentation models are vulnerable to adversarial perturbations, yet existing attack methods mainly follow the idea of attacking image classification models. We argue that image segmentation and classification have inherent differences, and design an attack framework specifically for image segmentation models. Our goal is to thoroughly explore the vulnerabilities of modern segmentation models, i.e., to misclassify as many pixels as possible under a perturbation budget in both white-box and black-box settings.

Our attack framework is inspired by the certified radius, which was originally used by defenders to defend against adversarial perturbations to classification models. We are the first, from the attacker's perspective, to leverage the properties of the certified radius and propose a certified radius-guided attack framework against image segmentation models. Specifically, we first adapt randomized smoothing, the state-of-the-art certification method for classification models, to derive each pixel's certified radius. A larger certified radius means the pixel is theoretically more robust to adversarial perturbations. This observation inspires us to focus more on disrupting pixels with relatively smaller certified radii. Accordingly, we design a pixel-wise certified radius-guided loss that, when plugged into any existing white-box attack, yields our certified radius-guided white-box attack.

Next, we propose the first black-box attack on image segmentation models via bandit. A key challenge is that no gradient information is available. To address it, we design a novel gradient estimator, based on bandit feedback, that is query-efficient and provably unbiased and stable. We use this gradient estimator to design a projected bandit gradient descent (PBGD) attack. We further use pixels' certified radii to design a certified radius-guided PBGD (CR-PBGD) attack. We prove that our PBGD and CR-PBGD attacks can achieve asymptotically optimal attack performance with an optimal rate. We evaluate our certified radius-guided white-box and black-box attacks on multiple modern image segmentation models and datasets. Our results validate the effectiveness of our certified radius-guided attack framework.
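
The idea of deriving a per-pixel certified radius and then weighting the attack loss toward pixels with small radii can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard randomized-smoothing bound for Gaussian noise, R = sigma * Phi^{-1}(p_top), applied independently to each pixel, and it uses the empirical vote frequency where a real certification would use a lower confidence bound. The function names, the sigmoid weighting, and its parameters `a` and `b` are hypothetical choices made only for this illustration.

```python
import numpy as np
from statistics import NormalDist


def pixel_certified_radius(votes, sigma, p_floor=1e-6):
    """Estimate a per-pixel certified L2 radius from noisy predictions.

    votes: int array of shape (n_samples, H, W); votes[i, h, w] is the label
           predicted for pixel (h, w) on the i-th Gaussian-noised copy.
    sigma: standard deviation of the Gaussian noise.
    Assumption: the empirical top-class frequency stands in for the lower
    confidence bound a real certification procedure would compute.
    """
    n = votes.shape[0]
    num_classes = int(votes.max()) + 1
    # counts[c, h, w] = number of noisy samples that labeled pixel (h, w) as c
    counts = np.stack([(votes == c).sum(axis=0) for c in range(num_classes)])
    top_label = counts.argmax(axis=0)
    # Clip away from 0 and 1 so the inverse normal CDF stays finite.
    p_top = np.clip(counts.max(axis=0) / n, p_floor, 1.0 - p_floor)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    radius = sigma * inv_cdf(p_top)
    # A top-class frequency below 0.5 certifies nothing; report radius 0.
    return top_label, np.maximum(radius, 0.0)


def radius_guided_weights(radius, a=2.0, b=-2.0):
    """Hypothetical weighting: a sigmoid that decreases as the radius grows,
    so pixels that look less robust receive more weight (requires a > 0)."""
    return 1.0 / (1.0 + np.exp(a * radius + b))


def weighted_pixel_loss(probs, labels, weights):
    """Weighted per-pixel cross-entropy.

    probs: (C, H, W) class probabilities; labels: (H, W) int; weights: (H, W).
    A white-box attack would maximize this quantity instead of the
    unweighted cross-entropy.
    """
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    p_true = probs[labels, rows, cols]
    ce = -np.log(np.clip(p_true, 1e-12, 1.0))
    return float((weights * ce).sum() / weights.sum())
```

In this sketch the weights only rescale each pixel's contribution to the loss, so the same weighting could in principle be applied to any gradient-based attack that already uses a per-pixel loss.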

Original language	English
Title of host publication	Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	200-220
Number of pages	21
ISBN (Electronic)	9781665465120
DOI	https://doi.org/10.1109/EuroSP57164.2023.00021
Publication status	Published - 2023
Event	8th IEEE European Symposium on Security and Privacy, Euro S and P 2023 - Delft, Netherlands; Duration: 3 Jul 2023 → 7 Jul 2023

Publication series

Name	Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023

Conference

Conference	8th IEEE European Symposium on Security and Privacy, Euro S and P 2023
Country/Territory	Netherlands
City	Delft
Period	3/07/23 → 7/07/23

Cite this

Qu, W., Li, Y., & Wang, B. (2023). A Certified Radius-Guided Attack Framework to Image Segmentation Models. In Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023 (pp. 200-220). (Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/EuroSP57164.2023.00021

@inproceedings{9632a27da713413c86a4debdc511b46c,

title = "A Certified Radius-Guided Attack Framework to Image Segmentation Models",

author = "Wenjie Qu and Youqi Li and Binghui Wang",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023 ; Conference date: 03-07-2023 Through 07-07-2023",

year = "2023",

doi = "10.1109/EuroSP57164.2023.00021",

language = "English",

series = "Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "200--220",

booktitle = "Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023",

address = "United States",

}

Qu, W, Li, Y & Wang, B 2023, A Certified Radius-Guided Attack Framework to Image Segmentation Models. In Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023. Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023, Institute of Electrical and Electronics Engineers Inc., pp. 200-220, 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023, Delft, Netherlands, 3/07/23. https://doi.org/10.1109/EuroSP57164.2023.00021

A Certified Radius-Guided Attack Framework to Image Segmentation Models. / Qu, Wenjie; Li, Youqi; Wang, Binghui.
Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023. Institute of Electrical and Electronics Engineers Inc., 2023. pp. 200-220 (Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023).

TY - GEN

T1 - A Certified Radius-Guided Attack Framework to Image Segmentation Models

AU - Qu, Wenjie

AU - Li, Youqi

AU - Wang, Binghui

PY - 2023

Y1 - 2023

UR - http://www.scopus.com/inward/record.url?scp=85168122293&partnerID=8YFLogxK

U2 - 10.1109/EuroSP57164.2023.00021

DO - 10.1109/EuroSP57164.2023.00021

M3 - Conference contribution

AN - SCOPUS:85168122293

T3 - Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023

SP - 200

EP - 220

BT - Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023

Y2 - 3 July 2023 through 7 July 2023

ER -

Qu W, Li Y, Wang B. A Certified Radius-Guided Attack Framework to Image Segmentation Models. In Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023. Institute of Electrical and Electronics Engineers Inc. 2023. p. 200-220. (Proceedings - 8th IEEE European Symposium on Security and Privacy, Euro S and P 2023). doi: 10.1109/EuroSP57164.2023.00021
