少查询文本分类模型的黑盒对抗样本攻击

Translated title of the contribution: Black-Box Adversarial Sample Attack for Query-Less Text Classification Models

Senlin Luo, Yao Cheng, Yunwei Wan, Limin Pan*, Xinshuai Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Existing black-box adversarial attacks on text classification models generally rely on frequent queries, which weakens the concealment of the attack. These traditional methods suffer from three problems. First, with only a few queries it is difficult to generate samples that cross the decision boundary of the target model, which significantly lowers the attack success rate (ASR). Second, when the target model is queried word by word, the number of queries grows linearly with the text length, so a limited query budget again leads to a low ASR. Third, synonym-dictionary substitution methods lack the positional features of the perturbed words and therefore struggle to capture their contextual relevance; the resulting drastic semantic changes yield low text similarity, make it harder to deceive the target model, and further reduce the ASR. To achieve both high text similarity and a high ASR, this paper proposes a new attack method that combines dynamic masking with a diffusion language model, generating adversarial samples that need fewer queries while keeping a high ASR. Experiments on multiple datasets show that the method reduces the number of queries by 50% on average while maintaining a high ASR, making it valuable for adversarial training.
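
To make the query-budget setting concrete, the following is a minimal Python sketch of a generic masking-based black-box attack loop under a fixed query budget. It illustrates the general setting only, not the paper's actual algorithm: victim stands in for the black-box classifier, and propose_fillers abstracts the language model (a diffusion language model in the paper) that proposes context-aware replacements for a masked position; both names, the greedy position-by-position search, and the toy stand-ins below are assumptions made for illustration.

    # Hypothetical sketch of a query-limited masking-based black-box text attack.
    # `victim` and `propose_fillers` are placeholders, not APIs from the paper.
    from typing import Callable, List, Tuple

    def attack(
        text: str,
        victim: Callable[[str], int],                            # black-box classifier: text -> label
        propose_fillers: Callable[[List[str], int], List[str]],  # LM: (tokens, masked position) -> candidates
        max_queries: int = 20,                                   # fixed query budget
    ) -> Tuple[str, int]:
        """Mask one position at a time, let a language model propose
        context-aware replacements, and keep the first candidate that flips
        the victim's prediction. Returns (final text, queries used)."""
        tokens = text.split()
        orig_label = victim(text)   # one query for the original prediction
        queries = 1
        for pos in range(len(tokens)):
            for cand in propose_fillers(tokens, pos):
                if queries >= max_queries:
                    return " ".join(tokens), queries      # budget exhausted
                trial = tokens[:pos] + [cand] + tokens[pos + 1:]
                queries += 1
                if victim(" ".join(trial)) != orig_label:
                    return " ".join(trial), queries       # decision boundary crossed
        return " ".join(tokens), queries                  # attack failed within budget

    if __name__ == "__main__":
        # Toy stand-ins so the sketch runs end to end: a keyword classifier
        # and a fixed synonym table in place of a real diffusion language model.
        def victim(t: str) -> int:
            return 1 if "great" in t else 0

        def propose_fillers(tokens: List[str], pos: int) -> List[str]:
            synonyms = {"great": ["fine", "decent"], "movie": ["film"]}
            return synonyms.get(tokens[pos], [])

        adv, n = attack("a great movie overall", victim, propose_fillers)
        print(adv, n)   # "a fine movie overall" after 2 queries

In this toy run, flipping the label takes two queries: one for the original prediction and one for the first successful trial. A real attack would spend its budget far more carefully, which is exactly the point of query-efficient methods such as the one this paper proposes.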

Original language: Chinese (Traditional)
Pages (from-to): 1277-1286
Number of pages: 10
Journal: Beijing Ligong Daxue Xuebao / Transactions of Beijing Institute of Technology
Volume: 44
Issue number: 12
Publication status: Published - Dec 2024
