Abstract
Existing adversarial attacks on black-box text classification models generally rely on frequent queries, which reduces the concealment of the attack. Traditional methods have the following problems. First, with only a few queries, the attacker struggles to generate samples that cross the target model's decision boundary, which significantly lowers the attack success rate (ASR). Second, when the target model is queried word by word, the number of queries grows linearly with text length, so limiting the number of queries results in a low ASR. Third, thesaurus-based substitution methods lack the positional features of the perturbed words, which makes it difficult to capture their contextual relevance; the semantics then change drastically, text similarity drops, and the resulting samples struggle to deceive the target model, again lowering ASR. To achieve both high text similarity and high ASR, this paper proposes a new method that combines dynamic masking with a diffusion language model. The method generates adversarial samples with fewer queries while maintaining a high ASR. Experiments on multiple datasets show that the number of queries can be reduced by 50% on average while maintaining a high ASR, which makes the method valuable for adversarial training tasks.
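The abstract's point that word-by-word querying scales linearly with text length can be illustrated with a minimal sketch. It is not the paper's method: it uses a generic leave-one-out word-importance ranking, and `toy_score`, `QueryCounter`, `word_importance`, and `queries_needed` are hypothetical names introduced only for this illustration. The toy scorer stands in for a real black-box classifier.

```python
from typing import Callable, List, Tuple


class QueryCounter:
    """Wraps a black-box scoring function and counts how often it is called."""

    def __init__(self, score_fn: Callable[[List[str]], float]) -> None:
        self.score_fn = score_fn
        self.calls = 0

    def __call__(self, words: List[str]) -> float:
        self.calls += 1
        return self.score_fn(words)


def toy_score(words: List[str]) -> float:
    """Hypothetical stand-in for the target model: fraction of 'positive' words."""
    positive = {"good", "great", "excellent"}
    if not words:
        return 0.0
    return sum(w in positive for w in words) / len(words)


def word_importance(words: List[str], model: QueryCounter) -> List[Tuple[str, float]]:
    """Leave-one-out importance: one query for the full text, then one per word."""
    base = model(words)
    scores = []
    for i, word in enumerate(words):
        reduced = words[:i] + words[i + 1:]  # drop the i-th word
        scores.append((word, base - model(reduced)))
    return scores


def queries_needed(text: str) -> int:
    """Number of model queries used to rank every word of `text`."""
    model = QueryCounter(toy_score)
    word_importance(text.split(), model)
    return model.calls
```

Under these assumptions a text of n words costs n + 1 queries just to rank its words, before any substitution is attempted, which is why a tight query budget quickly limits what a word-by-word attack can do on long inputs.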
Translated title of the contribution | Black-Box Adversarial Sample Attack for Query-Less Text Classification Models |
---|---|
Original language | Chinese (Traditional) |
Pages (from-to) | 1277-1286 |
Number of pages | 10 |
Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
Volume | 44 |
Issue number | 12 |
DOIs | |
Publication status | Published - Dec 2024 |