TY - JOUR
T1 - A Systematic Survey on Black-box Attacks in Large Language Models Within Communication Networks
AU - Du, Wenbiao
AU - Xue, Jingfeng
AU - Guo, Wenjie
AU - Yang, Xiuqi
AU - Fu, Yifeng
AU - Wang, Yong
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Large Language Models (LLMs) exhibit extraordinary competence in language comprehension and generation. However, as these models find increasing adoption in communication systems, they also become vulnerable to adversarial threats. Such attacks exploit model response mechanisms to generate malicious content or execute jailbreak attempts. Given the paramount importance of reliability and security in communication networks, this issue has garnered considerable attention. Of particular concern are black-box attacks, which circumvent conventional defense strategies by exploiting input-output interactions without requiring internal model knowledge or parameter access. These attacks are highly clandestine and demonstrate substantial practical feasibility, with possible repercussions such as data compromise, the production of harmful content, and the interruption of standard operations. Although relevant research efforts have achieved notable breakthroughs, a comprehensive examination of the topic, especially a systematic review within the realm of communication networks, remains insufficient. This article seeks to offer a comprehensive survey of contemporary black-box attack strategies aimed at LLMs. We begin by retracing the development and applications of such attacks across diverse fields, then propose a taxonomy pertinent to black-box attacks within communication networks, classifying them into three principal categories: scenario and context manipulation attacks, transformation and evasion attacks, and automated and optimized generation attacks. In addition, we examine the associated impacts and potential risks, synthesize limitations in existing research, and present prospective directions and challenges for future research. Our overarching goal is to provide substantial insights that advance the security and reliability of LLMs while promoting the stable evolution of these models within communication network environments.
KW - Black-box Attack
KW - Generative AI
KW - Large Language Model
KW - Network Security
UR - http://www.scopus.com/inward/record.url?scp=105008038803&partnerID=8YFLogxK
U2 - 10.1109/MNET.2025.3578610
DO - 10.1109/MNET.2025.3578610
M3 - Article
AN - SCOPUS:105008038803
SN - 0890-8044
JO - IEEE Network
JF - IEEE Network
ER -