A Systematic Survey on Black-box Attacks in Large Language Models Within Communication Networks

Wenbiao Du, Jingfeng Xue*, Wenjie Guo, Xiuqi Yang, Yifeng Fu, Yong Wang

*Corresponding author for this work

Research output: Contribution to journal (Article, peer-reviewed)

Abstract

Large Language Models (LLMs) exhibit extraordinary competence in language comprehension and generation. However, as these models find increasing adoption in communication systems, they also become vulnerable to adversarial threats. Such attacks exploit model response mechanisms to generate malicious content or execute jailbreak attempts. Given the paramount importance of reliability and security in communication networks, this issue has garnered considerable attention. Of particular concern are black-box attacks, which circumvent conventional defense strategies by exploiting input-output interactions without requiring internal model knowledge or parameter access. These attacks are highly clandestine and demonstrate substantial practical feasibility, with possible repercussions such as data compromise, the production of harmful content, and the interruption of standard operations. Although relevant research efforts have achieved notable breakthroughs, a comprehensive examination of the topic, especially a systematic review within the realm of communication networks, remains insufficient. This article seeks to offer a comprehensive survey of contemporary black-box attack strategies aimed at LLMs. We begin by retracing the development and applications of such attacks across diverse fields, then propose a taxonomy pertinent to black-box attacks within communication networks, classifying them into three principal categories: scenario and context manipulation attacks, transformation and evasion attacks, and automated and optimized generation attacks. In addition, we examine the associated impacts and potential risks, synthesize limitations in existing research, and present prospective directions and challenges for future research. Our overarching goal is to provide substantial insights that advance the security and reliability of LLMs while promoting the stable evolution of these models within communication network environments.

Original language: English
Journal: IEEE Network
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • Black-box Attack
  • Generative AI
  • Large Language Model
  • Network Security
