TY - JOUR
T1 - Secure and Robust Joint Source-Channel Coding With Semantic Clustering and Adversarial Purification
AU - Huang, Xin
AU - Zeng, Liang
AU - Lu, Yaojun
AU - An, Jianping
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2025
Y1 - 2025
AB - Vision Transformer (ViT) and Swin Transformer have attracted significant attention in deep learning-based joint source-channel coding (JSCC) and semantic communication because of their strong visual modeling capabilities. However, we identify three challenges in these methods: (1) the quadratic complexity of global attention in ViT imposes a substantial computational burden; (2) window-based token grouping in Swin Transformer reduces computational complexity but ignores the semantic information within the tokens, so it may split semantically similar tokens across different windows or place semantically dissimilar tokens in the same window, which reduces effectiveness; (3) the vulnerability of deep learning-based JSCC to adversarial attacks poses a significant threat to semantic communication, compromising the security and robustness of the system. To address these challenges, we propose a semantic clustering and adversarial purification-based JSCC (SCAPJSCC) scheme. It not only reduces the computational complexity of the self-attention mechanism but also preserves and leverages the semantic information inherent in images. Furthermore, we introduce a plug-and-play adversarial purification module at the receiver, which enhances robustness and security against adversarial attacks at both the transmitter and the communication channel. Experimental results demonstrate that SCAPJSCC outperforms the state-of-the-art method SwinJSCC, achieving more effective semantic modeling of image information and stronger resilience to various adversarial attacks.
KW - adversarial purification
KW - Joint source-channel coding
KW - semantic clustering
KW - semantic communication
UR - http://www.scopus.com/inward/record.url?scp=105002022362&partnerID=8YFLogxK
U2 - 10.1109/TCCN.2025.3556777
DO - 10.1109/TCCN.2025.3556777
M3 - Article
AN - SCOPUS:105002022362
SN - 2332-7731
JO - IEEE Transactions on Cognitive Communications and Networking
JF - IEEE Transactions on Cognitive Communications and Networking
ER -