TY - JOUR
T1 - Adaptive detection of encrypted malware traffic via fully convolutional masked autoencoders
AU - Jia, Jizhe
AU - Shen, Meng
AU - Yuan, Qingjun
AU - Liu, Yong
AU - Wang, Jing
AU - Kong, Jian
AU - Huang, Liang
AU - He, Haotian
AU - Zhu, Liehuang
N1 - Publisher Copyright: © Higher Education Press 2026.
PY - 2026/4
Y1 - 2026/4
AB - Network traffic encryption techniques are widely adopted to protect data confidentiality and prevent privacy leakage during data transmission. However, malware often leverages these encryption techniques to conceal malicious activities. Recent research has demonstrated the effectiveness of machine learning- and deep learning-based malware traffic detection methods. However, these methods rely on a sufficient amount of labeled data being readily available for model training, which limits their ability to transfer to the detection of new malware. In this paper, we propose Malcom, an adaptive encrypted malware traffic detection method based on fully convolutional masked autoencoders, to detect malware traffic hidden in encrypted traffic. We first propose a novel traffic representation named Header-Payload Matrix (HPM) to extract discriminative features that can differentiate malware traffic from benign traffic. Subsequently, we develop a hierarchical ConvNeXt traffic encoder and a lightweight ConvNeXt traffic decoder to learn high-level features from a large amount of unlabeled data. The masked autoencoder framework enables our model to adapt to new malware detection by fine-tuning with only a few labeled samples. We conduct extensive experiments with real-world datasets to evaluate Malcom. The results demonstrate that Malcom outperforms the state-of-the-art (SOTA) methods in two typical scenarios. In particular, in the few-shot learning scenario, Malcom achieves an average F1 score of 97.35%, an improvement of 8.24% over the SOTA method, by fine-tuning with only 10 samples per malware type.
KW - encrypted traffic analysis
KW - malware traffic detection
KW - masked autoencoder
KW - self-supervised learning
UR - https://www.scopus.com/pages/publications/105021479157
U2 - 10.1007/s11704-025-41273-9
DO - 10.1007/s11704-025-41273-9
M3 - Article
AN - SCOPUS:105021479157
SN - 2095-2228
VL - 20
JO - Frontiers of Computer Science
JF - Frontiers of Computer Science
IS - 4
M1 - 2004804
ER -