Graph based encrypted malicious traffic detection with hybrid analysis of multi-view features

Yueping Hong; Qi Li; Yanqing Yang; Meng Shen

doi:10.1016/j.ins.2023.119229

Yueping Hong, Qi Li*, Yanqing Yang, Meng Shen

*Corresponding author for this work

School of Cyberspace Science and Technology

Beijing University of Posts and Telecommunications

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

At present, the TLS cryptographic protocol is widely deployed. While protecting the security and integrity of transmitted information, it also makes the detection of malicious behavior more difficult. In recent years, researchers have proposed many encrypted malicious traffic detection methods. However, the existing approaches have some shortcomings. Firstly, although researchers have extracted multi-view features from different aspects, which can be divided into vectorized features based on feature engineering and image features based on original data, existing methods cannot fully integrate the features of different forms of expression. Secondly, most of the existing methods do not fully analyze the correlation between different encrypted traffic. Thirdly, the existing methods based on correlation analysis have low processing efficiency and cannot be applied to real networks. In the paper, we present MalDiscovery, a novel technique to discover encrypted malicious traffic to address all the above issues. For encrypted malicious traffic, MalDiscovery constructs an attribute KNN graph, in which encrypted sessions are used as nodes to construct a KNN graph according to the similarity of image features, and vectorized features are used as attributes of corresponding nodes. After that, the GraphSAGE model is used to collect relevant node information through correlation analysis to enrich the embeddings of each node. Finally, we achieve the accurate binary classification of nodes in the graph based on richer embeddings. We use extensive experiments to evaluate the proposed method, and the experiment results show that MalDiscovery can achieve an accuracy of about 99.9%, significantly outperforming all compared methods.
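
The pipeline described in the abstract (a KNN graph built from image-feature similarity, vectorized features attached as node attributes, GraphSAGE-style neighbor aggregation, then binary classification) can be illustrated with a minimal NumPy sketch. This is not the authors' MalDiscovery implementation. The cosine similarity measure, the mean aggregator, the layer sizes, the random untrained weights, and all function names are assumptions chosen only to show how the data flows.

```python
import numpy as np


def knn_graph(image_feats, k):
    """Return an (n, k) index array: row i lists the k sessions most similar
    to session i. Cosine similarity is an assumption; the abstract only says
    'similarity of image features'."""
    normed = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a session is never its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def sage_mean_layer(node_attrs, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with a mean aggregator and ReLU.
    Separate weights for self and neighbor terms are equivalent to one
    weight matrix applied to their concatenation."""
    neigh_mean = node_attrs[neighbors].mean(axis=1)  # shape (n, d)
    hidden = node_attrs @ w_self + neigh_mean @ w_neigh
    return np.maximum(hidden, 0.0)


def classify(embeddings, w_out):
    """Logistic output: a per-session score between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-(embeddings @ w_out)))


# Synthetic stand-ins for real session features (hypothetical sizes).
rng = np.random.default_rng(0)
n_sessions, img_dim, vec_dim, hidden_dim, k = 8, 16, 5, 4, 3
image_feats = rng.normal(size=(n_sessions, img_dim))   # image view
vector_feats = rng.normal(size=(n_sessions, vec_dim))  # engineered view

neighbors = knn_graph(image_feats, k)

# Untrained random weights: the scores below are meaningless until trained.
w_self = rng.normal(size=(vec_dim, hidden_dim))
w_neigh = rng.normal(size=(vec_dim, hidden_dim))
w_out = rng.normal(size=hidden_dim)

embeddings = sage_mean_layer(vector_feats, neighbors, w_self, w_neigh)
probs = classify(embeddings, w_out)
print(neighbors.shape, embeddings.shape, probs.shape)
```

In this sketch the image view determines only the graph structure, while the vectorized view supplies the values that are aggregated, which is one way to read the abstract's claim that the two feature forms are integrated. A real system would train the weights on labeled sessions and would likely use an established graph learning library rather than hand-written layers.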

Original language	English
Article number	119229
Journal	Information Sciences
Volume	644
DOIs	https://doi.org/10.1016/j.ins.2023.119229
Publication status	Published - Oct 2023

Keywords

Encrypted traffic
Malicious traffic
Multi-view features
SSL/TLS

Access to Document

10.1016/j.ins.2023.119229

Cite this

@article{850b224ffae3493b934e178e31d8968d,

title = "Graph based encrypted malicious traffic detection with hybrid analysis of multi-view features",

abstract = "At present, the TLS cryptographic protocol is widely deployed. While protecting the security and integrity of transmitted information, it also makes the detection of malicious behavior more difficult. In recent years, researchers have proposed many encrypted malicious traffic detection methods. However, the existing approaches have some shortcomings. Firstly, although researchers have extracted multi-view features from different aspects, which can be divided into vectorized features based on feature engineering and image features based on original data, existing methods cannot fully integrate the features of different forms of expression. Secondly, most of the existing methods do not fully analyze the correlation between different encrypted traffic. Thirdly, the existing methods based on correlation analysis have low processing efficiency and cannot be applied to real networks. In the paper, we present MalDiscovery, a novel technique to discover encrypted malicious traffic to address all the above issues. For encrypted malicious traffic, MalDiscovery constructs an attribute KNN graph, in which encrypted sessions are used as nodes to construct a KNN graph according to the similarity of image features, and vectorized features are used as attributes of corresponding nodes. After that, the GraphSAGE model is used to collect relevant node information through correlation analysis to enrich the embeddings of each node. Finally, we achieve the accurate binary classification of nodes in the graph based on richer embeddings. We use extensive experiments to evaluate the proposed method, and the experiment results show that MalDiscovery can achieve an accuracy of about 99.9%, significantly outperforming all compared methods.",

keywords = "Encrypted traffic, Malicious traffic, Multi-view features, SSL/TLS",

author = "Yueping Hong and Qi Li and Yanqing Yang and Meng Shen",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Inc.",

year = "2023",

month = oct,

doi = "10.1016/j.ins.2023.119229",

language = "English",

volume = "644",

journal = "Information Sciences",

issn = "0020-0255",

publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Graph based encrypted malicious traffic detection with hybrid analysis of multi-view features

AU - Hong, Yueping

AU - Li, Qi

AU - Yang, Yanqing

AU - Shen, Meng

PY - 2023/10

Y1 - 2023/10

N2 - At present, the TLS cryptographic protocol is widely deployed. While protecting the security and integrity of transmitted information, it also makes the detection of malicious behavior more difficult. In recent years, researchers have proposed many encrypted malicious traffic detection methods. However, the existing approaches have some shortcomings. Firstly, although researchers have extracted multi-view features from different aspects, which can be divided into vectorized features based on feature engineering and image features based on original data, existing methods cannot fully integrate the features of different forms of expression. Secondly, most of the existing methods do not fully analyze the correlation between different encrypted traffic. Thirdly, the existing methods based on correlation analysis have low processing efficiency and cannot be applied to real networks. In the paper, we present MalDiscovery, a novel technique to discover encrypted malicious traffic to address all the above issues. For encrypted malicious traffic, MalDiscovery constructs an attribute KNN graph, in which encrypted sessions are used as nodes to construct a KNN graph according to the similarity of image features, and vectorized features are used as attributes of corresponding nodes. After that, the GraphSAGE model is used to collect relevant node information through correlation analysis to enrich the embeddings of each node. Finally, we achieve the accurate binary classification of nodes in the graph based on richer embeddings. We use extensive experiments to evaluate the proposed method, and the experiment results show that MalDiscovery can achieve an accuracy of about 99.9%, significantly outperforming all compared methods.

KW - Encrypted traffic

KW - Malicious traffic

KW - Multi-view features

KW - SSL/TLS

UR - http://www.scopus.com/inward/record.url?scp=85161589849&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2023.119229

DO - 10.1016/j.ins.2023.119229

M3 - Article

AN - SCOPUS:85161589849

SN - 0020-0255

VL - 644

JO - Information Sciences

JF - Information Sciences

M1 - 119229

ER -
