TY - JOUR
T1 - Machine Learning-Powered Encrypted Network Traffic Analysis
T2 - A Comprehensive Survey
AU - Shen, Meng
AU - Ye, Ke
AU - Liu, Xingtong
AU - Zhu, Liehuang
AU - Kang, Jiawen
AU - Yu, Shui
AU - Li, Qi
AU - Xu, Ke
N1 - Publisher Copyright:
© 1998-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Traffic analysis is the process of monitoring network activities, discovering specific patterns, and gleaning valuable information from network traffic. It can be applied in various fields such as network asset probing and anomaly detection. With the advent of network traffic encryption, however, traffic analysis becomes an arduous task. Due to the invisibility of packet payload, traditional traffic analysis methods relying on capturing valuable information from plaintext payload are likely to lose efficacy. Machine learning has been emerging as a powerful tool to extract informative features without getting access to payload, and thus is widely employed in encrypted traffic analysis. In this paper, we present a comprehensive survey on recent achievements in machine learning-powered encrypted traffic analysis. To begin with, we review the literature in this area and summarize the analysis goals that serve as the basis for literature classification. Then, we abstract the workflow of encrypted traffic analysis with machine learning tools, including traffic collection, traffic representation, traffic analysis method, and performance evaluation. For the surveyed studies, the requirements of classification granularity and information timeliness may vary a lot for different analysis goals. Hence, in terms of the goal of traffic analysis, we present a comprehensive review on existing studies according to four categories: network asset identification, network characterization, privacy leakage detection, and anomaly detection. Finally, we discuss the challenges and directions for future research on encrypted traffic analysis.
AB - Traffic analysis is the process of monitoring network activities, discovering specific patterns, and gleaning valuable information from network traffic. It can be applied in various fields such as network asset probing and anomaly detection. With the advent of network traffic encryption, however, traffic analysis becomes an arduous task. Due to the invisibility of packet payload, traditional traffic analysis methods relying on capturing valuable information from plaintext payload are likely to lose efficacy. Machine learning has been emerging as a powerful tool to extract informative features without getting access to payload, and thus is widely employed in encrypted traffic analysis. In this paper, we present a comprehensive survey on recent achievements in machine learning-powered encrypted traffic analysis. To begin with, we review the literature in this area and summarize the analysis goals that serve as the basis for literature classification. Then, we abstract the workflow of encrypted traffic analysis with machine learning tools, including traffic collection, traffic representation, traffic analysis method, and performance evaluation. For the surveyed studies, the requirements of classification granularity and information timeliness may vary a lot for different analysis goals. Hence, in terms of the goal of traffic analysis, we present a comprehensive review on existing studies according to four categories: network asset identification, network characterization, privacy leakage detection, and anomaly detection. Finally, we discuss the challenges and directions for future research on encrypted traffic analysis.
KW - Encrypted traffic analysis
KW - anomaly detection
KW - deep learning
KW - machine learning
KW - traffic classification
UR - http://www.scopus.com/inward/record.url?scp=85139434818&partnerID=8YFLogxK
U2 - 10.1109/COMST.2022.3208196
DO - 10.1109/COMST.2022.3208196
M3 - Article
AN - SCOPUS:85139434818
SN - 1553-877X
VL - 25
SP - 791
EP - 824
JO - IEEE Communications Surveys and Tutorials
JF - IEEE Communications Surveys and Tutorials
IS - 1
ER -