TY - JOUR
T1 - Online and automatic identification and mining of encryption network behavior in big data environment
AU - Zhu, Hejun
AU - Zhu, Liehuang
AU - Shen, Meng
AU - Khan, Salabat
N1 - Publisher Copyright: © 2018 - IOS Press and the authors. All rights reserved.
PY - 2018
Y1 - 2018
N2 - This paper studies the recognition and mining of encrypted network behavior in large volumes of network data, and proposes a fast online recognition method for encrypted network behavior that combines the correlation coefficient with the k-nearest neighbor (KNN) algorithm. Taking encrypted Twitter traffic as the research object, a large number of encrypted Twitter network behaviors, including message sending, picture sending and other behaviors, were analyzed; statistical features characterizing each encrypted network behavior were extracted; and a correlation-coefficient-based sample library of encrypted network behaviors was established. Interactive network data were then collected in real time, and the correlation coefficient between the collected data and the sample library was calculated in order to overcome noise interference from similar data traffic. The data packets that passed this similarity filtering were classified as true or false behavior using the KNN algorithm, and the encrypted network behavior was then identified automatically using the default threshold of the correlation coefficient in a big data environment. Compared with the traditional correlation coefficient method, the recognition efficiency of the proposed method was greatly improved, reaching about 94. On this basis, combined with network vulnerability analysis, web crawling and virtual identity mining, comprehensive mining of encrypted network behavior was realized in a big data environment.
AB - This paper studies the recognition and mining of encrypted network behavior in large volumes of network data, and proposes a fast online recognition method for encrypted network behavior that combines the correlation coefficient with the k-nearest neighbor (KNN) algorithm. Taking encrypted Twitter traffic as the research object, a large number of encrypted Twitter network behaviors, including message sending, picture sending and other behaviors, were analyzed; statistical features characterizing each encrypted network behavior were extracted; and a correlation-coefficient-based sample library of encrypted network behaviors was established. Interactive network data were then collected in real time, and the correlation coefficient between the collected data and the sample library was calculated in order to overcome noise interference from similar data traffic. The data packets that passed this similarity filtering were classified as true or false behavior using the KNN algorithm, and the encrypted network behavior was then identified automatically using the default threshold of the correlation coefficient in a big data environment. Compared with the traditional correlation coefficient method, the recognition efficiency of the proposed method was greatly improved, reaching about 94. On this basis, combined with network vulnerability analysis, web crawling and virtual identity mining, comprehensive mining of encrypted network behavior was realized in a big data environment.
KW - Encryption network behavior identification
KW - correlation coefficient
KW - encryption network behavior mining
KW - k-nearest neighbor
UR - http://www.scopus.com/inward/record.url?scp=85043598253&partnerID=8YFLogxK
U2 - 10.3233/JIFS-169404
DO - 10.3233/JIFS-169404
M3 - Article
AN - SCOPUS:85043598253
SN - 1064-1246
VL - 34
SP - 1111
EP - 1119
JO - Journal of Intelligent and Fuzzy Systems
JF - Journal of Intelligent and Fuzzy Systems
IS - 2
ER -