TY - GEN
T1 - Research on user identification algorithm based on massive multi-site VPN log
AU - Lu, Bingbing
AU - Zhang, Huaping
AU - Liu, Bin
AU - Zhao, Zhonghua
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - VPN (Virtual Private Network) is currently the primary means by which users access network information across borders. Although the number of VPN users is very large, there is little research about them. Consequently, a solution is needed to strengthen the ability to observe and discover cross-border access users. This paper proposes a novel user identification algorithm based on massive multi-site VPN logs. First, a formal description of the VPN user identification problem is given; then the probability distribution of the VPN logs is analyzed quantitatively in two dimensions: usernames and CIP (Client Internet Protocol) addresses. Based on this analysis, we give a solution to the problems in VPN user identification and propose a user identification algorithm that combines access vector similarity, username similarity, the number of regions from which users access the internet, and connected subgraphs. We then test the algorithm on two months of VPN logs; the results demonstrate the effectiveness and correctness of the user identification algorithm.
AB - VPN (Virtual Private Network) is currently the primary means by which users access network information across borders. Although the number of VPN users is very large, there is little research about them. Consequently, a solution is needed to strengthen the ability to observe and discover cross-border access users. This paper proposes a novel user identification algorithm based on massive multi-site VPN logs. First, a formal description of the VPN user identification problem is given; then the probability distribution of the VPN logs is analyzed quantitatively in two dimensions: usernames and CIP (Client Internet Protocol) addresses. Based on this analysis, we give a solution to the problems in VPN user identification and propose a user identification algorithm that combines access vector similarity, username similarity, the number of regions from which users access the internet, and connected subgraphs. We then test the algorithm on two months of VPN logs; the results demonstrate the effectiveness and correctness of the user identification algorithm.
KW - Client IP (internet protocol) addresses feature
KW - User identification
KW - Username feature
KW - VPN (virtual private network)
UR - http://www.scopus.com/inward/record.url?scp=85047742263&partnerID=8YFLogxK
U2 - 10.1109/ICCT.2017.8359858
DO - 10.1109/ICCT.2017.8359858
M3 - Conference contribution
AN - SCOPUS:85047742263
T3 - International Conference on Communication Technology Proceedings, ICCT
SP - 1372
EP - 1381
BT - 2017 17th IEEE International Conference on Communication Technology, ICCT 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Communication Technology, ICCT 2017
Y2 - 27 October 2017 through 30 October 2017
ER -