TY - JOUR
T1 - Automatic classification method for software vulnerability based on deep neural network
AU - Huang, Guoyan
AU - Li, Yazhou
AU - Wang, Qian
AU - Ren, Jiadong
AU - Cheng, Yongqiang
AU - Zhao, Xiaolin
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019
Y1 - 2019
N2 - Software vulnerabilities are the root causes of various security risks. Once a vulnerability is exploited by a malicious attack, it will greatly compromise the safety of the system and may even cause catastrophic losses. Hence, automatic classification methods are desirable to effectively manage vulnerabilities in software, improve the security performance of the system, and reduce the risk of the system being attacked and damaged. In this paper, a new automatic vulnerability classification model (TFI-DNN) is proposed. The model is built upon term frequency-inverse document frequency (TF-IDF), information gain (IG), and a deep neural network (DNN): TF-IDF is used to calculate the frequency and weight of each word in the vulnerability descriptions; IG is used for feature selection to obtain an optimal set of feature words; and the DNN is used to construct an automatic vulnerability classifier to achieve effective vulnerability classification. The National Vulnerability Database of the United States is used to validate the effectiveness of the proposed model. Compared to SVM, Naive Bayes, and KNN, the TFI-DNN model achieves better performance on multiple evaluation metrics, including accuracy, recall, precision, and F1-score.
AB - Software vulnerabilities are the root causes of various security risks. Once a vulnerability is exploited by a malicious attack, it will greatly compromise the safety of the system and may even cause catastrophic losses. Hence, automatic classification methods are desirable to effectively manage vulnerabilities in software, improve the security performance of the system, and reduce the risk of the system being attacked and damaged. In this paper, a new automatic vulnerability classification model (TFI-DNN) is proposed. The model is built upon term frequency-inverse document frequency (TF-IDF), information gain (IG), and a deep neural network (DNN): TF-IDF is used to calculate the frequency and weight of each word in the vulnerability descriptions; IG is used for feature selection to obtain an optimal set of feature words; and the DNN is used to construct an automatic vulnerability classifier to achieve effective vulnerability classification. The National Vulnerability Database of the United States is used to validate the effectiveness of the proposed model. Compared to SVM, Naive Bayes, and KNN, the TFI-DNN model achieves better performance on multiple evaluation metrics, including accuracy, recall, precision, and F1-score.
KW - Deep neural network
KW - information gain
KW - software security
KW - vulnerability classification
UR - http://www.scopus.com/inward/record.url?scp=85063198279&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2900462
DO - 10.1109/ACCESS.2019.2900462
M3 - Article
AN - SCOPUS:85063198279
SN - 2169-3536
VL - 7
SP - 28291
EP - 28298
JO - IEEE Access
JF - IEEE Access
M1 - 8654631
ER -