TY - JOUR
T1 - Building an Effective Intrusion Detection System by Using Hybrid Data Optimization Based on Machine Learning Algorithms
AU - Ren, Jiadong
AU - Guo, Jiawei
AU - Qian, Wang
AU - Yuan, Huang
AU - Hao, Xiaobing
AU - Jingjing, Hu
N1 - Publisher Copyright: © 2019 Jiadong Ren et al.
PY - 2019
Y1 - 2019
AB - An intrusion detection system (IDS) can effectively identify anomalous behaviors in a network; however, it still suffers from a low detection rate and a high false alarm rate, especially for anomalies with few records. In this paper, we propose an effective IDS, called DO-IDS, that uses hybrid data optimization consisting of two parts: data sampling and feature selection. In data sampling, Isolation Forest (iForest) is used to eliminate outliers, a genetic algorithm (GA) to optimize the sampling ratio, and a Random Forest (RF) classifier as the evaluation criterion to obtain the optimal training dataset. In feature selection, GA and RF are used again to obtain the optimal feature subset. Finally, an intrusion detection system based on RF is built using the optimal training dataset obtained by data sampling and the features selected by feature selection. Experiments are carried out on the UNSW-NB15 dataset. Compared with other algorithms, the model has clear advantages in detecting rare anomalous behaviors.
UR - http://www.scopus.com/inward/record.url?scp=85068853458&partnerID=8YFLogxK
U2 - 10.1155/2019/7130868
DO - 10.1155/2019/7130868
M3 - Article
AN - SCOPUS:85068853458
SN - 1939-0114
VL - 2019
JO - Security and Communication Networks
JF - Security and Communication Networks
M1 - 7130868
ER -