Building an Effective Intrusion Detection System by Using Hybrid Data Optimization Based on Machine Learning Algorithms

Jiadong Ren, Jiawei Guo, Wang Qian, Huang Yuan*, Xiaobing Hao, Hu Jingjing

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

106 Citations (Scopus)

Abstract

An intrusion detection system (IDS) can effectively identify anomalous behavior in a network; however, it still suffers from a low detection rate and a high false alarm rate, especially for anomalies with few records. In this paper, we propose DO-IDS, an effective IDS built on hybrid data optimization, which consists of two parts: data sampling and feature selection. In data sampling, Isolation Forest (iForest) is used to eliminate outliers, a genetic algorithm (GA) to optimize the sampling ratio, and a Random Forest (RF) classifier as the evaluation criterion, yielding the optimal training dataset. In feature selection, GA and RF are used again to obtain the optimal feature subset. Finally, an RF-based intrusion detection system is built using the optimal training dataset obtained by data sampling and the features chosen by feature selection. Experiments are carried out on the UNSW-NB15 dataset. Compared with other algorithms, the model has clear advantages in detecting rare anomalous behaviors.
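
The feature-selection step described in the abstract (a GA searching over feature subsets, scored by a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `ga_select_features`, its parameters, the choice of tournament selection, single-point crossover and elitism, and the `toy_fitness` stand-in are all assumptions made for illustration. In the paper's setting the fitness would presumably be a Random Forest evaluation score on the sampled training data; here a toy function replaces it so the example runs with the standard library alone.

```python
import random
from typing import Callable, List, Tuple

Mask = Tuple[int, ...]  # 1 = feature kept, 0 = feature dropped


def ga_select_features(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search for a high-fitness feature mask with a simple generational GA.

    Hypothetical helper for illustration; requires n_features >= 2.
    """
    rng = random.Random(seed)
    population: List[Mask] = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    best_score = fitness(best)

    def tournament() -> Mask:
        # Binary tournament: keep the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children: List[Mask] = [best]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: toggle each bit with small probability.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit
                for bit in child
            )
            children.append(child)
        population = children
        candidate = max(population, key=fitness)
        score = fitness(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_fitness(mask: Mask) -> float:
    """Assumed stand-in for a classifier score: rewards features 0, 3 and 5
    and charges a small penalty per selected feature."""
    informative = (0, 3, 5)
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = ga_select_features(n_features=8, fitness=toy_fitness)
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
    print("fitness:", round(score, 3))
```

To move closer to the setup the abstract describes, `toy_fitness` could be swapped for a function that trains a Random Forest on the columns selected by the mask and returns a validation metric. The same loop could, under a different encoding, search over per-class sampling ratios instead of feature bits.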

Original language: English
Article number: 7130868
Journal: Security and Communication Networks
Volume: 2019
Publication status: Published - 2019
