A multistage intrusion detection method for alleviating class overlapping problem

He Pang, Fusheng Jin*, Mengnan Chen, Yutong Jiang, Ye Yuan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

An intrusion detection system (IDS) identifies abnormal network traffic and attacks and is an important means of network security defense. However, intrusion data are often disguised as normal data during transmission, which makes them harder to classify. In addition, existing packet-based and flow-based feature extraction methods yield low-dimensional features, so samples from different categories can share the same features, which causes class overlapping. As a result, some highly concealed attacks are detected poorly. To clarify the terms, overlapping samples are those that lie in the overlap between erroneous samples and correct samples; nonoverlapping samples are those in the test set that do not match the characteristics of the already identified overlapping samples. To address these problems, this paper proposes a multistage intrusion detection method. In the first stage, an existing high-performing intrusion detection model (OBLR) predicts the data. In the second stage, for the overlapping data among the confusing data, the method learns the distribution of each feature group from a randomly divided “intermediary set” and uses this prior distribution knowledge to predict the overlapping samples, classifying them efficiently without increasing the computational burden of the model. In the third stage, for the nonoverlapping data among the confusing data, KPCA (kernel principal component analysis) dimension elevation captures more detailed differences between samples, and a GMM (Gaussian mixed model) is combined with the “representative samples” proposed in this paper to assist the classifier. In addition, all the base classifiers are integrated through LTR (learning to rank) to improve classification of the nonoverlapping data among the confusing data. Experimental results show that the method achieves 99.71% accuracy and a 0.158% false positive rate on the complex intrusion dataset UNSW-NB15, outperforming existing methods. In particular, it raises accuracy by 38.1% on the confusing samples that the existing model cannot detect correctly.
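For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and false positive rate are commonly computed from confusion-matrix counts. It assumes the standard binary definitions with attacks as the positive class; the abstract does not state the exact definitions used, and the function names and counts below are hypothetical illustrations, not values from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp, tn):
    """Fraction of normal (negative) samples wrongly flagged as attacks."""
    return fp / (fp + tn)


# Hypothetical counts for illustration only; NOT results from the paper.
tp, tn, fp, fn = 900, 990, 10, 100

acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
print(f"accuracy = {acc:.2%}, false positive rate = {fpr:.2%}")
```

Under these definitions, a low false positive rate means that little normal traffic is mislabeled as an attack, which is a separate question from overall accuracy.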

Original language: English
Article number: 106167
Journal: Neural Computing and Applications
DOI: https://doi.org/10.1007/s00521-024-10903-x
Publication status: Accepted/In press - 2024

Keywords

  • Gaussian mixed model
  • Intrusion detection
  • Kernel principal component analysis
  • Learning to rank

Cite this

Pang, H., Jin, F., Chen, M., Jiang, Y., & Yuan, Y. (Accepted/In press). A multistage intrusion detection method for alleviating class overlapping problem. Neural Computing and Applications, Article 106167. https://doi.org/10.1007/s00521-024-10903-x