Abstract
We study feature selection (FS) for flow-based intrusion detection and propose a deterministic hybrid-FS method that fuses Mutual Information, Random-Forest, and XGBoost importances under a simplex search with a single threshold. Using CIC-IDS-2017, CSE-CIC-IDS2018, and NF-UNSW-NB15, we evaluate ten FS techniques paired with six ensembles under a leakage-safe protocol. The hybrid-FS consistently matches or exceeds the best single selectors while reducing feature count (e.g., 78 → 31) and improving runtime. Throughput rises by roughly 9–10%, and per-flow latency drops from 0.44 to 0.40 ms (p50) and from 1.40 to 1.20 ms (p99), reported as means ± 95% CIs with paired tests. False-positive rate (FPR) decreases by 15–19% (≈ 22 fewer false alarms per hour at 100k flows/h). Against representative PSO/GA hybrids, our fusion attains small but consistent macro-F1 gains and 15–25% FPR reductions at comparable latency. We clarify adversarial robustness with an explicit FGSM feature-space threat model and DeepPackGen configuration, and we diagnose cross-dataset shift and apply lightweight mitigations. A 24-hour SOC replay links the FPR reduction to analyst time savings (2.5–3.7 hours/day) without sacrificing macro-F1 or AUROC. The results position deterministic, compact FS as a practical choice for inline IDS where tail latency and alert volume matter.
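The fusion step described in the abstract (three importance vectors combined by weights on a simplex, then cut by a single threshold) can be illustrated with a minimal sketch. The abstract does not specify the normalization, the search grid, the threshold value, or the objective, so everything below is an assumption made for illustration: the function names, the min-max scaling, the grid resolution `steps`, the default `threshold`, and the caller-supplied `evaluate` function (for example, validation macro-F1 computed on training folds only) are all hypothetical. The importance vectors are taken as plain lists, so no particular library API is implied.

```python
def min_max(scores):
    """Scale scores to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def simplex_grid(steps):
    """Yield weight triples with non-negative entries summing to 1, in a fixed order."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            yield (i / steps, j / steps, k / steps)


def fuse(mi, rf, xgb, weights):
    """Weighted sum of the three per-feature importance vectors."""
    w1, w2, w3 = weights
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(mi, rf, xgb)]


def select(fused, threshold):
    """Indices of features whose fused score reaches the single threshold."""
    return [idx for idx, s in enumerate(fused) if s >= threshold]


def search(mi, rf, xgb, evaluate, threshold=0.5, steps=10):
    """Exhaustive search over the weight grid (an assumed search strategy).

    `evaluate` is a hypothetical callback mapping a list of feature indices
    to a score (higher is better). Ties are broken in favor of fewer
    features, then by grid order, so the result is deterministic.
    Returns (score, -n_features, weights, selected) or None.
    """
    mi, rf, xgb = min_max(mi), min_max(rf), min_max(xgb)
    best = None
    for w in simplex_grid(steps):
        selected = select(fuse(mi, rf, xgb, w), threshold)
        if not selected:
            continue  # skip weightings that select no features
        candidate = (evaluate(selected), -len(selected), w, selected)
        if best is None or candidate[:2] > best[:2]:
            best = candidate
    return best


if __name__ == "__main__":
    # Toy importances for four features; values are made up for illustration.
    mi = [0.9, 0.1, 0.5, 0.0]
    rf = [0.2, 0.8, 0.6, 0.0]
    xgb = [0.1, 0.2, 0.9, 0.0]
    # Toy objective: pretend features 0 and 2 are the informative ones.
    toy_eval = lambda sel: len(set(sel) & {0, 2})
    print(search(mi, rf, xgb, toy_eval))
```

Because the grid is enumerated in a fixed order and no random numbers are involved, repeated runs return the same subset, which is the property the abstract calls deterministic. In a leakage-safe setup, both the importances and `evaluate` would be computed without touching the test split.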
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Dependable and Secure Computing |
| Publication status | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- Ensemble machine learning models
- feature selection techniques
- hyperparameter tuning
- intrusion detection systems (IDS)
Title
Evaluation to Integration: Hybrid Feature Selection Framework With Ensemble Machine Learning for Intrusion Detection