MaliCage: A packed malware family classification framework based on DNN and GAN

Xianwei Gao*, Changzhen Hu, Chun Shan, Weijie Han

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

To evade security detection, attackers often wrap the original malicious code in a deceptive packer. The coexistence of unpacked and packed samples of the same family needs special attention in malware detection, because the packer changes the features of the malware and can disturb the predictions of a malware classifier. State-of-the-art studies of malware detection mainly focus on whether malware is packed or which type of packer is used; the ability to identify the family of packed malware is still insufficient. Motivated by these challenges, we propose a novel packed malware family classification framework called MaliCage, whose goal is to classify packed malware accurately. MaliCage consists of three core modules: a packer detector, a malware classifier, and a packer generative adversarial network (GAN). The packer detector serves as the first step of the framework and identifies whether a malware sample is packed. After the packed samples are distinguished, dynamic features extracted from a sandbox are fed to the malware classifier, which is based on deep neural networks (DNN) and can classify unpacked and packed malware simultaneously. Furthermore, the packer GAN generates fake packed samples to alleviate underfitting of the malware classifier. We built a single-packer dataset and a multi-packer dataset to evaluate the framework. In the single-packer experiment, 10 classes of malware samples packed with UPX were evaluated. The accuracy of the malware classifier when using only real packed samples was 91.66%. After fake packed samples generated by the packer GAN were introduced, the accuracy of the packed malware classifier reached 97.8%. In the multi-packer scenario, our method can also accurately classify benign programs, unpacked malware, and malware packed by several common packers. The validation results show that MaliCage not only effectively mitigates the impact of packed malware on machine learning models but also improves detection accuracy.
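
The data flow among the three modules described in the abstract might be sketched as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not specify any interfaces, so the names `augment_training_set`, `classify`, `detect_packer`, `predict_family`, and `generate_fake` are hypothetical, and the assumption that the packer GAN can be asked for samples of a given family is ours. The detector, classifier, and generator are passed in as plain callables so that the sketch shows the ordering of the steps without claiming anything about their internals.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]      # dynamic features from a sandbox run
Labeled = Tuple[Features, str]  # (features, family label)


@dataclass
class Prediction:
    packed: bool  # output of the packer detector
    family: str   # output of the malware classifier


def augment_training_set(
    real_packed: List[Labeled],
    generate_fake: Callable[[str], Features],  # hypothetical stand-in for the packer GAN
    fakes_per_family: int,
) -> List[Labeled]:
    """Return the real packed samples plus generated ones for each family seen."""
    families = sorted({label for _, label in real_packed})
    fakes = [
        (generate_fake(family), family)
        for family in families
        for _ in range(fakes_per_family)
    ]
    return real_packed + fakes


def classify(
    raw_bytes: bytes,
    features: Features,
    detect_packer: Callable[[bytes], bool],     # hypothetical packer detector
    predict_family: Callable[[Features], str],  # hypothetical DNN classifier
) -> Prediction:
    """Run the packer detector first, then the family classifier."""
    packed = detect_packer(raw_bytes)
    return Prediction(packed=packed, family=predict_family(features))
```

With toy stand-ins, for example a detector that only checks for the byte string `UPX!` and a classifier that always returns the same label, `classify(b"..UPX!..", [0.1], lambda b: b"UPX!" in b, lambda f: "famA")` yields a `Prediction` flagged as packed. A real detector and classifier would replace these placeholders.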

Original language: English
Article number: 103267
Journal: Journal of Information Security and Applications
Volume: 68
DOI: https://doi.org/10.1016/j.jisa.2022.103267
Publication status: Published - Aug 2022

Keywords

  • Classification
  • DNN
  • GAN
  • Packed malware

Cite this

Gao, X., Hu, C., Shan, C., & Han, W. (2022). MaliCage: A packed malware family classification framework based on DNN and GAN. Journal of Information Security and Applications, 68, Article 103267. https://doi.org/10.1016/j.jisa.2022.103267