Byte-Level Function-Associated Method for Malware Detection

  • Jingwei Hao*
  • Senlin Luo
  • Limin Pan
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Byte streams are widely used in malware detection because they can be analyzed without reverse engineering. However, existing byte-stream methods extract features indiscriminately: they ignore the fact that bytes in different segments serve different functions, and they do not tailor feature extraction to the different ways byte semantics are represented, which results in byte semantic confusion. To address this issue, this paper proposes an enhanced adversarial byte-function-associated method for malware backdoor attack that categorizes bytes into three functions: structure, code, and data. The MinHash algorithm, grayscale mapping, and state transition probability statistics are then used to capture byte semantics from the perspectives of text signature, spatial structure, and statistical properties, respectively, to increase the accuracy of byte semantic representation. Finally, a three-channel malware feature image is constructed from the semantics of the different byte functions, and a convolutional neural network is applied for detection. Experiments on multiple data sets from 2018 to 2021 show that the method can effectively combine byte functions to achieve targeted feature extraction, avoid byte semantic confusion, and improve the accuracy of malware detection.
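The three feature views named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the row width, the shingle length, the number of hash functions, and the use of salted BLAKE2b as the hash family are all assumptions chosen for illustration, and the sketch does not reproduce how the paper assigns techniques to byte segments or assembles the three-channel image.

```python
import hashlib


def grayscale_rows(data: bytes, width: int = 16) -> list[list[int]]:
    """Spatial view: each byte value (0-255) becomes one pixel intensity."""
    padded = data + bytes(-len(data) % width)  # zero-pad the final row
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def transition_probabilities(data: bytes) -> dict[tuple[int, int], float]:
    """Statistical view: estimate P(next byte | current byte) from adjacent pairs."""
    pair_counts: dict[tuple[int, int], int] = {}
    row_totals: dict[int, int] = {}
    for cur, nxt in zip(data, data[1:]):
        pair_counts[(cur, nxt)] = pair_counts.get((cur, nxt), 0) + 1
        row_totals[cur] = row_totals.get(cur, 0) + 1
    return {pair: n / row_totals[pair[0]] for pair, n in pair_counts.items()}


def minhash_signature(data: bytes, num_hashes: int = 8, shingle: int = 4) -> list[int]:
    """Signature view: MinHash over the set of byte n-grams (hypothetical parameters)."""
    shingles = {data[i:i + shingle] for i in range(len(data) - shingle + 1)}
    if not shingles:
        raise ValueError("input is shorter than one shingle")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # one salt per simulated hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "little"
            )
            for s in shingles
        ))
    return signature
```

In this sketch, the fraction of positions at which two MinHash signatures agree estimates the Jaccard similarity of the underlying byte n-gram sets, which is what makes the signature usable as a compact text-style fingerprint.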

Original language: English
Pages (from-to): 719-734
Number of pages: 16
Journal: Computer Systems Science and Engineering
Volume: 46
Issue number: 1
Publication status: Published - 2023

Keywords

  • Byte function
  • malware backdoor attack
  • semantic representation model
  • visualization
