BinDeep: A deep learning approach to binary code similarity detection

Donghai Tian, Xiaoqi Jia, Rui Ma*, Shuke Liu, Wenjing Liu, Changzhen Hu

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

21 Citations (Scopus)

Abstract

Binary code similarity detection (BCSD) plays an important role in malware analysis and vulnerability discovery. Existing methods rely mainly on expert knowledge, which may not be reliable in some cases. More importantly, the detection accuracy (or performance) of these methods is not satisfactory. To address these issues, we propose BinDeep, a deep learning approach for binary code similarity detection. This method first extracts the instruction sequence from a binary function and then uses an instruction embedding model to vectorize the instruction features. Next, BinDeep applies a Recurrent Neural Network (RNN) deep learning model to identify the specific types of the two functions for later comparison. According to the type information, BinDeep selects the corresponding deep learning model for similarity comparison. Specifically, BinDeep uses Siamese neural networks that combine LSTM and CNN to measure the similarity of two target functions. Unlike traditional deep learning models, our hybrid model takes advantage of both the spatial structure learning of the CNN and the sequence learning of the LSTM. The evaluation shows that our approach achieves good BCSD results on cross-architecture, cross-compiler, cross-optimization, and cross-version binary code.
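The comparison step described in the abstract follows a Siamese pattern: both functions pass through the same encoder, and the two resulting vectors are compared. The sketch below illustrates only that pattern and is not BinDeep's implementation. The hash-based `embed_token` stands in for a trained instruction embedding model, mean pooling in `encode_function` stands in for the LSTM and CNN hybrid, and cosine similarity is an assumed comparison measure; all names and the `EMBED_DIM` value are hypothetical.

```python
import hashlib
import math
from typing import List

EMBED_DIM = 16  # hypothetical embedding size, chosen only for this sketch


def embed_token(token: str) -> List[float]:
    """Map an instruction token to a fixed pseudo-random vector.

    Assumption: this replaces a trained instruction embedding model.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Scale each of the first EMBED_DIM bytes from [0, 255] into [-1, 1].
    return [(b / 127.5) - 1.0 for b in digest[:EMBED_DIM]]


def encode_function(instructions: List[str]) -> List[float]:
    """Shared encoder: average the token embeddings of one function.

    Assumption: mean pooling is a placeholder for a learned sequence model.
    """
    if not instructions:
        return [0.0] * EMBED_DIM
    vectors = [embed_token(tok) for tok in instructions]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def siamese_similarity(func_a: List[str], func_b: List[str]) -> float:
    """Encode both functions with the same encoder, then compare them."""
    return cosine_similarity(encode_function(func_a), encode_function(func_b))


if __name__ == "__main__":
    # Hypothetical mnemonic sequences for two small functions.
    f1 = ["push", "mov", "sub", "call", "add", "pop", "ret"]
    f2 = ["push", "mov", "sub", "call", "pop", "ret"]
    print(siamese_similarity(f1, f2))
```

Because the encoder is shared, the score is symmetric in its two arguments, and identical inputs score 1.0 up to floating-point error. Unlike an LSTM, the mean-pooling placeholder ignores instruction order, so it cannot distinguish a sequence from its permutation.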

Original language: English
Article number: 114348
Journal: Expert Systems with Applications
Volume: 168
DOI
Publication status: Published - 15 Apr 2021
