Malware Visualization for Fine-Grained Classification

Jianwen Fu; Jingfeng Xue; Yong Wang; Zhenyan Liu; Chun Shan

doi:10.1109/ACCESS.2018.2805301

Jianwen Fu, Jingfeng Xue, Yong Wang^*, Zhenyan Liu, Chun Shan

^*Corresponding author for this work

School of Computer Science

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

109 Citations (Scopus)

Abstract

Due to the rapid rise of automated tools, the number of malware variants has increased dramatically, which poses a tremendous threat to the security of the Internet. Recently, some methods for quick analysis of malware have been proposed, but these methods usually require a large computational overhead and cannot classify samples accurately for large-scale and complex malware data set. Therefore, in this paper, we propose a new visualization method for characterizing malware globally and locally to achieve fast and effective fine-grained classification. We take a new approach to visualize malware as RGB-colored images and extract global features from the images. Gray-level co-occurrence matrix and color moments are selected to describe the global texture features and color features, respectively, which produces low-dimensional feature data to reduce the complexity of training model. Moreover, a series of special byte sequences are extracted from code sections and data sections of malware and are processed into feature vectors by Simhash as the local features. Finally, we merge the global features and local features to perform malware classification using random forest, K-nearest neighbor, and support vector machine. Experimental results show that our approach obtains the highest accuracy of 97.47% and the highest F-measure of 96.85% of 7087 samples from 15 families. Color features and the local features effectively assist in the classification based on texture features and enhance the F-measure by 3.4% and 1%, respectively. Overall, the combination of global features and local features can realize fine-grained malware classification with low computational cost.
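
The two feature types named in the abstract, color moments (global) and Simhash fingerprints (local), can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how bytes are mapped to pixels or how byte sequences are hashed, so the three-bytes-per-pixel mapping, the MD5 token hash, the 64-bit fingerprint width, and all function names below are assumptions made only for illustration.

```python
import hashlib
import math


def bytes_to_rgb(data: bytes) -> list:
    """Group consecutive bytes into (R, G, B) pixels.

    Assumed mapping (not taken from the paper): three bytes per pixel,
    zero-padded at the end so the length is a multiple of three.
    """
    padded = data + b"\x00" * (-len(data) % 3)
    return [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]


def color_moments(pixels: list) -> list:
    """Return mean, standard deviation, and skewness for each channel.

    Skewness here is the signed cube root of the third central moment,
    a common convention for color moments. The result has 9 values.
    """
    if not pixels:
        raise ValueError("at least one pixel is required")
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        skewness = math.copysign(abs(third) ** (1 / 3), third)
        features += [mean, math.sqrt(variance), skewness]
    return features


def simhash(tokens: list, bits: int = 64) -> int:
    """Compute a Simhash fingerprint over byte-string tokens.

    Each token is hashed (MD5 is an arbitrary choice; bits must be a
    multiple of 8 and at most 128). Every set bit votes +1 and every
    clear bit votes -1; the fingerprint keeps the bits whose vote
    total is positive.
    """
    votes = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token).digest()[: bits // 8]
        h = int.from_bytes(digest, "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if votes[i] > 0)


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Under these assumptions, a merged feature vector for one sample could be formed by concatenating the 9 color moments with the 64 individual bits of the fingerprint. The texture features from the gray-level co-occurrence matrix are omitted here.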

Original language	English
Pages (from-to)	14510-14523
Number of pages	14
Journal	IEEE Access
Volume	6
DOI	https://doi.org/10.1109/ACCESS.2018.2805301
Publication status	Published - 11 Feb 2018

Cite this

@article{3a93093a731c474aba2136be8004be0f,

title = "Malware Visualization for Fine-Grained Classification",

abstract = "Due to the rapid rise of automated tools, the number of malware variants has increased dramatically, which poses a tremendous threat to the security of the Internet. Recently, some methods for quick analysis of malware have been proposed, but these methods usually require a large computational overhead and cannot classify samples accurately for large-scale and complex malware data set. Therefore, in this paper, we propose a new visualization method for characterizing malware globally and locally to achieve fast and effective fine-grained classification. We take a new approach to visualize malware as RGB-colored images and extract global features from the images. Gray-level co-occurrence matrix and color moments are selected to describe the global texture features and color features, respectively, which produces low-dimensional feature data to reduce the complexity of training model. Moreover, a series of special byte sequences are extracted from code sections and data sections of malware and are processed into feature vectors by Simhash as the local features. Finally, we merge the global features and local features to perform malware classification using random forest, K-nearest neighbor, and support vector machine. Experimental results show that our approach obtains the highest accuracy of 97.47% and the highest F-measure of 96.85% of 7087 samples from 15 families. Color features and the local features effectively assist in the classification based on texture features and enhance the F-measure by 3.4% and 1%, respectively. Overall, the combination of global features and local features can realize fine-grained malware classification with low computational cost.",

keywords = "Malware visualization, RGB-colored image, fine-grained classification",

author = "Jianwen Fu and Jingfeng Xue and Yong Wang and Zhenyan Liu and Chun Shan",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2018",

month = feb,

day = "11",

doi = "10.1109/ACCESS.2018.2805301",

language = "English",

volume = "6",

pages = "14510--14523",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Malware Visualization for Fine-Grained Classification

AU - Fu, Jianwen

AU - Xue, Jingfeng

AU - Wang, Yong

AU - Liu, Zhenyan

AU - Shan, Chun

PY - 2018/2/11

Y1 - 2018/2/11

N2 - Due to the rapid rise of automated tools, the number of malware variants has increased dramatically, which poses a tremendous threat to the security of the Internet. Recently, some methods for quick analysis of malware have been proposed, but these methods usually require a large computational overhead and cannot classify samples accurately for large-scale and complex malware data set. Therefore, in this paper, we propose a new visualization method for characterizing malware globally and locally to achieve fast and effective fine-grained classification. We take a new approach to visualize malware as RGB-colored images and extract global features from the images. Gray-level co-occurrence matrix and color moments are selected to describe the global texture features and color features, respectively, which produces low-dimensional feature data to reduce the complexity of training model. Moreover, a series of special byte sequences are extracted from code sections and data sections of malware and are processed into feature vectors by Simhash as the local features. Finally, we merge the global features and local features to perform malware classification using random forest, K-nearest neighbor, and support vector machine. Experimental results show that our approach obtains the highest accuracy of 97.47% and the highest F-measure of 96.85% of 7087 samples from 15 families. Color features and the local features effectively assist in the classification based on texture features and enhance the F-measure by 3.4% and 1%, respectively. Overall, the combination of global features and local features can realize fine-grained malware classification with low computational cost.

AB - Due to the rapid rise of automated tools, the number of malware variants has increased dramatically, which poses a tremendous threat to the security of the Internet. Recently, some methods for quick analysis of malware have been proposed, but these methods usually require a large computational overhead and cannot classify samples accurately for large-scale and complex malware data set. Therefore, in this paper, we propose a new visualization method for characterizing malware globally and locally to achieve fast and effective fine-grained classification. We take a new approach to visualize malware as RGB-colored images and extract global features from the images. Gray-level co-occurrence matrix and color moments are selected to describe the global texture features and color features, respectively, which produces low-dimensional feature data to reduce the complexity of training model. Moreover, a series of special byte sequences are extracted from code sections and data sections of malware and are processed into feature vectors by Simhash as the local features. Finally, we merge the global features and local features to perform malware classification using random forest, K-nearest neighbor, and support vector machine. Experimental results show that our approach obtains the highest accuracy of 97.47% and the highest F-measure of 96.85% of 7087 samples from 15 families. Color features and the local features effectively assist in the classification based on texture features and enhance the F-measure by 3.4% and 1%, respectively. Overall, the combination of global features and local features can realize fine-grained malware classification with low computational cost.

KW - Malware visualization

KW - RGB-colored image

KW - fine-grained classification

UR - http://www.scopus.com/inward/record.url?scp=85042078959&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2018.2805301

DO - 10.1109/ACCESS.2018.2805301

M3 - Article

AN - SCOPUS:85042078959

SN - 2169-3536

VL - 6

SP - 14510

EP - 14523

JO - IEEE Access

JF - IEEE Access

ER -
