Research on malicious code analysis method based on semi-supervised learning

Tingting He, Jingfeng Xue, Jianwen Fu, Yong Wang*, Chun Shan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Research on malicious code classification helps researchers understand attack characteristics quickly and helps reduce losses to users and even to states. Currently, most malware classification methods are based on supervised learning algorithms, which perform poorly when only a small number of labeled samples is available. In this paper, we therefore propose a new malware classification method based on a semi-supervised learning algorithm. First, we extract impactful static and dynamic features and serialize them to obtain high-dimensional features. Then, we select features with an Ensemble Feature Grader consisting of Information Gain, Random Forest, and Logistic Regression with L1 and L2 regularization, and reduce the dimension again with PCA. Finally, we use the Learning with Local and Global Consistency (LLGC) algorithm combined with K-means to classify malware. An experimental comparison of SVM, LLGC, and K-means + LLGC shows that, with this feature extraction, feature reduction, and classification method, K-means + LLGC is superior to LLGC in both classification accuracy and efficiency: accuracy increases by 2% to 3%, and it exceeds that of SVM when the number of labeled samples is small.
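
To make the final classification step more concrete, the following is a minimal, dependency-free sketch of the standard LLGC label-propagation iteration, F ← αSF + (1 − α)Y, on a toy dataset. It is not the authors' implementation: the abstract does not say how K-means is combined with LLGC, so that step, along with feature extraction, the Ensemble Feature Grader, and PCA, is left out. The function name `llgc`, the parameters `sigma`, `alpha`, and `n_iter`, and the toy points are all assumptions made for illustration.

```python
import math


def llgc(points, labels, n_classes, sigma=1.0, alpha=0.9, n_iter=200):
    """Propagate a few known labels to all points (standard LLGC iteration).

    points: list of feature vectors (tuples of floats).
    labels: class index for labeled points, None for unlabeled ones.
    Returns one predicted class index per point.
    """
    n = len(points)

    # Gaussian affinity matrix W with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

    # Initial label matrix Y: one-hot rows for labeled points, zeros otherwise.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]

    # Iterate F <- alpha * S * F + (1 - alpha) * Y.
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)]
             for i in range(n)]

    # Each point takes the class with the largest propagated score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Toy data (hypothetical): two well-separated groups, one labeled point each.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
labels = [0, None, None, 1, None, None]
predicted = llgc(points, labels, n_classes=2)
print(predicted)
```

With only one labeled sample per group, the unlabeled points inherit the label of their nearest dense neighborhood, which is the small-labeled-sample setting the abstract targets. In practice the inputs would be the reduced feature vectors rather than raw 2-D points.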

Original language: English
Title of host publication: Trusted Computing and Information Security - 11th Chinese Conference, CTCIS 2017, Proceedings
Editors: Fei Yan, Ming Xu, Shaojing Fu, Zheng Qin
Publisher: Springer Verlag
Pages: 227-241
Number of pages: 15
ISBN (Print): 9789811070792
Publication status: Published - 2017
Event: 11th Chinese Conference on Trusted Computing and Information Security, CTCIS 2017 - Changsha, China
Duration: 14 Sep 2017 to 17 Sep 2017

Publication series

Name: Communications in Computer and Information Science
Volume: 704
ISSN (Print): 1865-0929

Conference

Conference: 11th Chinese Conference on Trusted Computing and Information Security, CTCIS 2017
Country/Territory: China
City: Changsha
Period: 14/09/17 to 17/09/17
