Detecting malicious URLs using a deep learning approach based on stacked denoising autoencoder

Huaizhi Yan*, Xin Zhang, Jiangwei Xie, Changzhen Hu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

4 Citations (Scopus)

Abstract

As a source of spam, phishing, malware, and many other attacks, malicious URLs are a chronic and complicated problem on the Internet. Machine learning approaches have proven effective and achieve high accuracy in detecting malicious URLs. However, the tedious process of extracting features from URLs and the high dimensionality of the feature vectors make implementation time-consuming. This paper presents a deep learning method that uses a stacked denoising autoencoder (SdA) model to learn and detect intrinsic malicious features. We employ an SdA network to analyze URLs and extract features automatically. A logistic regression classifier is then used to distinguish malicious from benign URLs, so detection models can be generated without manual feature engineering. We implemented our network model using Keras, a high-level neural networks API, with a TensorFlow backend, an open-source deep learning library. Five datasets were used, and four other methods were compared with our model. Our architecture achieves an accuracy of 98.25% and a micro-averaged F1 score of 0.98 when tested on a mixed dataset containing around 2 million samples.
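
The abstract reports both an accuracy and a micro-averaged F1 score. As a minimal sketch of how those two metrics relate, the following Python computes both for a two-class (malicious vs. benign) problem. The function names and the toy labels are illustrative assumptions and are not taken from the paper, and the paper's own evaluation code may differ. When every sample has exactly one label and all classes are included in the average, micro-averaged F1 equals accuracy, which is consistent with the reported 98.25% and 0.98.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted class p, but it was wrong
            fn[t] += 1          # true class t was missed
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (not data from the paper): 1 = malicious, 0 = benign.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(micro_f1(y_true, y_pred))   # 6 correct out of 8 -> 0.75
print(accuracy(y_true, y_pred))   # 0.75, identical to micro-F1 here
```

Each misclassified sample adds one false positive (for the predicted class) and one false negative (for the true class), so the pooled precision and recall both reduce to the fraction of correct predictions.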

Original language: English
Title of host publication: Trusted Computing and Information Security - 12th Chinese Conference, CTCIS 2018, Revised Selected Papers
Editors: Huanguo Zhang, Bo Zhao, Fei Yan
Publisher: Springer Verlag
Pages: 372-388
Number of pages: 17
ISBN (Print): 9789811359125
DOI
Publication status: Published - 2019
Event: 12th Chinese Conference on Trusted Computing and Information Security, CTCIS 2018 - Wuhan, China
Duration: 18 Oct 2018 to 18 Oct 2018

Publication series

Name: Communications in Computer and Information Science
Volume: 960
ISSN (Print): 1865-0929

Conference

Conference: 12th Chinese Conference on Trusted Computing and Information Security, CTCIS 2018
Country/Territory: China
City: Wuhan
Period: 18/10/18 to 18/10/18
