A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection
- Source :
- IEEE Access, Vol 7, Pp 32765-32782 (2019)
- Publication Year :
- 2019
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2019.
Abstract
- Attackers commonly use a command and control (C2) server to manage communication with malware. To carry out an attack, threat actors often employ a domain generation algorithm (DGA), which lets malware reach the C2 server by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first separate DGA domains from normal domains and then use a clustering method to identify the algorithms that generated those DGA domains. In the prediction model, a time-series model based on the hidden Markov model (HMM) is constructed to predict incoming domain features. Furthermore, we build a deep neural network (DNN) model to enhance the proposed framework by handling the large dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. Specifically, we achieve accuracies of 95.89% for classification in the framework, 97.79% for classification with the DNN model, 92.45% for second-level clustering, and 95.21% for HMM prediction in the framework.
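As a rough illustration of the first level described in the abstract (separating DGA domains from normal ones), the sketch below computes a few lexical features of a domain name and applies a toy rule. The abstract does not list the features or the classifier the authors use, so everything here is an assumption: the feature set, the helper names `shannon_entropy`, `domain_features`, and `looks_generated`, and the entropy threshold of 3.0 are hypothetical stand-ins for a trained model.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Character-level Shannon entropy of a string, in bits."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def domain_features(domain: str) -> dict:
    """Illustrative lexical features (assumed, not taken from the paper).

    Simplification: only the leftmost label is inspected, so
    "www.example.com" would be scored on "www".
    """
    label = domain.lower().split(".")[0]
    n = len(label)
    return {
        "length": n,
        "entropy": shannon_entropy(label),
        "vowel_ratio": sum(ch in "aeiou" for ch in label) / n if n else 0.0,
        "digit_ratio": sum(ch.isdigit() for ch in label) / n if n else 0.0,
    }


def looks_generated(domain: str, entropy_threshold: float = 3.0) -> bool:
    """Toy first-level decision: flag high-entropy labels.

    The threshold is a hypothetical placeholder; a real system would
    feed the features into a classifier trained on labeled traffic.
    """
    return domain_features(domain)["entropy"] > entropy_threshold


if __name__ == "__main__":
    for name in ("google.com", "x7qk2vz9pl.com"):
        print(name, domain_features(name), looks_generated(name))
```

In a two-level setup like the one the abstract outlines, domains flagged at this stage would then be passed to a clustering step that groups them by the algorithm likely to have generated them; that second stage is not shown here.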
- Subjects :
- Domain generation algorithm
General Computer Science
Computer science
networking
security
02 engineering and technology
computer.software_genre
Machine learning
Malware
Domain (software engineering)
0202 electrical engineering, electronic engineering, information engineering
General Materials Science
Hidden Markov model
Cluster analysis
Artificial neural network
business.industry
Deep learning
General Engineering
020206 networking & telecommunications
020201 artificial intelligence & image processing
lcsh:Electrical engineering. Electronics. Nuclear engineering
Artificial intelligence
business
lcsh:TK1-9971
computer
Details
- ISSN :
- 2169-3536
- Volume :
- 7
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....7b4d9affa8558cd743a1ef093750a75e