Descriptor: "domain generation algorithm" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"domain generation algorithm"' showing total 154 results

Start Over Descriptor "domain generation algorithm"

154 results on '"domain generation algorithm"'

1. KDTM: Multi-Stage Knowledge Distillation Transfer Model for Long-Tailed DGA Detection.

Author: Fan, Baoyu, Ma, Han, Liu, Yue, Yuan, Xiaochen, and Ke, Wei
Subjects: *DEEP learning, *PARETO distribution, *KNOWLEDGE transfer, *SAMPLE size (Statistics), *BOTNETS
Abstract: As the most commonly used attack strategy by Botnets, the Domain Generation Algorithm (DGA) has strong invisibility and variability. Using deep learning models to detect different families of DGA domain names can improve the network defense ability against hackers. However, this task faces an extremely imbalanced sample size among different DGA categories, which leads to low classification accuracy for small sample categories and even classification failure for some categories. To address this issue, we introduce the long-tailed concept and augment the data of small sample categories by transferring pre-trained knowledge. Firstly, we propose the Data Balanced Review Method (DBRM) to reduce the sample size difference between the categories, thus a relatively balanced dataset for transfer learning is generated. Secondly, we propose the Knowledge Transfer Model (KTM) to enhance the knowledge of the small sample categories. KTM uses a multi-stage transfer to transfer weights from the big sample categories to the small sample categories. Furthermore, we propose the Knowledge Distillation Transfer Model (KDTM) to relieve the catastrophic forgetting problem caused by transfer learning, which adds knowledge distillation loss based on the KTM. The experimental results show that KDTM can significantly improve the classification performance of all categories, especially the small sample categories. It can achieve a state-of-the-art macro average F1 score of 84.5%. The robustness of the KDTM model is verified using three DGA datasets that follow the Pareto distributions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

2. Detecting Domain Names Generated by DGAs With Low False Positives in Chinese Domain Names

Author: Huiju Lee, Jeong Do Yoo, Seonghoon Jeong, and Huy Kang Kim
Subjects: Botnet, deep learning, domain generation algorithm, embedding, subword segmentation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Attackers are known to utilize domain generation algorithms (DGAs) to generate domain names for command and control (C&C) servers and facilitate the distribution of uniform resource locators within malicious software. DGAs pose a significant threat to cybersecurity owing to their ability to dynamically generate unpredictable domain names. Extensive research is currently underway to detect the domain names created using DGAs. However, the high false positive rates when handling benign domain names in non-English languages pose a challenge. Thus, this study proposes a DGA detection method that effectively embeds non-English domain names to focus on Chinese domain names, which are referred to as domain names composed of Pinyin. The proposed method segments domain names into meaningful subwords for effective vector representation. Consequently, the FastText model learns the context information of the segmented subwords and embeds the domain name. Further, the deep learning-based detection model learns the vectorized domain names and determines whether a particular domain name is DGA-generated. We labeled the Chinese domain names among the benign domain names for our experiment. The experimental results show that the proposed method outperforms existing methods across all performance metrics on the entire test dataset. Notably, the proposed method minimizes the false positive rate, thereby enhancing detection reliability. In addition, it exhibits high performance, achieving a recall of 0.9873 and 0.9886 for Chinese and English domain names, respectively. This demonstrates that the proposed method consistently delivers high performance across various metrics and languages.
Published: 2024
Full Text: View/download PDF

3. BotDetector: a system for identifying DGA-based botnet with CNN-LSTM.

Author: Zang, Xiaodong, Cao, Jianbo, Zhang, Xinchang, Gong, Jian, and Li, Guiqing
Subjects: BOTNETS, REVERSE engineering, COMPUTER network security, MACHINE learning, ALGORITHMS, DEEP learning
Abstract: Botnets are one of the major threats to network security nowadays. To carry out malicious actions remotely, they heavily rely on Command and Control channels. DGA-based botnets use a domain generation algorithm to generate a significant number of domain names. By analyzing the linguistic distinctions between legitimate and DGA-based domain names, traditional machine learning schemes obtain great benefits. However, it is difficult to identify the ones based on wordlists or pseudo-random generated. Accordingly, this paper proposes an efficient CNN-LSTM-based detection model (BotDetector) that uses only a set of simple-to-compute, easy-to-compute character features. We evaluate our model with two open-source benchmark datasets (360 netlab, Bambenek) and real DNS traffic from the China Education and Research Network. Experimental results demonstrate that our algorithm improves by 1.6 % in terms of accuracy and F1-score and reduces the computation time by 9.4 % compared to other state-of-the-art alternatives. Remarkably, our work can identify botnet's covert communication channels that use domain names based on word lists or pseudo-random generation without any help of reverse engineering. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. A Survey of Machine Learning and Deep Learning Based DGA Detection Techniques

Author: Saeed, Amr M. H., Wang, Danghui, Alnedhari, Hamas A. M., Mei, Kuizhi, Wang, Jihe, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Qiu, Meikang, editor, Gai, Keke, editor, and Qiu, Han, editor
Published: 2022
Full Text: View/download PDF

5. KDTM: Multi-Stage Knowledge Distillation Transfer Model for Long-Tailed DGA Detection

Author: Baoyu Fan, Han Ma, Yue Liu, Xiaochen Yuan, and Wei Ke
Subjects: domain generation algorithm, long-tailed problem, transfer learning, knowledge distillation, data balanced review method, Mathematics, QA1-939
Abstract: As the most commonly used attack strategy by Botnets, the Domain Generation Algorithm (DGA) has strong invisibility and variability. Using deep learning models to detect different families of DGA domain names can improve the network defense ability against hackers. However, this task faces an extremely imbalanced sample size among different DGA categories, which leads to low classification accuracy for small sample categories and even classification failure for some categories. To address this issue, we introduce the long-tailed concept and augment the data of small sample categories by transferring pre-trained knowledge. Firstly, we propose the Data Balanced Review Method (DBRM) to reduce the sample size difference between the categories, thus a relatively balanced dataset for transfer learning is generated. Secondly, we propose the Knowledge Transfer Model (KTM) to enhance the knowledge of the small sample categories. KTM uses a multi-stage transfer to transfer weights from the big sample categories to the small sample categories. Furthermore, we propose the Knowledge Distillation Transfer Model (KDTM) to relieve the catastrophic forgetting problem caused by transfer learning, which adds knowledge distillation loss based on the KTM. The experimental results show that KDTM can significantly improve the classification performance of all categories, especially the small sample categories. It can achieve a state-of-the-art macro average F1 score of 84.5%. The robustness of the KDTM model is verified using three DGA datasets that follow the Pareto distributions.
Published: 2024
Full Text: View/download PDF

6. BadDGA: Backdoor Attack on LSTM-Based Domain Generation Algorithm Detector.

Author: Zhai, You, Yang, Liqun, Yang, Jian, He, Longtao, and Li, Zhoujun
Subjects: ARTIFICIAL neural networks, BOTNETS, DETECTORS, DEEP learning, ALGORITHMS, DATA scrubbing
Abstract: Due to the outstanding performance of deep neural networks (DNNs), many researchers have begun to transfer deep learning techniques to their fields. To detect algorithmically generated domains (AGDs) generated by domain generation algorithm (DGA) in botnets, a long short-term memory (LSTM)-based DGA detector has achieved excellent performance. However, the previous DNNs have found various inherent vulnerabilities, so cyberattackers can use these drawbacks to deceive DNNs, misleading DNNs into making wrong decisions. Backdoor attack as one of the popular attack strategies strike against DNNs has attracted widespread attention in recent years. In this paper, to cheat the LSTM-based DGA detector, we propose BadDGA, a backdoor attack against the LSTM-based DGA detector. Specifically, we offer four backdoor attack trigger construction methods: TLD-triggers, Ngram-triggers, Word-triggers, and IDN-triggers. Finally, we evaluate BadDGA on ten popular DGA datasets. The experimental results show that under the premise of 1‰ poisoning rate, our proposed backdoor attack can achieve a 100% attack success rate to verify the effectiveness of our method. Meanwhile, the model's utility on clean data is influenced slightly. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

7. A model to detect domain names generated by DGA malware.

Author: Divya, T, Amritha, P.P, and Viswanathan, Sangeetha
Subjects: MALWARE, RANDOM forest algorithms, COMPUTER networks, DECISION trees, LOGISTIC regression analysis, INTERNET domain naming system, POSE estimation (Computer vision)
Abstract: Command and control(C&C) servers are being more frequently used in cyberattacks in recent years. A malware-infected machine is controlled and directed by an attacker using a command-and-control server in order to steal data from the network. To hide their servers, attackers commonly employ a domain generation algorithm that generates domain names for them by concatenating words from word lists. Some of the algorithmically-generated domain names are used to connect to the C&C server. With the emergence of sophisticated domain generation algorithms, detecting such domains has become a challenge, which in turn poses a severe danger to computer networks. In this paper, we are proposing a concept called centrality, which is used as one of the features to analyze the words in the domain names generated by the domain generation algorithm malware. For classification, we are using Naïve Bayes, KNN, SVM, Decision Trees, Random Forest and logistic regression. Experimental results showed that Random Forest gave the highest classification accuracy rate of 88.64% and Naive Bayes gave the lowest accuracy of 44.32%. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

8. D3-SACNN: DGA Domain Detection With Self-Attention Convolutional Network

Author: Kejun Zhao, Wei Guo, Fenglin Qin, and Xinjun Wang
Subjects: Convolutional network, domain classification, domain generation algorithm, self-attention, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Botnets are currently one of the main cyber security threats. In order to enhance the concealment, botnets usually use Domain Generation Algorithm (DGA) to establish communication between bots and command and control servers. Character-based deep learning methods are widely researched in the classification of DGA domains to detect botnets and have achieved good results. But the pronounceable DGA domain detection is still a challenge, since the linguistic statistical characteristics of the pronounceable DGA domains and benign domains are very similar. We propose a multi-head self-attention convolutional network method for DGA domain classification task. We use a shallow convolutional neural network to extract hidden features of domain characters. The multi-head self-attention mechanism with different input values is used to effectively obtain the relationship between the characters and the extracted implicit features, which will help us more effectively distinguish between pronounceable DGA domains and benign domains. Experiments on public data show that our model can effectively detect various types of DGA domains. Especially for the pronounceable DGA domains, our method is significantly better than other detection methods.
Published: 2022
Full Text: View/download PDF

9. DeepDGA-MINet: Cost-Sensitive Deep Learning Based Framework for Handling Multiclass Imbalanced DGA Detection

Author: Vinayakumar, R., Soman, K. P., Poornachandran, Prabaharan, Gupta, Brij B., editor, Perez, Gregorio Martinez, editor, Agrawal, Dharma P., editor, and Gupta, Deepak, editor
Published: 2020
Full Text: View/download PDF

10. DGA botnet detection method based on capsule network and k-means routing.

Author: Liu, Xiaoyang and Liu, Jiamiao
Subjects: *CAPSULE neural networks, *BOTNETS, *ROUTING algorithms, *PERSONAL names, *K-means clustering, *DEEP learning
Abstract: For the current mainstream DGA domain name detection methods, scalars are almost used to represent numerical features, resulting in the loss of the spatial feature information of domain name characters. This paper proposes a sequence capsule network based on the k-means routing algorithm, LSTM-CapsNet, which only uses DGA domain name text information for detection. The model uses a bidirectional LSTM unit to extract basic features for the capsule network and uses the k-means algorithm to cluster vector features to implement routing functions. In order to verify the proposed LSTM-CapsNet model, data from two different sources are collected to ensure the reliability of the experiment, covering the DGA domain name dataset from the real network defined as Real-Dataset, and the DGA domain name obtained through the domain name generation algorithm is defined as Gen-Dataset. The current DGA domain name detection method of state-of-the-art proposed by researchers is compared and tested on two data sets. The experimental results show that the proposed model has achieved 99.17% and 97.75% of the F-score evaluation indicators in the DGA domain name recognition of the two datasets; at the same time, the recognition of the DGA domain name family has been very competitive. Compared with the existing DGA domain name family classification model, the F-score value of the proposed model exceeds 89% in Gen-Dataset multi-class recognition. This model not only improves the ability of DGA domain name recognition and DGA domain name family recognition but also has an outstanding ability to find real-time aspects in model testing. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

11. DGA domain detection and botnet prevention using Q-learning for POMDP

Author: Y. V. Bubnov and N. N. Ivanov
Subjects: domain generation algorithm, computer network security, recurrent neural network, partially observable markov decision process, q-learning, Electronics, TK7800-8360
Abstract: An effective method for preventing the operation of computer network nodes for organizing a botnet is proposed. A botnet is a collection of devices connected via the Internet for the purpose of organizing DDoS attacks, stealing data, sending spam and other malicious actions. The described method implies the detection of generated domain names in DNS queries using a neural network with parallel organization of convolutional and bidirectional recurrent layers. The effectiveness of the method is based on the assumption that generated domain names are used to create a botnet for merging. Experiments confirm that the proposed neural network is superior to the accuracy of existing counterparts on the UMUDGA dataset. The estimation of the quality of recognition of generated domain names using ROC analysis is calculated for a trained neural network. The article also formulates a model for controlling detectors using a partially observable Markov decisionmaking process to block infected nodes of a computer network. The search for the optimal policy for the formulated model by means of Q-learning of value agents is proposed. A comparative analysis of the average, minimum and maximum value of actions taken by agents in the process of interacting with the environment is carried out.
Published: 2021
Full Text: View/download PDF

12. Nemesis: Detecting Algorithmically Generated Domains with an LSTM Language Model

Author: Yuan, Dunsheng, Xiong, Ying, Zang, Tianning, Huang, Ji, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin (Sherman), Editorial Board Member, Stan, Mircea, Editorial Board Member, Xiaohua, Jia, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Wang, Xinheng, editor, Gao, Honghao, editor, Iqbal, Muddesar, editor, and Min, Geyong, editor
Published: 2019
Full Text: View/download PDF

13. RL-Gen: A Character-Level Text Generation Framework with Reinforcement Learning in Domain Generation Algorithm Case

Author: Cheng, Hua, Cai, Jing, Fang, Yiquan, Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gedeon, Tom, editor, Wong, Kok Wai, editor, and Lee, Minho, editor
Published: 2019
Full Text: View/download PDF

14. Lightweight malicious domain name detection model based on separable convolution

Author: Luhui YANG, Huiwen BAI, Guangjie LIU, and Yuewei DAI
Subjects: separable convolution, domain generation algorithm, deep learning, cyber security, Electronic computers. Computer science, QA75.5-76.95
Abstract: The application of artificial intelligence in the detection of malicious domain names needs to consider both accuracy and calculation speed,which can make it closer to the actual application.Based on the above considerations,a lightweight malicious domain name detection model based on separable convolution was proposed.The model uses a separable convolution structure.It first applies depthwise convolution on every input channel,and then performs pointwise convolution on all output channels.This can effectively reduce the parameters of convolution process without impacting the effectiveness of convolution feature extraction,and realize faster convolution process while keeping high accuracy.To improve the detection accuracy considering the imbalance of the number and difficulty of positive and negative samples,a focal loss function was introduced in the training process of the model.The proposed algorithm was compared with three typical deep-learning-based detection models on a public data set.Experimental results denote that the proposed algorithm achieves detection accuracy close to the state-of-the-art model,and can significantly improve model inference speed on CPU.
Published: 2020
Full Text: View/download PDF

15. Lightweight malicious domain name detection model based on separable convolution

Author: YANG Luhui, BAI Huiwen and LIU Guangjie, DAI Yuewei
Subjects: separable convolution, domain generation algorithm, deep learning, cyber security, Electronic computers. Computer science, QA75.5-76.95
Abstract: The application of artificial intelligence in the detection of malicious domain names needs to consider both accuracy and calculation speed, which can make it closer to the actual application. Based on the above considerations, a lightweight malicious domain name detection model based on separable convolution was proposed.The model uses a separable convolution structure. It first applies depthwise convolution on every input channel, and then performs pointwise convolution on all output channels. This can effectively reduce the parameters of convolution process without impacting the effectiveness of convolution feature extraction, and realize faster convolution process while keeping high accuracy. To improve the detection accuracy considering the imbalance of the number and difficulty of positive and negative samples, a focal loss function was introduced in the training process of the model. The proposed algorithm was compared with three typical deep-learning-based detection models on a public data set. Experimental results denote that the proposed algorithm achieves detection accuracy close to the state-of-the-art model, and can significantly improve model inference speed on CPU.
Published: 2020
Full Text: View/download PDF

16. Applied Machine Learning in Recognition of DGA Domain Names.

Author: Štampar, Miroslav and Fertalj, Krešimir
Abstract: Recognition of domain names generated by domain generation algorithms (DGAs) is the essential part of malware detection by inspection of network traffic. Besides basic heuristics (HE) and limited detection based on blacklists, the most promising course seems to be machine learning (ML). There is a lack of studies that extensively compare different ML models in the field of DGA binary classification, including both conventional and deep learning (DL) representatives. Also, those few that exist are either focused on a small set of models, use a poor set of features in ML models or fail to secure unbiased independence between training and evaluation samples. To overcome these limitations, we engineered a robust feature set, and accordingly trained and evaluated 14 ML, 9 DL, and 2 comparative models on two independent datasets. Results show that if ML features are properly engineered, there is a marginal difference in overall score between top ML and DL representatives. This paper represents the first attempt to neutrally compare the performance of many different models for the recognition of DGA domain names, where the best models perform as well as the top representatives from the literature. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

17. A DGA domain names detection modeling method based on integrating an attention mechanism and deep neural network

Author: Fangli Ren, Zhengwei Jiang, Xuren Wang, and Jian Liu
Subjects: Domain generation algorithm, Malware, Attention mechanism, Deep learning, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
Abstract: Abstract Command and control (C2) servers are used by attackers to operate communications. To perform attacks, attackers usually employee the Domain Generation Algorithm (DGA), with which to confirm rendezvous points to their C2 servers by generating various network locations. The detection of DGA domain names is one of the important technologies for command and control communication detection. Considering the randomness of the DGA domain names, recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names. However, these methods are insufficient to handle wordlist-based DGA threats, which generate domain names by randomly concatenating dictionary words according to a special set of rules. In this paper, we proposed a a deep learning framework ATT-CNN-BiLSTM for identifying and detecting DGA domains to alleviate the threat. Firstly, the Convolutional Neural Network (CNN) and bidirectional Long Short-Term Memory (BiLSTM) neural network layer was used to extract the features of the domain sequences information; secondly, the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names. Finally, the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification. Our extensive experimental results demonstrate the effectiveness of the proposed model, both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones. To be precise,we got a F1 score of 98.79% for the detection and macro average precision and recall of 83% for the classification task of DGA domain names.
Published: 2020
Full Text: View/download PDF

18. A Superficial Analysis Approach for Identifying Malicious Domain Names Generated by DGA Malware

Author: Akihiro Satoh, Yutaka Fukuda, Toyohiro Hayashi, and Gen Kitagata
Subjects: Domain generation algorithm, domain name system, malware, network security, Telecommunication, TK5101-6720, Transportation and communications, HE1-9990
Abstract: Some of the most serious security threats facing computer networks involve malware. To prevent malware-related damage, administrators must swiftly identify and remove the infected machines that may reside in their networks. However, many malware families have domain generation algorithms (DGAs) to avoid detection. A DGA is a technique in which the domain name is changed frequently to hide the callback communication from the infected machine to the command-and-control server. In this article, we propose an approach for estimating the randomness of domain names by superficially analyzing their character strings. This approach is based on the following observations: human-generated benign domain names tend to reflect the intent of their domain registrants, such as an organization, product, or content. In contrast, dynamically generated malicious domain names consist of meaningless character strings because conflicts with already registered domain names must be avoided; hence, there are discernible differences in the strings of dynamically generated and human-generated domain names. Notably, our approach does not require any prior knowledge about DGAs. Our evaluation indicates that the proposed approach is capable of achieving recall and precision as high as 0.9960 and 0.9029, respectively, when used with labeled datasets. Additionally, this approach has proven to be highly effective for datasets collected via a campus network. Thus, these results suggest that malware-infected machines can be swiftly identified and removed from networks using DNS queries for detected malicious domains as triggers.
Published: 2020
Full Text: View/download PDF

19. Detecting Stealthy Domain Generation Algorithms Using Heterogeneous Deep Neural Network Framework

Author: Luhui Yang, Guangjie Liu, Yuewei Dai, Jinwei Wang, and Jiangtao Zhai
Subjects: Convolutional neural network, cyber security, domain generation algorithm, deep learning, long short term memory, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Distinguishing malicious domain names generated by various domain generation algorithms (DGA) is critical for defending a network against sophisticated network attacks. In recent years, stealthy domain generation algorithms (SDGA) have been proposed and revealed significantly stronger stealthiness comparing to the traditional character-based DGA. Existing state-of-the-art detection schemes are not effective enough for detecting SDGA. In this paper, we exploit the character-level characteristics of the SDGA domain names and propose a heterogeneous deep neural network framework (HDNN) for detecting SDGA. HDNN employs a proposed improved parallel CNN (IPCNN) architecture with multi-sizes of convolution kernel for extracting multi-scale local features from a domain name. The framework also contains a proposed self-attention based bidirectional long short term memory (SA-Bi-LSTM) architecture which can extract the bidirectional global features with attention mechanism from a domain name. Besides that, the focal loss function is introduced to mitigate the imbalance of the sample quantity in the training phase. The benchmark experiments are carried out based on the database composed of the collected benign domain names, real-world DGA and SDGA ones. Compared to the 6 influential deep-learning-based DGA detection schemes, the proposed scheme has achieved state-of-the-art detection results on SDGAs, and also achieved state-of-the-art results on binary and multiclass classification for traditional DGAs.
Published: 2020
Full Text: View/download PDF

20. A Machine Learning Framework for Studying Domain Generation Algorithm (DGA)-Based Malware

Author: Chin, Tommy, Xiong, Kaiqi, Hu, Chengbin, Li, Yi, Akan, Ozgur, Series Editor, Bellavista, Paolo, Series Editor, Cao, Jiannong, Series Editor, Coulson, Geoffrey, Series Editor, Dressler, Falko, Series Editor, Ferrari, Domenico, Series Editor, Gerla, Mario, Series Editor, Kobayashi, Hisashi, Series Editor, Palazzo, Sergio, Series Editor, Sahni, Sartaj, Series Editor, Shen, Xuemin (Sherman), Series Editor, Stan, Mircea, Series Editor, Xiaohua, Jia, Series Editor, Zomaya, Albert Y., Series Editor, Beyah, Raheem, editor, Chang, Bing, editor, Li, Yingjiu, editor, and Zhu, Sencun, editor
Published: 2018
Full Text: View/download PDF

21. LagProber: Detecting DGA-Based Malware by Using Query Time Lag of Non-existent Domains

Author: Luo, Xi, Wang, Liming, Xu, Zhen, An, Wei, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Naccache, David, editor, Xu, Shouhuai, editor, Qing, Sihan, editor, Samarati, Pierangela, editor, Blanc, Gregory, editor, Lu, Rongxing, editor, Zhang, Zonghua, editor, and Meddahi, Ahmed, editor
Published: 2018
Full Text: View/download PDF

22. Dictionary Extraction and Detection of Algorithmically Generated Domain Names in Passive DNS Traffic

Author: Pereira, Mayana, Coleman, Shaun, Yu, Bin, DeCock, Martine, Nascimento, Anderson, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Bailey, Michael, editor, Holz, Thorsten, editor, Stamatogiannakis, Manolis, editor, and Ioannidis, Sotiris, editor
Published: 2018
Full Text: View/download PDF

23. 基于迁移肀习的小样本DGA恶意域名柃测方法.

Author: 顾兆军, 杨文瑾, and 周景贤
Abstract: Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2021
Full Text: View/download PDF

24. A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection

Author: Yi Li, Kaiqi Xiong, Tommy Chin, and Chengbin Hu
Subjects: Malware, domain generation algorithm, machine learning, security, networking, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Attackers usually use a command and control (C2) server to manipulate the communication. In order to perform an attack, threat actors often employ a domain generation algorithm (DGA), which can allow malware to communicate with C2 by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from the real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first classify the DGA domains apart from normal domains and then use the clustering method to identify the algorithms that generate those DGA domains. In the prediction model, a time-series model is constructed to predict incoming domain features based on the hidden Markov model (HMM). Furthermore, we build a deep neural network (DNN) model to enhance the proposed machine learning framework by handling the huge dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and 97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM prediction in the framework.
Published: 2019
Full Text: View/download PDF

25. Detecting malicious domain names based on AGD

Author: Xiaodong ZANG, Jian GONG, and Xiaoyan HU
Subjects: network security monitoring, domain generation algorithm, command and control server, algorithmically generated domain, Telecommunication, TK5101-6720
Abstract: A new malicious domain name detection algorithm was proposed.More specifically,the domain names in a cluster belonging to a DGA (domain generation algorithm) or its variants was identified firstly by using cluster correlation.Then,these AGD (algorithmically generated domain) names’ TTL,the distribution and attribution of their resolved IP addresses,their whois features and their historical information were extracted and further applied SVM algorithm to identify the malicious domain names.Experimental results demonstrate that it achieves an accuracy rate of 98.4% and the false positive of 0.9% without any client query records.
Published: 2018
Full Text: View/download PDF

26. Scalable detection of botnets based on DGA: Efficient feature discovery process in machine learning techniques.

Author: Zago, Mattia, Gil Pérez, Manuel, and Martínez Pérez, Gregorio
Subjects: *BOTNETS, *MACHINE learning, *RESEARCH methodology, *NATURAL language processing
Abstract: Botnets are evolving, and their covert modus operandi, based on cloud technologies such as the virtualisation and the dynamic fast-flux addressing, has been proved challenging for classic intrusion detection systems and even the so-called next-generation firewalls. Moreover, dynamic addressing has been spotted in the wild in combination with pseudo-random domain names generation algorithm (DGA), ultimately leading to an extremely accurate and effective disguise technique. Although these concealing methods have been exposed and analysed to great extent in the past decade, the literature lacks some important conclusions and common-ground knowledge, especially when it comes to Machine Learning (ML) solutions. This research horizontally navigates the state of the art aiming to polish the feature discovery process, which is the single most time-consuming part of any ML approach. Results show that only a minor fraction of the defined features are indeed practical and informative, especially when considering 0-day botnet identification. The contributions described in this article will ease the detection process, ultimately enabling improved and more scalable solutions for DGA-based botnets detection. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

27. A DGA domain names detection modeling method based on integrating an attention mechanism and deep neural network.

Author: Ren, Fangli, Jiang, Zhengwei, Wang, Xuren, and Liu, Jian
Subjects: ARTIFICIAL neural networks, COMMAND & control systems, MACHINE learning, DEEP learning, SHORT-term memory
Abstract: Command and control (C2) servers are used by attackers to operate communications. To perform attacks, attackers usually employee the Domain Generation Algorithm (DGA), with which to confirm rendezvous points to their C2 servers by generating various network locations. The detection of DGA domain names is one of the important technologies for command and control communication detection. Considering the randomness of the DGA domain names, recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names. However, these methods are insufficient to handle wordlist-based DGA threats, which generate domain names by randomly concatenating dictionary words according to a special set of rules. In this paper, we proposed a a deep learning framework ATT-CNN-BiLSTM for identifying and detecting DGA domains to alleviate the threat. Firstly, the Convolutional Neural Network (CNN) and bidirectional Long Short-Term Memory (BiLSTM) neural network layer was used to extract the features of the domain sequences information; secondly, the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names. Finally, the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification. Our extensive experimental results demonstrate the effectiveness of the proposed model, both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones. To be precise,we got a F1 score of 98.79% for the detection and macro average precision and recall of 83% for the classification task of DGA domain names. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

28. Hydras and IPFS: a decentralised playground for malware.

Author: Patsakis, Constantinos and Casino, Fran
Subjects: *PLAYGROUNDS, *CRYPTOGRAPHY, *ALGORITHMS, *BOTNETS, *MALWARE, *MALWARE prevention
Abstract: Modern malware can take various forms and has reached a very high level of sophistication in terms of its penetration, persistence, communication and hiding capabilities. The use of cryptography, and of covert communication channels over public and widely used protocols and services, is becoming a norm. In this work, we start by introducing Resource Identifier Generation Algorithms. These are an extension of a well-known mechanism called domain generation algorithms, which are frequently employed by cybercriminals for bot management and communication. Our extension allows, beyond DNS, the use of other protocols. More concretely, we showcase the exploitation of the InterPlanetary File System (IPFS). This is a solution for the "permanent web", which enjoys a steadily growing community interest and adoption. The IPFS is, in addition, one of the most prominent solutions for blockchain storage. We go beyond the straightforward case of using the IPFS for hosting malicious content and explore ways in which a botmaster could employ it, to manage her bots, validating our findings experimentally. Finally, we discuss the advantages of our approach for malware authors, its efficacy and highlight its extensibility for other distributed storage services. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

29. Enabling Network Security Through Active DNS Datasets

Author: Kountouras, Athanasios, Kintis, Panagiotis, Lever, Chaz, Chen, Yizheng, Nadji, Yacin, Dagon, David, Antonakakis, Manos, Joffe, Rodney, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Monrose, Fabian, editor, Dacier, Marc, editor, Blanc, Gregory, editor, and Garcia-Alfaro, Joaquin, editor
Published: 2016
Full Text: View/download PDF

30. Evolving Smart URL Filter in a Zone-Based Policy Firewall for Detecting Algorithmically Generated Malicious Domains

Author: Demertzis, Konstantinos, Iliadis, Lazaros, Goebel, Randy, Series editor, Tanaka, Yuzuru, Series editor, Wahlster, Wolfgang, Series editor, Gammerman, Alexander, editor, Vovk, Vladimir, editor, and Papadopoulos, Harris, editor
Published: 2015
Full Text: View/download PDF

31. Detecting the DGA-Based Malicious Domain Names

Author: Zhang, Ying, Zhang, Yongzheng, Xiao, Jun, Yuan, Yuyu, editor, Wu, Xu, editor, and Lu, Yueming, editor
Published: 2014
Full Text: View/download PDF

32. Applied machine learning in recognition of DGA domain names

Author: Miroslav Štampar and Krešimir Fertalj
Subjects: Domain generation algorithm, domain generation algorithm, binary classification, supervised machine learning, deep learning, blind evaluation, General Computer Science, Computer science, business.industry, Deep learning, Machine learning, computer.software_genre, Field (computer science), Domain (software engineering), Set (abstract data type), Binary classification, Malware, Artificial intelligence, business, Heuristics, computer
Abstract: Recognition of domain names generated by domain generation algorithms (DGAs) is the essential part of malware detection by inspection of network traffic. Besides basic heuristics (HE) and limited detection based on blacklists, the most promising course seems to be machine learning (ML). There is a lack of studies that extensively compare different ML models in the field of DGA binary classification, including both conventional and deep learning (DL) representatives. Also, those few that exist are either focused on a small set of models, use a poor set of features in ML models or fail to secure unbiased independence between training and evaluation samples. To overcome these limitations, we engineered a robust feature set, and accordingly trained and evaluated 14 ML, 9 DL, and 2 comparative models on two independent datasets. Results show that if ML features are properly engineered, there is a marginal difference in overall score between top ML and DL representatives. This paper represents the first attempt to neutrally compare the performance of many different models for the recognition of DGA domain names, where the best models perform as well as the top representatives from the literature.
Published: 2022

33. BadDGA: Backdoor Attack on LSTM-Based Domain Generation Algorithm Detector

Author: You Zhai, Liqun Yang, Jian Yang, Longtao He, and Zhoujun Li
Subjects: backdoor attack, Computer Networks and Communications, Hardware and Architecture, Control and Systems Engineering, Signal Processing, domain generation algorithm, Electrical and Electronic Engineering, botnet
Abstract: Due to the outstanding performance of deep neural networks (DNNs), many researchers have begun to transfer deep learning techniques to their fields. To detect algorithmically generated domains (AGDs) generated by domain generation algorithm (DGA) in botnets, a long short-term memory (LSTM)-based DGA detector has achieved excellent performance. However, the previous DNNs have found various inherent vulnerabilities, so cyberattackers can use these drawbacks to deceive DNNs, misleading DNNs into making wrong decisions. Backdoor attack as one of the popular attack strategies strike against DNNs has attracted widespread attention in recent years. In this paper, to cheat the LSTM-based DGA detector, we propose BadDGA, a backdoor attack against the LSTM-based DGA detector. Specifically, we offer four backdoor attack trigger construction methods: TLD-triggers, Ngram-triggers, Word-triggers, and IDN-triggers. Finally, we evaluate BadDGA on ten popular DGA datasets. The experimental results show that under the premise of 1‰ poisoning rate, our proposed backdoor attack can achieve a 100% attack success rate to verify the effectiveness of our method. Meanwhile, the model’s utility on clean data is influenced slightly.
Published: 2023
Full Text: View/download PDF

34. Detection of Algorithmically Generated Domain Names Using the Recurrent Convolutional Neural Network with Spatial Pyramid Pooling

Author: Zhanghui Liu, Yudong Zhang, Yuzhong Chen, Xinwen Fan, and Chen Dong
Subjects: domain generation algorithm, algorithmically generated domain name, SMOTE, recurrent convolutional neural network, spatial pyramid pooling, Science, Astrophysics, QB460-466, Physics, QC1-999
Abstract: Domain generation algorithms (DGAs) use specific parameters as random seeds to generate a large number of random domain names to prevent malicious domain name detection. This greatly increases the difficulty of detecting and defending against botnets and malware. Traditional models for detecting algorithmically generated domain names generally rely on manually extracting statistical characteristics from the domain names or network traffic and then employing classifiers to distinguish the algorithmically generated domain names. These models always require labor intensive manual feature engineering. In contrast, most state-of-the-art models based on deep neural networks are sensitive to imbalance in the sample distribution and cannot fully exploit the discriminative class features in domain names or network traffic, leading to decreased detection accuracy. To address these issues, we employ the borderline synthetic minority over-sampling algorithm (SMOTE) to improve sample balance. We also propose a recurrent convolutional neural network with spatial pyramid pooling (RCNN-SPP) to extract discriminative and distinctive class features. The recurrent convolutional neural network combines a convolutional neural network (CNN) and a bi-directional long short-term memory network (Bi-LSTM) to extract both the semantic and contextual information from domain names. We then employ the spatial pyramid pooling strategy to refine the contextual representation by capturing multi-scale contextual information from domain names. The experimental results from different domain name datasets demonstrate that our model can achieve 92.36% accuracy, an 89.55% recall rate, a 90.46% F1-score, and 95.39% AUC in identifying DGA and legitimate domain names, and it can achieve 92.45% accuracy rate, a 90.12% recall rate, a 90.86% F1-score, and 96.59% AUC in multi-classification problems. It achieves significant improvement over existing models in terms of accuracy and robustness.
Published: 2020
Full Text: View/download PDF

35. 基于 AGD 的恶意域名检测.

Author: 臧小东, 龚俭, and 胡晓艳
Abstract: Copyright of Journal on Communication / Tongxin Xuebao is the property of Journal on Communications Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2018
Full Text: View/download PDF

36. A Lightweight Hybrid Detection Method for Botnet

Author: Wei Ma, Jiguang Wang, Qianyun Chen, and Xing Wang
Subjects: Domain generation algorithm, business.industry, Network packet, Computer science, Domain Name System, ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS, Botnet, Domain (software engineering), Software, Default gateway, Signal Processing, The Internet, Electrical and Electronic Engineering, business, Computer network
Abstract: Botnet is a serious threat for the Internet and it has created great damage to the Internet. How to detect botnet has become an ongoing endeavor research. Series of methods have been discussed in recent research. However, one of the remaining challenges is that the high computational overhead. In this paper, a lightweight hybrid botnet detection method is proposed. Considering the features in the botnet data packets and the characteristic of employing DGA (Domain Generation Algorithm) domain names to connect to the botnet, two sensors are designed and deployed individually and parallelly. Signature detection is used on the gateway sensor to dig out known bot software and deep learning based techniques are used on the DNS (Domain Name Server) server sensor to find DGA domain names. With this method, the computational overhead would be shared by the two sensors and experiments are conducted and the results indicate that the method is effective in detecting botnet
Published: 2021

37. DGA-based botnet detection toward imbalanced multiclass learning

Author: Yijing Chen, Guolin Shao, Guozhu Wen, Xingshu Chen, and Bo Pang
Subjects: Multidisciplinary, Domain generation algorithm, Computer science, Sample size determination, Robustness (computer science), Resampling, Botnet, Sample (statistics), Function (mathematics), Data mining, computer.software_genre, computer, Domain (software engineering)
Abstract: Botnets based on the Domain Generation Algorithm (DGA) mechanism pose great challenges to the main current detection methods because of their strong concealment and robustness. However, the complexity of the DGA family and the imbalance of samples continue to impede research on DGA detection. In the existing work, the sample size of each DGA family is regarded as the most important determinant of the resampling proportion; thus, differences in the characteristics of various samples are ignored, and the optimal resampling effect is not achieved. In this paper, a Long Short-Term Memory-based Property and Quantity Dependent Optimization (LSTM.PQDO) method is proposed. This method takes advantage of LSTM to automatically mine the comprehensive features of DGA domain names. It iterates the resampling proportion with the optimal solution based on a comprehensive consideration of the original number and characteristics of the samples to heuristically search for a better solution around the initial solution in the right direction; thus, dynamic optimization of the resampling proportion is realized. The experimental results show that the LSTM.PQDO method can achieve better performance compared with existing models to overcome the difficulties of unbalanced datasets; moreover, it can function as a reference for sample resampling tasks in similar scenarios.
Published: 2021

38. The Ecosystem of Detection and Blocklisting of Domain Generation

Author: Leigh Metcalf and Jonathan M. Spring
Subjects: Domain generation algorithm, Computer Networks and Communications, Computer science, 0211 other engineering and technologies, 02 engineering and technology, computer.software_genre, Machine learning, Domain (software engineering), 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Baseline (configuration management), Interpretability, 021110 strategic, defence & security studies, Network defense, business.industry, Computer Science Applications, Core (game theory), Open source, Hardware and Architecture, Malware, Artificial intelligence, business, Safety Research, computer, Software, Information Systems
Abstract: Malware authors use domain generation algorithms to establish more reliable communication methods that can avoid reactive defender blocklisting techniques. Network defense has sought to supplement blocklists with methods for detecting machine-generated domains. We present a repeatable evaluation and comparison of the available open source detection methods. We designed our evaluation with multiple interrelated aspects, to improve both interpretability and realism. In addition to evaluating detection methods, we assess the impact of the domain generation ecosystem on prior results about the nature of blocklists and how they are maintained. The results of the evaluation of open source detection methods finds all methods are inadequate for practical use. The results of the blocklist impact study finds that generated domains decrease the overlap among blocklists; however, while the effect is large in relative terms, the baseline is so small that the core conclusions of the prior work are sustained. Namely, that blocklist construction is very targeted, context-specific, and as a result blocklists do no overlap much. We recommend that Domain Generation Algorithm detection should also be similarly narrowly targeted to specific algorithms and specific malware families, rather than attempting to create general-purpose detection for machine-generated domains.
Published: 2021

39. Detecting Word-Based Algorithmically Generated Domains Using Semantic Analysis

Author: Luhui Yang, Jiangtao Zhai, Weiwei Liu, Xiaopeng Ji, Huiwen Bai, Guangjie Liu, and Yuewei Dai
Subjects: network attack, domain generation algorithm, DGA detection, semantic analysis, ensemble classifier, Mathematics, QA1-939
Abstract: In highly sophisticated network attacks, command-and-control (C&C) servers always use domain generation algorithms (DGAs) to dynamically produce several candidate domains instead of static hard-coded lists of IP addresses or domain names. Distinguishing the domains generated by DGAs from the legitimate ones is critical for finding out the existence of malware or further locating the hidden attackers. The word-based DGAs disclosed in recent network attack events have shown significantly stronger stealthiness when compared with traditional character-based DGAs. In word-based DGAs, two or more words are randomly chosen from one or more specific dictionaries to form a dynamic domain, these regularly generated domains aim to mimic the characteristics of a legitimate domain. Existing DGA detection schemes, including the state-of-the-art one based on deep learning, still cannot find out these domains accurately while maintaining an acceptable false alarm rate. In this study, we exploit the inter-word and inter-domain correlations using semantic analysis approaches, word embedding and the part-of-speech are taken into consideration. Next, we propose a detection framework for word-based DGAs by incorporating the frequency distribution of the words and that of part-of-speech into the design of the feature set. Using an ensemble classifier constructed from Naive Bayes, Extra-Trees, and Logistic Regression, we benchmark the proposed scheme with malicious and legitimate domain samples extracted from public datasets. The experimental results show that the proposed scheme can achieve significantly higher detection accuracy for word-based DGAs when compared with three state-of-the-art DGA detection schemes.
Published: 2019
Full Text: View/download PDF

40. ConnSpoiler: Disrupting C&C Communication of IoT-Based Botnet Through Fast Detection of Anomalous Domain Queries

Author: Lihua Yin, Hui Lu, Liming Wang, Chunsheng Zhu, Zhen Xu, and Xi Luo
Subjects: Domain generation algorithm, business.industry, Computer science, ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS, 020208 electrical & electronic engineering, Botnet, Denial-of-service attack, 02 engineering and technology, Computer Science Applications, Domain (software engineering), Control and Systems Engineering, Server, 0202 electrical engineering, electronic engineering, information engineering, Electrical and Electronic Engineering, business, Internet of Things, Information Systems, Computer network
Abstract: The development of Internet of Things (IoT) dramatically facilitates the integration of computing systems with the physical world. However, as IoT devices are more easy to compromise than desktop computers, cybercriminals have founded IoT-based botnets to launch Distributed Denial of Service (DDoS) attacks with unprecedented traffic volume. To mitigate the damages associated with these attacks, the detection of IoT-based botnet has to preempt the command and control (C&C) communication to prevent the delivery of the attack codes. Motivated by the extensively implementation of domain generation algorithm in botnets, in this article, we propose ConnSpoiler, a lightweight system that detects IoT-based botnets by identifying the stream of algorithmically generated domains (AGDs) in a fast way. ConnSpoiler only needs negligible system resources to take effect and thus can execute well on the resource-restraint IoT devices. By outfitting a powerful statistical algorithm, i.e., threshold random walk, ConnSpoiler has a high probability (about 94%) of detecting infection before the compromised devices connect C&C servers, which can help to prevent the succeeding attacks. Moreover, ConnSpoiler only requires the benign domains to take effect and therefore does not need extra effort to label malicious samples for training phase. We evaluate ConnSpoiler based on real-world DNS traffics collected from two different large ISP networks and show that it accurately identifies devices that are compromised by unknown botnets.
Published: 2020

41. A Superficial Analysis Approach for Identifying Malicious Domain Names Generated by DGA Malware

Author: Gen Kitagata, Yutaka Fukuda, Toyohiro Hayashi, and Akihiro Satoh
Subjects: Information retrieval, Domain generation algorithm, Network security, business.industry, Computer science, malware, Domain Name System, computer.software_genre, lcsh:HE1-9990, Domain (software engineering), lcsh:Telecommunication, Campus network, lcsh:TK5101-6720, network security, Callback, Malware, lcsh:Transportation and communications, business, Precision and recall, computer, domain name system
Abstract: Some of the most serious security threats facing computer networks involve malware. To prevent malware-related damage, administrators must swiftly identify and remove the infected machines that may reside in their networks. However, many malware families have domain generation algorithms (DGAs) to avoid detection. A DGA is a technique in which the domain name is changed frequently to hide the callback communication from the infected machine to the command-and-control server. In this article, we propose an approach for estimating the randomness of domain names by superficially analyzing their character strings. This approach is based on the following observations: human-generated benign domain names tend to reflect the intent of their domain registrants, such as an organization, product, or content. In contrast, dynamically generated malicious domain names consist of meaningless character strings because conflicts with already registered domain names must be avoided; hence, there are discernible differences in the strings of dynamically generated and human-generated domain names. Notably, our approach does not require any prior knowledge about DGAs. Our evaluation indicates that the proposed approach is capable of achieving recall and precision as high as 0.9960 and 0.9029, respectively, when used with labeled datasets. Additionally, this approach has proven to be highly effective for datasets collected via a campus network. Thus, these results suggest that malware-infected machines can be swiftly identified and removed from networks using DNS queries for detected malicious domains as triggers.
Published: 2020

42. Detecting Stealthy Domain Generation Algorithms Using Heterogeneous Deep Neural Network Framework

Author: Guangjie Liu, Jinwei Wang, Jiangtao Zhai, Yuewei Dai, and Luhui Yang
Subjects: Scheme (programming language), General Computer Science, Exploit, Computer science, long short term memory, Binary number, Convolutional neural network, 02 engineering and technology, Domain (software engineering), Multiclass classification, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, computer.programming_language, Artificial neural network, cyber security, General Engineering, deep learning, 020206 networking & telecommunications, Function (mathematics), Benchmark (computing), domain generation algorithm, 020201 artificial intelligence & image processing, lcsh:Electrical engineering. Electronics. Nuclear engineering, lcsh:TK1-9971, computer, Algorithm
Abstract: Distinguishing malicious domain names generated by various domain generation algorithms (DGA) is critical for defending a network against sophisticated network attacks. In recent years, stealthy domain generation algorithms (SDGA) have been proposed and revealed significantly stronger stealthiness comparing to the traditional character-based DGA. Existing state-of-the-art detection schemes are not effective enough for detecting SDGA. In this paper, we exploit the character-level characteristics of the SDGA domain names and propose a heterogeneous deep neural network framework (HDNN) for detecting SDGA. HDNN employs a proposed improved parallel CNN (IPCNN) architecture with multi-sizes of convolution kernel for extracting multi-scale local features from a domain name. The framework also contains a proposed self-attention based bidirectional long short term memory (SA-Bi-LSTM) architecture which can extract the bidirectional global features with attention mechanism from a domain name. Besides that, the focal loss function is introduced to mitigate the imbalance of the sample quantity in the training phase. The benchmark experiments are carried out based on the database composed of the collected benign domain names, real-world DGA and SDGA ones. Compared to the 6 influential deep-learning-based DGA detection schemes, the proposed scheme has achieved state-of-the-art detection results on SDGAs, and also achieved state-of-the-art results on binary and multiclass classification for traditional DGAs.
Published: 2020

43. Botnet & Botnet in the Browser

Author: Mehlführer, Max
Subjects: DNS flux, Botnet, Browser, MarioNet, Bot, Service worker, Domain generation algorithm, Bot herder, HTML5, Web worker, Zeus, Zeus Gameover
Abstract: Diese Arbeit untersucht Botnetze und Botnetze im Browser. Mithilfe einer literarischen Recherche wird die Frage, wie Botnetze im Browser funktionieren verglichen mit „herkömmlichen“ Botnetzen und wie sie eine Gefahr für das Internet darstellen, beantwortet. Durch Features von HTML5 wie Service Worker und Web Worker macht MarioNet den Browser zu einem Bot, nur durch das Besuchen einer Webseite. Dadurch ist es möglich die Ressourcen des Computers, auf dem der Browser läuft, für bösartige Zwecke zu verwenden, ohne der Einwilligung des Eigentümers. In dieser Arbeit wird MarioNet mit Zeus Gameover verglichen basierend auf Verbreitungstechnik, Topologie, Kommunikationsprotokoll, und welche Methoden verwendet werden, um unentdeckt zu bleiben. Die Ergebnisse dieser Arbeit zeigen, dass das Benutzen des Browsers als Bot, mithilfe von Service Workern, eine größere Gefahr darstellt als „herkömmliche“ Botnetze. This thesis examines botnets and botnets in the browser. Furthermore, the question, how do botnets in the browser work and oppose a threat to the Internet compared to "conventional" botnet is answered using literature research. MarioNet turns a browser into a bot only by visiting a compromised website using features like service worker and web worker. Therefore, it is possible to use the resources of a computer, running the web browser for malicious purposes without the owner’s consent. In this thesis a comparison between MarioNet and Zeus Gameover is provided, based on their propagation techniques, topology, communication protocols and evasion techniques. The results show, that using the browser as a bot with the help of service worker is a more significant risk than “conventional” botnets.
Published: 2022

44. Real-Time Detection of Dictionary DGA Network Traffic Using Deep Learning

Author: Highnam, Kate, Puzio, Domenic, Luo, Song, and Jennings, Nicholas R.
Published: 2021
Full Text: View/download PDF

45. Convolutional, adversarial and random forest-based DGA detection : Comparative study for DGA detection with different machine learning algorithms

Author: Brandt, Carl-Simon, Kleivard, Jonathan, Turesson, Andreas, Brandt, Carl-Simon, Kleivard, Jonathan, and Turesson, Andreas
Abstract: Malware is becoming more intelligent as static methods for blocking communication with Command and Control (C&C) server are becoming obsolete. Domain Generation Algorithms (DGAs) are a common evasion technique that generates pseudo-random domain names to communicate with C&C servers in a difficult way to detect using handcrafted methods. Trying to detect DGAs by looking at the domain name is a broad and efficient approach to detect malware-infected hosts. This gives us the possibility of detecting a wider assortment of malware compared to other techniques, even without knowledge of the malware’s existence. Our study compared the effectiveness of three different machine learning classifiers: Convolutional Neural Network (CNN), Generative Adversarial Network (GAN) and Random Forest (RF) when recognizing patterns and identifying these pseudo-random domains. The result indicates that CNN differed significantly from GAN and RF. It achieved 97.46% accuracy in the final evaluation, while RF achieved 93.89% and GAN achieved 60.39%. In the future, network traffic (efficiency) could be a key component to examine, as productivity may be harmed if the networkis over burdened by domain identification using machine learning algorithms.
Published: 2021

46. Detection method of domain names generated by DGAs based on semantic representation and deep neural network

Author: Congyuan Xu, Xin Du, and Jizhong Shen
Subjects: Domain generation algorithm, General Computer Science, Artificial neural network, Computer science, business.industry, Botnet, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Domain (software engineering), Range (mathematics), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Semantic representation, Artificial intelligence, business, Law
Abstract: Botnets have become one of the main threats to cyberspace security currently. More and more bots utilize the domain generation algorithm (DGA) to generate malicious domain names to communicate with Command & Control (CC it only needs to input the domain name itself and can automatically estimate the probability that the domain name was generated by DGAs. Experiments on real-world data show that the proposed method can effectively detect domain names generated by DGAs with 98.69% average detection rate and 0.9829 average F-measure, and significantly outperformed the state-of-art methods in detecting pronounceable and wordlist-based DGA domain names with more than 93.89% detection rate. Therefore, the proposed detection method is robust and has a wide range of adaptability in detecting various types of domain names generated by DGAs.
Published: 2019

47. Detection of algorithmically generated malicious domain names using masked N-grams

Author: Jose Selvi, Ricardo J. Rodríguez, and Emilio Soria-Olivas
Subjects: 0209 industrial biotechnology, Domain generation algorithm, Computer science, General Engineering, 02 engineering and technology, computer.software_genre, Blacklist, Computer Science Applications, Random forest, Domain (software engineering), 020901 industrial engineering & automation, Artificial Intelligence, Server, 0202 electrical engineering, electronic engineering, information engineering, Malware, 020201 artificial intelligence & image processing, Data mining, computer, Host (network), Block (data storage)
Abstract: Malware detection is a challenge that has increased in complexity in the last few years. A widely adopted strategy is to detect malware by means of analyzing network traffic, capturing the communications with their command and control (C&C) servers. However, some malware families have shifted to a stealthier communication strategy, since anti-malware companies maintain blacklists of known malicious locations. Instead of using static IP addresses or domain names, they algorithmically generate domain names that may host their C&C servers. Hence, blacklist approaches become ineffective since the number of domain names to block is large and varies from time to time. In this paper, we introduce a machine learning approach using Random Forest that relies on purely lexical features of the domain names to detect algorithmically generated domains. In particular, we propose using masked N-grams, together with other statistics obtained from the domain name. Furthermore, we provide a dataset built for experimentation that contains regular and algorithmically generated domain names, coming from different malware families. We also classify these families according to their type of domain generation algorithm. Our findings show that masked N-grams provide detection accuracy that is comparable to that of other existing techniques, but with much better performance.
Published: 2019

48. A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection

Author: Chengbin Hu, Tommy Chin, Yi Li, and Kaiqi Xiong
Subjects: Domain generation algorithm, General Computer Science, Computer science, networking, security, 02 engineering and technology, computer.software_genre, Machine learning, Malware, Domain (software engineering), 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Hidden Markov model, Cluster analysis, Artificial neural network, business.industry, Deep learning, General Engineering, 020206 networking & telecommunications, machine learning, domain generation algorithm, 020201 artificial intelligence & image processing, lcsh:Electrical engineering. Electronics. Nuclear engineering, Artificial intelligence, business, lcsh:TK1-9971, computer
Abstract: Attackers usually use a command and control (C2) server to manipulate the communication. In order to perform an attack, threat actors often employ a domain generation algorithm (DGA), which can allow malware to communicate with C2 by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from the real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first classify the DGA domains apart from normal domains and then use the clustering method to identify the algorithms that generate those DGA domains. In the prediction model, a time-series model is constructed to predict incoming domain features based on the hidden Markov model (HMM). Furthermore, we build a deep neural network (DNN) model to enhance the proposed machine learning framework by handling the huge dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and 97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM prediction in the framework.
Published: 2019

49. A Superficial Analysis Approach for Identifying Malicious Domain Names Generated by DGA Malware

Author: Satoh, Akihiro, Fukuda, Yutaka, Hayashi, Toyohiro, Kitagata, Gen, Satoh, Akihiro, Fukuda, Yutaka, Hayashi, Toyohiro, and Kitagata, Gen
Abstract: Some of the most serious security threats facing computer networks involve malware. To prevent malware-related damage, administrators must swiftly identify and remove the infected machines that may reside in their networks. However, many malware families have domain generation algorithms (DGAs) to avoid detection. A DGA is a technique in which the domain name is changed frequently to hide the callback communication from the infected machine to the command-and-control server. In this article, we propose an approach for estimating the randomness of domain names by superficially analyzing their character strings. This approach is based on the following observations: human-generated benign domain names tend to reflect the intent of their domain registrants, such as an organization, product, or content. In contrast, dynamically generated malicious domain names consist of meaningless character strings because conflicts with already registered domain names must be avoided; hence, there are discernible differences in the strings of dynamically generated and human-generated domain names. Notably, our approach does not require any prior knowledge about DGAs. Our evaluation indicates that the proposed approach is capable of achieving recall and precision as high as 0.9960 and 0.9029, respectively, when used with labeled datasets. Additionally, this approach has proven to be highly effective for datasets collected via a campus network. Thus, these results suggest that malware-infected machines can be swiftly identified and removed from networks using DNS queries for detected malicious domains as triggers.
Published: 2020

50. Early DGA-Based Botnet Identification: Pushing Detection to the Edges

Author: Gregorio Martínez Pérez, Mattia Zago, and Manuel Gil Pérez
Subjects: Domain generation algorithm, Edge device, Computer Networks and Communications, Computer science, Distributed computing, Botnet, 020206 networking & telecommunications, Context (language use), 02 engineering and technology, Intrusion detection system, Domain Generation Algorithm (DGA), Machine Learning, 5G, Cybersecurity, Edge Artificial Intelligence, Federated Learning, Domain (software engineering), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Enhanced Data Rates for GSM Evolution, Software
Abstract: With the first commercially available 5G infrastructures, worldwide’s attention is shifting to the next generation of theorised technologies that might be finally deployable. In this context, the cybersecurity of edge equipment and end-devices must be a top priority as botnets see their spread remarkably increase. Most of them rely on algorithmically generated domain names (AGDs) to evade detection and remain shrouded from intrusion detection systems, via the so-called Domain Generation Algorithm (DGA). Despite the issue, by applying concepts such as distributed computing and federated learning, the cybersecurity community has prototyped and developed dynamic and scalable solutions that leverage the increased capabilities and connectivity of edge devices. This article proposes a lightweight and privacy-preserving framework that pushes the intelligence modules to the edges aiming to achieve early DGA-based botnet detection in mobile and edge-oriented scenarios. Experimental results prove the deployability of such architecture at all levels, including resource-constrained end-devices.
Published: 2021
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

154 results on '"domain generation algorithm"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources