A Lexical Approach for Classifying Malicious URLs
- Publication Year :
- 2015
Abstract
- Given the continuous growth of illicit activities on the Internet, there is a need for intelligent systems to identify malicious web pages. It has been shown that URL analysis is an effective tool for detecting phishing, malware, and other attacks. Previous studies have performed URL classification using a combination of lexical features, network traffic, hosting information, and other strategies. These approaches require time-intensive lookups, which introduce significant delay in real-time systems. This paper describes a lightweight approach for classifying malicious web pages using URL lexical analysis alone. The goal is to explore the upper bound of the classification accuracy of a purely lexical approach. Another aim is to develop an approach that could be used in a real-time system. These goals culminate in the development of a classification system based on lexical analysis of URLs. It correctly classifies URLs of malicious web pages with 99.1% accuracy, a 0.4% false positive rate, an F1-Score of 98.7, and requires 0.62 milliseconds on average. This method substantially outperforms previously published algorithms on out-of-sample data.
- Subjects :
- Machine Learning
Details
- Database :
- OAIster
- Notes :
- Heileman, Greg; Jordan, Ramiro; Lamb, Chris; Darling, Michael
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1153390294
- Document Type :
- Electronic Resource