A Lexical Approach for Classifying Malicious URLs

Publication Year :
2015

Abstract

Given the continuous growth of illicit activities on the Internet, there is a need for intelligent systems to identify malicious web pages. It has been shown that URL analysis is an effective tool for detecting phishing, malware, and other attacks. Previous studies have performed URL classification using a combination of lexical features, network traffic, hosting information, and other strategies. These approaches require time-intensive lookups which introduce significant delay in real-time systems. This paper describes a lightweight approach for classifying malicious web pages using URL lexical analysis alone. The goal is to explore the upper bound of the classification accuracy of a purely lexical approach. Another aim is to develop an approach which could be used in a real-time system. These goals culminate in the development of a classification system based on lexical analysis of URLs. It correctly classifies URLs of malicious web pages with 99.1% accuracy, a 0.4% false positive rate, an F1-Score of 98.7, and requires 0.62 milliseconds on average. This method substantially outperforms previously published algorithms on out-of-sample data.
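As an illustration of what "lexical analysis alone" can mean in practice, the following minimal Python sketch computes a few features from the URL string itself, with no DNS, WHOIS, or page-content lookups. The function name `lexical_features` and the particular features chosen here are assumptions made for illustration; the abstract does not list the features or the classifier actually used in the work.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute illustrative string-only features of a URL.

    Hypothetical feature set: chosen for this sketch, not taken
    from the catalogued work.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is None when the URL has no netloc

    # Shannon entropy (bits per character) of the whole URL string.
    total = len(url)
    counts = Counter(url)
    entropy = (
        -sum((n / total) * math.log2(n / total) for n in counts.values())
        if total
        else 0.0
    )

    return {
        "url_length": total,
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": "@" in url,
        "char_entropy": entropy,
    }


if __name__ == "__main__":
    print(lexical_features("http://example.com/a-b/1"))
```

Because every value is derived from the string in memory, a feature vector like this can be produced without network round trips, which is the property the abstract relies on for real-time use. A classifier trained on such vectors would be a separate step not shown here.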

Subjects

Subjects :
Machine Learning

Details

Database :
OAIster
Notes :
Heileman, Greg, Jordan, Ramiro, Lamb, Chris, Darling, Michael
Publication Type :
Electronic Resource
Accession number :
edsoai.on1153390294
Document Type :
Electronic Resource