Internet traffic clustering with side information
- Source :
- Journal of Computer and System Sciences. 80:1021-1036
- Publication Year :
- 2014
- Publisher :
- Elsevier BV, 2014.
Abstract
- Internet traffic classification is an essential function for network management and security systems. Due to the limitations of traditional port-based and payload-based classification approaches, the past several years have seen extensive research on utilizing machine learning techniques to classify Internet traffic based on packet-level and flow-level characteristics. For the purpose of learning from unlabeled traffic data, some classic clustering methods have been applied in previous studies, but the reported accuracy results are unsatisfactory. In this paper, we propose a semi-supervised approach for accurate Internet traffic clustering, which is motivated by the observation that partial equivalence relationships exist widely among Internet traffic flows. In particular, we formulate the problem using a Gaussian Mixture Model (GMM) with set-based equivalence constraints and propose a constrained Expectation Maximization (EM) algorithm for clustering. Experiments with real-world packet traces show that the proposed approach can significantly improve the quality of the resulting traffic clusters.
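The idea of a GMM with set-based equivalence constraints can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that flows sharing a set identifier are forced into the same mixture component, so the E-step computes one posterior per set instead of one per flow. The names `constrained_em`, `log_gaussian`, `set_ids`, and the regularization term `reg` are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log density of each row of X under N(mean, cov)."""
    d = X.shape[1]
    diff = X - mean
    chol = np.linalg.cholesky(cov)
    sol = np.linalg.solve(chol, diff.T)            # shape (d, n)
    maha = np.sum(sol ** 2, axis=0)                # squared Mahalanobis distance
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def constrained_em(X, set_ids, k, n_iter=50, reg=1e-6, seed=0):
    """EM for a GMM in which all points of one set share one component.

    Assumption (not from the source): unconstrained flows are given
    their own singleton set id.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    _, inverse = np.unique(np.asarray(set_ids).ravel(), return_inverse=True)
    inverse = inverse.ravel()
    n_sets = int(inverse.max()) + 1

    # Initialization: random data points as means, global covariance.
    means = X[rng.choice(n, size=k, replace=False)]
    base_cov = np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: sum point log-likelihoods within each set, so the
        # posterior is computed once per set rather than once per point.
        log_p = np.stack(
            [log_gaussian(X, means[j], covs[j]) for j in range(k)], axis=1
        )
        set_log = np.zeros((n_sets, k))
        np.add.at(set_log, inverse, log_p)
        set_log += np.log(weights)
        set_log -= set_log.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(set_log)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: every point inherits the responsibility of its set.
        point_resp = resp[inverse]                      # shape (n, k)
        nk = point_resp.sum(axis=0) + 1e-12             # guard against empty components
        weights = np.clip(resp.sum(axis=0) / n_sets, 1e-12, None)
        weights /= weights.sum()
        means = (point_resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (point_resp[:, j, None] * diff).T @ diff / nk[j]
            covs[j] += reg * np.eye(d)

    labels = resp[inverse].argmax(axis=1)
    return labels, weights, means, covs
```

Calling `constrained_em(X, set_ids, k=2)` on a feature matrix `X` with one row per flow returns a cluster label per flow; by construction, flows with the same entry in `set_ids` always receive the same label. How the equivalence sets are derived from traffic is outside the scope of this sketch.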
- Subjects :
- Computer Networks and Communications
Computer science
Applied Mathematics
Constrained clustering
Conceptual clustering
Internet traffic engineering
Internet traffic
Mixture model
Machine learning
Theoretical Computer Science
Traffic classification
Computational Theory and Mathematics
Data mining
Artificial intelligence
Cluster analysis
Traffic generation model
Details
- ISSN :
- 0022-0000
- Volume :
- 80
- Database :
- OpenAIRE
- Journal :
- Journal of Computer and System Sciences
- Accession number :
- edsair.doi...........30468dd63973e3af28dfb8211c6d8833