1. Network traffic analysis using machine learning: an unsupervised approach to understand and slice your network
- Author
Kandaraj Piamrat, Ons Aouedi, J. K. Menuka Perera, Salima Hamma; Laboratoire des Sciences du Numérique de Nantes (LS2N), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN), École Centrale de Nantes (ECN), Centre National de la Recherche Scientifique (CNRS)
- Subjects
Traffic analysis; Computer science; Test data generation; Unsupervised learning; Feature selection; 02 engineering and technology; Machine learning; Clustering; Traffic flow (computer networking); [INFO.INFO-NI]Computer Science [cs]/Networking and Internet Architecture [cs.NI]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; Network traffic; 0202 electrical engineering, electronic engineering, information engineering; [INFO]Computer Science [cs]; Electrical and Electronic Engineering; Cluster analysis; Dimensionality reduction; 020206 networking & telecommunications; 020207 software engineering; Network slicing; Scalability; Artificial intelligence
- Abstract
Recent developments in smart devices have led to an explosion in data generation and heterogeneity, which requires new network solutions for better analyzing and understanding traffic. These solutions should be intelligent and scalable in order to handle the huge amount of data automatically. With the progress of high-performance computing (HPC), it has become feasible to deploy machine learning (ML) to solve complex problems, and its efficiency has been validated in several domains (e.g., healthcare or computer vision). At the same time, network slicing (NS) has drawn significant attention from both industry and academia, as it is essential for addressing the diversity of service requirements. Therefore, the adoption of ML within NS management is an interesting issue. In this paper, we focus on analyzing network data with the objective of defining network slices according to traffic flow behaviors. For dimensionality reduction, feature selection is applied to select the most relevant features (15 out of 87) from a real dataset of more than 3 million instances. Then, K-Means clustering is applied to better understand and distinguish traffic behaviors. The results demonstrate a good correlation among instances in the same cluster generated by the unsupervised learning. This solution can be further integrated into a real environment using network function virtualization.
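The two-stage pipeline described in the abstract (feature selection followed by K-Means) can be illustrated with a minimal, standard-library-only sketch. The abstract does not say which feature-selection criterion the authors used, so variance ranking is an assumed stand-in here. The data is synthetic, and the names `select_top_variance`, `kmeans`, and `flows` are hypothetical.

```python
import random
from statistics import pvariance


def select_top_variance(rows, k):
    """Return the indices of the k columns with the highest variance.

    Assumption: variance ranking stands in for the paper's (unspecified)
    feature-selection method. It is scale-sensitive, so real data would
    normally be normalized first.
    """
    n_cols = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


# Synthetic flows (not the paper's dataset): columns are
# [mean packet size, constant flag, duration, near-zero noise].
data_rng = random.Random(42)
flows = []
for _ in range(20):  # short flows with small packets
    flows.append([data_rng.gauss(100, 5), 1.0,
                  data_rng.gauss(1, 0.1), data_rng.gauss(0, 0.01)])
for _ in range(20):  # long flows with large packets
    flows.append([data_rng.gauss(1400, 5), 1.0,
                  data_rng.gauss(60, 1), data_rng.gauss(0, 0.01)])

selected = select_top_variance(flows, 2)
reduced = [[row[j] for j in selected] for row in flows]
centroids, labels = kmeans(reduced, k=2)
print("selected feature indices:", selected)
print("cluster labels:", labels)
```

In this toy setting the constant and near-constant columns are discarded before clustering, mirroring the reduction from 87 to 15 features on a much smaller scale. Each resulting cluster could then be treated as a candidate traffic class for a slice.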
- Published
- 2021