
A comparative dimensionality reduction study in telecom customer segmentation using deep learning and PCA.

Authors :
Alkhayrat, Maha
Aljnidi, Mohamad
Aljoumaa, Kadan
Source :
Journal of Big Data; 2/3/2020, Vol. 7 Issue 1, p1-23, 23p
Publication Year :
2020

Abstract

Telecom companies log customers' actions, generating a huge amount of data that can yield important findings about customers' behavior and needs. The main characteristics of such data are the large number of features and high sparsity, both of which pose challenges for the analytics steps. This paper explores dimensionality reduction on a real telecom dataset and evaluates customer clustering in the reduced and latent spaces, compared with the original space, in order to achieve better-quality clustering results. The original dataset contains 220 features belonging to 100,000 customers. Dimensionality reduction is an important data preprocessing step in the data mining process, especially in the presence of the curse of dimensionality. In particular, the aim of data reduction techniques is to filter out irrelevant features and noisy data samples. To reduce the high-dimensional data, we projected it down to a subspace using the well-known Principal Component Analysis (PCA) decomposition and a novel approach based on an autoencoder neural network. K-Means clustering is then applied to both the original and the reduced datasets. Different internal measures were computed to evaluate clustering for different numbers of dimensions, and we then evaluated how the reduction method impacts the clustering task. [ABSTRACT FROM AUTHOR]
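
The pipeline described in the abstract (reduce dimensions, run K-Means, score with internal measures) might be sketched as follows. This is a minimal illustration and not the authors' code: it assumes scikit-learn, uses synthetic sparse data in place of the real telecom dataset, covers only the PCA branch (the autoencoder is omitted), and picks the silhouette and Davies-Bouldin scores as example internal measures, which may differ from those used in the paper. The sample size, cluster count, and component counts are likewise arbitrary assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the telecom data (assumption): 2,000 rows with
# 220 features, where roughly 70% of entries are zeroed to mimic sparsity.
n_samples, n_features, n_clusters = 2000, 220, 4
centers = rng.normal(0.0, 3.0, size=(n_clusters, n_features))
true_groups = rng.integers(0, n_clusters, size=n_samples)
X = centers[true_groups] + rng.normal(0.0, 1.0, size=(n_samples, n_features))
X[rng.random(X.shape) < 0.7] = 0.0

# Standardize so that no single feature dominates PCA or the distances.
X = StandardScaler().fit_transform(X)


def cluster_scores(data, k=n_clusters):
    """Run K-Means on `data` and return two internal validity measures."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    return {
        "silhouette": silhouette_score(data, labels),          # higher is better
        "davies_bouldin": davies_bouldin_score(data, labels),  # lower is better
    }


# Score clustering in the original space, then in several PCA subspaces.
results = {"original (220)": cluster_scores(X)}
for n_components in (5, 20, 50):
    X_reduced = PCA(n_components=n_components, random_state=0).fit_transform(X)
    results[f"PCA ({n_components})"] = cluster_scores(X_reduced)

for name, scores in results.items():
    print(f"{name:>15}  silhouette={scores['silhouette']:.3f}  "
          f"davies_bouldin={scores['davies_bouldin']:.3f}")
```

Comparing the rows of the printed table shows how the chosen number of dimensions changes the internal measures. An autoencoder's latent representation could be passed to the same `cluster_scores` helper in place of the PCA output.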

Details

Language :
English
ISSN :
21961115
Volume :
7
Issue :
1
Database :
Complementary Index
Journal :
Journal of Big Data
Publication Type :
Academic Journal
Accession number :
141513037
Full Text :
https://doi.org/10.1186/s40537-020-0286-0