A comparative dimensionality reduction study in telecom customer segmentation using deep learning and PCA.
- Source :
- Journal of Big Data; 2/3/2020, Vol. 7 Issue 1, p1-23, 23p
- Publication Year :
- 2020
Abstract
- Telecom companies log customers' actions, generating a huge amount of data that can yield important findings about customers' behavior and needs. The main characteristics of such data are the large number of features and high sparsity, which pose challenges for the analytics steps. This paper explores dimensionality reduction on a real telecom dataset and evaluates customer clustering in the reduced and latent spaces, compared with the original space, in order to achieve better-quality clustering results. The original dataset contains 220 features belonging to 100,000 customers. Dimensionality reduction is an important preprocessing step in the data mining process, especially in the presence of the curse of dimensionality. In particular, the aim of data reduction techniques is to filter out irrelevant features and noisy data samples. To reduce the high-dimensional data, we projected it onto a subspace using the well-known Principal Component Analysis (PCA) decomposition and a novel approach based on an autoencoder neural network. K-Means clustering is then applied to both the original and the reduced datasets. Different internal measures were computed to evaluate the clustering for different numbers of dimensions, and we then evaluated how the reduction method impacts the clustering task. [ABSTRACT FROM AUTHOR]
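The pipeline the abstract describes (reduce, cluster, score with an internal measure) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data is synthetic, the choices of 3 clusters and 2 components are arbitrary, the silhouette coefficient stands in for the unspecified "internal measures", and the autoencoder branch is omitted. All function names are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers)

def silhouette(X, labels):
    """Mean silhouette coefficient, one common internal clustering measure."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:          # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = D[i, same].sum() / (same.sum() - 1)   # mean intra-cluster distance
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

# Synthetic stand-in for the telecom data (assumption): 3 blobs in 20 dimensions.
rng = np.random.default_rng(42)
true_centers = rng.normal(scale=5.0, size=(3, 20))
X = np.vstack([c + rng.normal(size=(100, 20)) for c in true_centers])

Z = pca_reduce(X, n_components=2)   # reduced space
labels_full = kmeans(X, k=3)        # clustering in the original space
labels_red = kmeans(Z, k=3)         # clustering in the reduced space
sil_full = silhouette(X, labels_full)
sil_red = silhouette(Z, labels_red)
print(f"silhouette, original space: {sil_full:.3f}")
print(f"silhouette, PCA space:      {sil_red:.3f}")
```

Repeating the last step for several values of `n_components` would give the kind of comparison across numbers of dimensions that the abstract mentions.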
Details
- Language :
- English
- ISSN :
- 21961115
- Volume :
- 7
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- Journal of Big Data
- Publication Type :
- Academic Journal
- Accession number :
- 141513037
- Full Text :
- https://doi.org/10.1186/s40537-020-0286-0