1. Clustering-based data placement in cloud computing: a predictive approach.
- Author
- Sellami, Mokhtar; Mezni, Haithem; Hacid, Mohand Said; Gammoudi, Mohamed Mohsen
- Subjects
- *CLOUD computing, *DATA distribution, *TERRITORIAL partition, *INFORMATION technology, *DEFAULT (Finance), *BIG data, *CLOUD storage
- Abstract
Cloud computing environments have become a natural choice for hosting and processing huge volumes of data. Combining cloud computing with big data frameworks is an effective way to run data-intensive applications and tasks. An optimal arrangement of data partitions can further improve task execution, but most big data frameworks do not provide one. For example, the default distribution of data partitions in Hadoop-based clouds causes several problems, mainly related to load balancing and resource usage. In addition, most existing data placement solutions are static and place data partitions imprecisely. To overcome these issues, we propose a data placement approach based on predicting future resource usage. We use Kernel Density Estimation (KDE) and Fuzzy Formal Concept Analysis (Fuzzy FCA) to, first, forecast the future resource consumption of workers and tasks and, second, cluster data partitions and intensive jobs according to the estimated resource usage. Fuzzy FCA is also used to exclude partitions and jobs that require fewer resources, which reduces needless migrations. To monitor and predict the workers' states and the data partitions' consumption, we model the big data cluster as an autonomic service-based system. The results show that our solution outperforms existing approaches in terms of migration rate and resource consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2021
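The abstract's first step, forecasting resource usage with Kernel Density Estimation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the forecast is derived from the density, so the sketch simply assumes the forecast is the mode of a Gaussian KDE fitted to a worker's past usage. The function names, the normal-reference bandwidth rule, the usage scale of 0 to 1, and the 0.8 overload threshold are all hypothetical choices made for illustration.

```python
import math


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth: 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    std = math.sqrt(variance)
    # Fall back to a small positive bandwidth if all samples are identical.
    return 1.06 * std * n ** (-1 / 5) if std > 0 else 0.01


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def most_likely_usage(samples, grid_steps=101):
    """Assumed forecast: the usage level in [0, 1] where the KDE peaks."""
    density = gaussian_kde(samples, rule_of_thumb_bandwidth(samples))
    grid = [i / (grid_steps - 1) for i in range(grid_steps)]
    return max(grid, key=density)


def flag_overloaded(history_by_worker, threshold=0.8):
    """Return the workers whose forecast usage exceeds a hypothetical threshold."""
    return sorted(
        worker
        for worker, history in history_by_worker.items()
        if most_likely_usage(history) > threshold
    )


if __name__ == "__main__":
    # Hypothetical CPU usage histories, as fractions of capacity.
    histories = {
        "worker-a": [0.88, 0.90, 0.92, 0.91, 0.89, 0.30],
        "worker-b": [0.20, 0.25, 0.22, 0.18, 0.21, 0.24],
    }
    print(flag_overloaded(histories))
```

In a placement scheme along the lines the abstract describes, forecasts like these would feed the clustering step that groups partitions and jobs by estimated resource usage. That step (Fuzzy FCA) is not sketched here.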