EVCA Classifier: A MCMC-Based Classifier for Analyzing High-Dimensional Big Data.

Authors :
Vlachou, Eleni
Karras, Christos
Karras, Aristeidis
Tsolis, Dimitrios
Sioutas, Spyros
Source :
Information (2078-2489). Aug 2023, Vol. 14, Issue 8, p451. 27p.
Publication Year :
2023

Abstract

In this work, we introduce an innovative Markov Chain Monte Carlo (MCMC) classifier, a synergistic combination of Bayesian machine learning and Apache Spark, highlighting the novel use of this methodology in the spectrum of big data management and environmental analysis. By employing a large dataset of air pollutant concentrations in Madrid from 2001 to 2018, we developed a Bayesian Logistic Regression model, capable of accurately classifying the Air Quality Index (AQI) as safe or hazardous. This mathematical formulation adeptly synthesizes prior beliefs and observed data into robust posterior distributions, enabling superior management of overfitting, enhancing the predictive accuracy, and demonstrating a scalable approach for large-scale data processing. Notably, the proposed model achieved a maximum accuracy of 87.91% and an exceptional recall value of 99.58% at a decision threshold of 0.505, reflecting its proficiency in accurately identifying true negatives and mitigating misclassification, even though it slightly underperformed in comparison to the traditional Frequentist Logistic Regression in terms of accuracy and the AUC score. Ultimately, this research underscores the efficacy of Bayesian machine learning for big data management and environmental analysis, while signifying the pivotal role of the first-ever MCMC Classifier and Apache Spark in dealing with the challenges posed by large datasets and high-dimensional data with broader implications not only in sectors such as statistics, mathematics, physics but also in practical, real-world applications. [ABSTRACT FROM AUTHOR]
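
Illustrative sketch (not from the paper): the abstract describes a Bayesian logistic regression whose posterior is sampled by MCMC and whose predictions are cut at a decision threshold of 0.505. The Python below shows that general idea on a small scale. It is not the authors' EVCA implementation and does not use Apache Spark or the Madrid air-quality data. The random-walk Metropolis sampler, the Gaussian prior with standard deviation 10, the synthetic data generator, and all function names are assumptions made for illustration; only the threshold value 0.505 comes from the abstract.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_posterior(w, X, y, prior_sd=10.0):
    # Assumed prior: independent N(0, prior_sd^2) on each weight.
    lp = -sum(wj * wj for wj in w) / (2.0 * prior_sd ** 2)
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # Bernoulli log-likelihood: y*z - log(1 + exp(z)), computed stably.
        lp += yi * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return lp


def metropolis(X, y, n_samples=2000, burn_in=500, step=0.2, seed=0):
    # Random-walk Metropolis: a generic MCMC sampler chosen for this sketch.
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    lp = log_posterior(w, X, y)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = [wj + rng.gauss(0.0, step) for wj in w]
        lp_new = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio); 1 - random() is in (0, 1].
        if math.log(1.0 - rng.random()) < lp_new - lp:
            w, lp = proposal, lp_new
        if i >= burn_in:
            samples.append(w)
    return samples


def predict_proba(samples, x):
    # Posterior predictive probability: average the sigmoid over posterior draws.
    total = 0.0
    for w in samples:
        total += sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
    return total / len(samples)


def classify(samples, x, threshold=0.505):
    # 1 = "hazardous", 0 = "safe" in the abstract's terms; 0.505 is the reported threshold.
    return int(predict_proba(samples, x) >= threshold)


def make_synthetic_data(n=200, seed=1):
    # Hypothetical data: one feature plus a bias column, true weights (-0.5, 2.0).
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        p = sigmoid(-0.5 + 2.0 * x1)
        X.append([1.0, x1])
        y.append(1 if rng.random() < p else 0)
    return X, y
```

To try it, call make_synthetic_data(), pass the result to metropolis(), and then call classify(samples, [1.0, x1]) for a feature value x1. Averaging the predicted probability over posterior draws, rather than plugging in a single point estimate, is what distinguishes this from a frequentist logistic regression fit.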

Details

Language :
English
ISSN :
20782489
Volume :
14
Issue :
8
Database :
Academic Search Index
Journal :
Information (2078-2489)
Publication Type :
Academic Journal
Accession number :
170740385
Full Text :
https://doi.org/10.3390/info14080451