New hybrid data mining model for prediction of Salmonella presence in agricultural waters based on ensemble feature selection and machine learning algorithms.

Source :
Journal of Food Safety; Aug2021, Vol. 41 Issue 4, p1-9, 9p
Publication Year :
2021

Abstract

This paper aims to create a new hybrid ensemble data mining model to predict Salmonella presence in agricultural surface waters by combining a heterogeneous ensemble approach for feature selection with clustering, regression, and classification algorithms. The data set was collected from six agricultural ponds in Central Florida and consists of 540 instances with 23 features (26 Salmonella positive and 514 Salmonella negative). The model consisted of three stages. First, a heterogeneous ensemble feature selection (HEFS) approach was applied to select the top features. Then, the k‐means clustering algorithm was implemented to remove misclassified cases from the data set. Finally, classification and regression algorithms, including support vector machine (SVM), Naïve Bayes (NB), artificial neural network (ANN), and random forest (RF), were applied to the preprocessed data set to predict Salmonella presence in agricultural surface waters, with 20% of the data held out as the test set. These algorithms were combined in 10 different ensemble models through the soft voting approach, and the performance of these hybrid ensemble models was also evaluated. The ensemble ANN + RF model achieved the highest performance, outperforming all other single and ensemble models on area under the ROC curve (AUC, 0.98) and prediction accuracy (94.9%). The findings support the validity of the hybrid ensemble model and encourage researchers to use it to predict Salmonella presence in agricultural surface waters. [ABSTRACT FROM AUTHOR]
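
The soft voting step named in the abstract can be illustrated with a minimal sketch. It assumes unweighted averaging of each model's class probabilities, which the abstract does not specify; the function names and the probability values below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across models (unweighted; an assumption)."""
    if not prob_lists:
        raise ValueError("at least one model's probabilities are required")
    n_classes = len(prob_lists[0])
    if any(len(p) != n_classes for p in prob_lists):
        raise ValueError("all models must report the same number of classes")
    n_models = len(prob_lists)
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


def predict_label(prob_lists: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(prob_lists)
    return max(range(len(averaged)), key=lambda c: averaged[c])


# Hypothetical outputs for one water sample, ordered as
# [P(Salmonella negative), P(Salmonella positive)].
ann_probs = [0.40, 0.60]  # illustrative ANN output, not from the paper
rf_probs = [0.70, 0.30]   # illustrative RF output, not from the paper

averaged = soft_vote([ann_probs, rf_probs])
label = predict_label([ann_probs, rf_probs])
```

In this made-up case the ANN alone would call the sample positive, but the RF is more confident in the negative class, so the averaged probabilities favor negative. Soft voting differs from majority (hard) voting in exactly this way: each model's confidence, not just its vote, influences the combined prediction.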

Details

Language :
English
ISSN :
01496085
Volume :
41
Issue :
4
Database :
Complementary Index
Journal :
Journal of Food Safety
Publication Type :
Academic Journal
Accession number :
151816661
Full Text :
https://doi.org/10.1111/jfs.12903