51. Realtime spam detection system using random forest and support vector machine with countvectorizer algorithm.
- Author
- Sekhar, B. and Soundari, A. Gnana
- Subjects
- *SOCIAL media, *MACHINE learning, *SPAM email, *RANDOM forest algorithms, *SUPPORT vector machines, *ALGORITHMS
- Abstract
- The goal of this research is to catalogue all of the undesirable materials available on rival and popular social media platforms. Tools and methods: The dataset for training and testing the proposed prediction models was constructed using the messages from a number of popular social media messaging platforms, with a minimum of 5 attributes and 150 messages. Random Forest, a machine learning algorithm, was used to design the framework, and it was compared favourably to the Support Vector Machine algorithm. Discussion and Remarks: Accuracy in retrieval is 90.30 percent for the Random forest algorithm and 94.36 percent for the Support vector machine learning algorithm (countvectorizer). The significance level between the two algorithms is high at p=0.02 (p<0.05). This study confirms that the SVM (countvectorizer) Machine Learning algorithm has a higher accuracy rate than the Random Forest algorithm and creates a novel framework to identify spam words. [ABSTRACT FROM AUTHOR]
- Published
- 2024