151 results on '"RANDOM forest algorithms"'
Search Results
2. Optimizing sales strategy in the Indian automobile industry: Predicting future car prices using machine learning and demographic data.
- Author
- Khan, M. Reyasudin Basir, Islam, Gazi Md. Nurul, Ng, Poh Kiat, Zainuddin, Ahmad Anwar, Lean, Chong Peng, Al-Fattah, Jabbar, and Kamarudin, Nazhatul Hafizah
- Subjects
- *MACHINE learning, *RANDOM forest algorithms, *BUSINESS planning, *AUTOMOBILE sales & prices, *DECISION trees, *BIG data
- Abstract
Demographics play a vital role in defining the size, distribution, and structure of a population. In the context of the automobile industry, business owners can leverage demographic insights to gauge the demand for vehicles and strategically align their sales efforts. Accurate sales forecasting is essential for long-term business strategy, providing manufacturers with a competitive advantage in optimizing production planning methods. This project utilizes large-scale automobile sales data to forecast car price variations in the coming months, considering factors such as purchase patterns, car models, and other relevant data. By analyzing different attributes from a past-year dataset, three machine learning algorithms: Linear Regression, Decision Tree Regression, and Random Forest Regression were employed to predict future car prices. The performance of each algorithm is evaluated using the R-squared value. Notably, the Random Forest regression model achieves a higher accuracy of 93%, outperforming both Decision Tree regression and Linear regression. These results demonstrate the suitability of Random Forest regression in predicting big data for the industry's future product production plan and overall strategy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. Application of machine learning models in predicting motorcyclist severity in heavy good vehicles (HGV) crashes in Malaysia.
- Author
- Kang, Ho Ming, Musa, Sarah, Darman, Hazlina, Hamidun, Rizati, and Roslan, Azzuhana
- Subjects
- *SUPPORT vector machines, *RANDOM forest algorithms, *DECISION trees, *SPEED limits, *ROAD safety measures
- Abstract
Predicting motorcyclist severity is a critical task for transportation systems and a promising research topic in road safety studies. Machine learning models have gained popularity in recent years due to their strong prediction accuracy. Therefore, we aim to compare the predictive performance, including prediction accuracy and estimation of variable importance, of several machine learning models. In this study, crash data from Malaysia is used to predict motorcyclist severity using variables such as road type, speed limit, location type and collision type. The analysis begins with the use of random forest (RF) to select important features for prediction. Then, three of the most often used machine learning models, multinomial logistic regression (MLR), decision tree (DT) and support vector machine (SVM), are applied and their performances are evaluated. The results indicated that the most important features in predicting motorcyclist severity are the number of drivers killed and environmental factors such as traffic system, collision type and light condition. Among the three models used in this study, SVM showed the best performance, with 82.14% accuracy, ahead of DT and MLR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Disease prediction system using machine learning.
- Author
- Jayapradha, J., Singh, Neetish Kumar, Dwivedi, Vishal, and Devi, M. Uma
- Subjects
- *PREDICTION models, *DECISION trees, *RANDOM forest algorithms, *CLASSIFICATION algorithms, *FORECASTING
- Abstract
In this era, technology has revolutionized the health industry to a great extent. The proposed model aims to design a diagnostic system for various diseases based on their symptoms. The Disease Prediction System implements different ML prediction models to predict the user's disease based on the symptoms the user enters. Machine learning classification algorithms analyse the inputs given by the user and then output the predicted disease and its probability of occurrence. The proposed system predicts diseases such as i) Diabetes, ii) Kidney, iii) Cancer, iv) Heart and v) Liver. Four prediction models, Naive Bayes, Decision Tree, Random Forest and Logistic Regression, have been implemented in the proposed system for the various diseases. The dataset "Disease Prediction Using Machine Learning," with a count of 132 symptoms, has been used in the proposed model. The main goal of the proposed model is to predict the disease; the user doesn't need a medical report to use this system, as the prediction is based on the symptoms, which saves time and money. The system also has an easy-to-use user interface, and all the users can use it to predict genetic diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Rainfall prediction system using machine learning algorithms.
- Author
- Brindha, R., Firoz, Sk. Md Khaja, and Reddy, C. Ramnath
- Subjects
- *ARTIFICIAL neural networks, *BACK propagation, *RANDOM forest algorithms, *PRECIPITATION forecasting, *DECISION trees, *MACHINE learning
- Abstract
Agriculture is vital for survival in India. Rainfall is crucial to agriculture. Predicting rainfall has become a significant issue recently. Rainfall forecasting helps people be prepared and informed of impending rain so they can take the necessary safety measures to preserve their crops from the rain. There are numerous methods available to predict rainfall. Predicting rainfall is where machine learning techniques are most beneficial. XGBoost, Decision Tree, Random Forest, LightGBM, and Logistic Regression are some of the most important machine learning algorithms. The linear and non-linear models, which are both often used, forecast seasonal precipitation. Logistic regression models are the initial models. Rainfall can be predicted when utilising Artificial Neural Networks (ANN) by employing Back Propagation Neural Networks, Decision Trees, and regression models like Random Forest. Due to the atmosphere's dynamic character, applied mathematics techniques are unable to guarantee reliable precision for a statement about precipitation. Regression may be used in the prediction of precipitation utilising machine learning approaches. The goal of this project is to provide non-experts with simple access to the methods and approaches used in the field of precipitation prediction as well as to compare different machine learning techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. Mobile advertisements click through rate prediction using machine learning.
- Author
- Jacob, Jacinta Ann and Gnanavel, S.
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *ADVERTISING, *INTERNET advertising, *PREDICTION models, *MACHINE learning
- Abstract
Online advertising has a big impact on whether your business succeeds or fails. Because of this, it is crucial to assess your advertisement's effectiveness before posting it online. Finding the Click-Through Rate (CTR) allows for this. Unfortunately, because you must gather user clicks before determining Click-Through Rate, this method is not environmentally friendly. In this situation, CTR prediction is helpful. For forecasting ad Click-Through Rate, user click data is a crucial source of information. Accurate Click-Through Rate prediction for contemporary e-advertising platforms is a challenging and crucial undertaking. Click-Through Rate prediction employs machine learning methods to determine how many times a potential consumer has clicked on an online ad. The more clicks an advertisement receives, the more successful it is. In this paper, we create a machine learning-based Click-Through Rate prediction model. The proposed study defines a model that produces accurate results with minimal use of computational resources. Three classification methods were used, namely logistic regression, decision tree classifier and random forest classifier. The Awasu dataset was used for analysis. The click data is generated over a 10-day period and sorted chronologically. This study answers the following question: given a user and the page they visit, what is the likelihood that they will click on a particular ad? The Random Forest classifier proved to be the best model with an accuracy score of 96%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Urban SO2 levels prediction using machine learning.
- Author
- Prabha, Gayathri Narayanan, Harshith, Akula Venkat, Rajesh, Uthara, Omanakuttan, Vishnu Kesav, and Shiju, Amrita Varshini
- Subjects
- *CITIES & towns, *RANDOM forest algorithms, *DECISION trees, *MANUFACTURING processes, *REGRESSION trees, *MACHINE learning
- Abstract
The pristine images of the skies are deceiving. In the air that living beings breathe lie invisible hazardous particles that can harm them. One such compound is sulfur dioxide. This gaseous compound is produced by the combustion of sulfur-containing fuels and by various industrial processes. Accurate prediction of urban air quality is crucial for well-being. This research delves into a novel approach for forecasting SO2 concentrations in cities. The proposed method leverages readily available hourly data, over the span of a month, extracting an informative trend attribute from past SO2 values. Through the application of machine learning regression models, this paper provides an innovative approach to predicting these SO2 values, utilizing data from 5 diverse cities. We achieved our aim of identifying the best performing models, Decision Tree Regression and Random Forest Regression, from all the models that were compared for this study according to the performance metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Classifying diabetes using data mining algorithms.
- Author
- Bau, Yoon-Teck, Shaifuddin, Nurshara Batrisyia, and Lee, Kian-Chin
- Subjects
- *RANDOM forest algorithms, *DATA mining, *DIABETES, *DECISION trees, *ALGORITHMS, DEVELOPING countries
- Abstract
Across the globe, diabetes is recognized as one of the many causes of death, especially in Third World countries, where there is a lack of treatment for diabetes, especially in the early stages. In this study, the presence of diabetes is classified within the community, thus contributing to the existing technology within the healthcare system. Our discovery can help doctors to predict the existence of diabetes accurately and alert patients to seek early treatment. Four data mining algorithms were used in this study, comprising both single and ensemble classifiers. The two single classifiers are decision tree and logistic regression, while the ensemble classifiers are random forest and stacking. These classifiers were chosen as they are efficient and high in performance. This research uses the PIMA diabetes dataset as it can be obtained by the general public. Stratified cross-validation is used to ensure the efficiency of the models. Ensemble classifiers show better or similar testing results compared to single classifiers. From data visualisation, two important features are discovered. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Predicting graduate-on-time using machine learning.
- Author
- Ahmad, Intan Khairina Adlina, Ting, Choo-Yee, Goh, Hui-Ngo, Quek, Albert, and Cham, Chin-Leei
- Subjects
- *GRADUATE education, *RANDOM forest algorithms, *DECISION trees, *GRADE point average, *REGRESSION analysis, *MACHINE learning
- Abstract
Predicting academic performance is a crucial task for educators and institutions because it enables the early identification of at-risk students and helps provide targeted interventions to improve their academic outcomes. Existing research often focuses on predicting academic performance using CGPA; less work, however, has used graduate-on-time (GOT) as a dependent variable. In this study, the objective was to (i) determine the optimal set of features that influence the predictions, (ii) construct a predictive model that predicts academic performance focusing on graduating on time (GOT). The data, obtained from the Ministry of Higher Education Graduates Tracer Study, contains information about graduated MMU students. It has 2382 entries and 95 columns, which include records of Graduate On Time (GOT), Cumulative Grade Point Average, Estimated Terms, and many more. This paper employed machine learning techniques such as Gaussian Naive Bayes, Decision Tree, Logistic Regression, Random Forest, Gradient Boosting, Stacking Ensemble methods and Multilayer Perceptron. The results showed that among all the techniques, the Ensemble method model exhibits the highest accuracy (84.03%), precision (84.86%), and recall (90.57%), as well as a f1-score of 87.62%. The Random Forest model and the Logistic Regression model both have a f1-score of 84%, which comes in second place after the strong results of the Ensemble technique. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Stellar classification using supervised machine learning.
- Author
- Swathi, S., Saranya, S., Vijayalakshmi, K., and Aswini, V.
- Subjects
- *SUPERVISED learning, *NAIVE Bayes classification, *MACHINE learning, *RANDOM forest algorithms, *DECISION trees, *K-nearest neighbor classification
- Abstract
In this work, objects from Sloan Digital Sky Survey Data Release 17 (SDSS DR17) [1] were classified as dwarfs or giants using learning methods. The classification was created by supervised learning. Machine learning algorithms like K Nearest Neighbors, Naive Bayes Classifier, Support Vector Classification and Random Forest were developed in addition to Decision Trees. The algorithm that performed the worst was naive Bayes. Random Forest, which performed the best, was able to successfully classify the cases in the dataset that were marked as stars. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Spectacles prediction using machine learning algorithms.
- Author
- Amulya, Soma, Sudeeptha, K., Sathvika, G., and Veeramsetty, Venkataramana
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *LOGISTIC regression analysis, *EYESTRAIN, *REGRESSION analysis, *MACHINE learning
- Abstract
Eye defects in teenagers are becoming more prevalent these days, especially due to the pandemic situation. Digital eye strain due to excessive usage of gadgets can result in a prescription for spectacles, which is a basic test in any ophthalmology clinic, but due to the current situation of the pandemic, visiting a clinic might be risky. In this project, we propose an AI (ML) model for spectacles prediction based on a few input parameters through an app. AI machine learning models are used widely in the medical sector. Here, we applied three different machine learning models (logistic regression, decision tree and random forest) to obtain the most accurate prediction using the dataset we collected from teenagers. We achieved the highest accuracy, 97%, using the logistic regression model and built a website to predict spectacles. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Machine learning based Indian premier league (IPL) game predictions.
- Author
- Mohmmad, Sallauddin, Raju, Oggula, Sridhar, Kankanala, Karivedula, Sheshipal, Laxmi Prasanna, Chindrala, and Shabana
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *LOGISTIC regression analysis, *FORECASTING, *GAMES
- Abstract
Indian Premier League (IPL) is a famous Twenty-20 league conducted by the Board of Control for Cricket in India (BCCI). It started in April 2008 and completed its fifteenth season in 2022. The current, i.e., the fifteenth IPL season, was held in May 2022. IPL is a popular sport with a large audience throughout the country. Therefore, every cricket fan would be eager to know and predict the IPL match results. This project is a detailed exploratory data analysis of IPL matches held from 2008 to 2019. Here, we analyze the overall IPL match scores, best batting and bowling performances, the team with the greatest number of wins, the most successful IPL team, the most valuable players and their best performance range, and so on. The complete dataset of IPL matches held from 2008 to 2019 is collected from BCCI officials through Kaggle. The testing accuracy of the SVM classifier is the highest at 0.9159, followed by the Decision Tree algorithm at 0.8225, Logistic Regression at 0.8159, and the Random Forest algorithm at 0.7563. As the SVM classifier has the highest accuracy of the four models, we use that model to develop the analyzer model for the project. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Thyroid detection using random forest algorithm.
- Author
- Pathinettampadian, Karthikeyan, Mandankandy, Arun Anoop, Paramsivam, Saravana Kumar, Ponnuchamy, Aravinth, Jeyakumar, Zekkin Thanraj Samuel Raj, and Ignacious, Edwin Anthony Larance
- Subjects
- *THYROID gland, *THYROID diseases, *RANDOM forest algorithms, *SUPPORT vector machines, *DECISION trees, *ARTIFICIAL intelligence, *LOGISTIC regression analysis
- Abstract
In India, thyroid illness is a common condition that affects over a million people yearly, mostly women. The most prevalent thyroid diseases, hyperthyroidism and hypothyroidism, are brought on by the thyroid gland's abnormal activity, and they may either raise or lower the body's metabolism. The use of artificial intelligence in the healthcare sector is a result of the industry's ongoing technological innovation and advancements. Algorithms based on machine learning can detect thyroid diseases asymmetrically, improving overall health. This research shows how several classification methods, such as Support Vector Machine, Random Forest, Decision Tree, Naïve Bayes, and Logistic Regression, can predict the existence of thyroid illness. The models were evaluated and compared to see which produced the greatest results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. A comparative analysis for detecting fake news using supervised learning algorithms.
- Author
- Dixit, Dheeraj Kumar, Bhagat, Amit, and Dangi, Dharmendra
- Subjects
- *FAKE news, *COMPARATIVE studies, *RANDOM forest algorithms, *DECISION trees, *DATA scrubbing, *MACHINE learning
- Abstract
Fake news is a serious problem on social media. The rapid circulation of fake news can have disastrous effects on individuals and society. Thus, it has become increasingly useful to detect fake news on social sites and the internet. Recently, many models have been developed to detect fake news from publicly available datasets. This paper discusses various machine learning algorithms and analyses their performance on two different news datasets. The proposed framework is a two-step process. In the first step, the data is cleaned and features are extracted using TF-IDF and Hashing Vectorizer. In the second step, machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, Multinomial Naive Bayes, and Passive Aggressive classifier) are applied in an effective and efficient manner. Comparative analysis revealed that the optimal performance is achieved by the Logistic Regression and Passive Aggressive classifiers, at 95.45% and 97.35% respectively for the two public datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. A review of survival and machine learning models for real time data.
- Author
- Ramakrishnan, M., Manikandan, M., and Jagathesan, T.
- Subjects
- *SUPERVISED learning, *RANDOM forest algorithms, *DECISION trees, *SURVIVAL analysis (Biometry), *LOGISTIC regression analysis
- Abstract
Survival analysis is a branch of statistics that deals with analyzing and modeling time-to-event data. To tackle the censoring problem, statistical strategies have been widely explored in the literature. Traditionally, many methods, including nonparametric, semiparametric, parametric and Bayesian models, have been used to deal with survival data. These methods require some assumptions, whereas many machine learning techniques (Survival Decision Tree, Survival Random Forest, Naive Bayes and Logistic Regression) are designed to handle survival data and other challenges that may emerge in real-time data. In this paper, the authors elaborate on the significance of these methods for survival data and identify important variables for the specified event in three real-time datasets. Important variables were prioritized using random forest's variable selection methods, and the results were compared to survival approaches. Supervised machine learning methods were applied, and survival analysis was used to validate them. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Comparing the performance of random forest with decision tree and logistic regression algorithm in loan default prediction.
- Author
- Kalyani, T., Vickram, A. S., and Dhanalakshmi, R.
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *DEFAULT (Finance), *REGRESSION trees, *LOGISTIC regression analysis, *REGRESSION analysis, *BANK failures
- Abstract
The primary goal of this study is to compare the performance of the Novel Random Forest (RF) algorithm, Decision Tree (DT), and Logistic Regression (LR) in forecasting loan default. The study uses a 346-record loan dataset. It assesses how well Random Forest, Decision Tree, and Logistic Regression can forecast loan defaults in the banking and finance industry. There were a total of 17 participants in each study group. The classifiers' efficacy in terms of accuracy and precision is measured and documented. On this dataset, the Logistic Regression model predicts loan default with an accuracy of 81%, while the Decision Tree model achieves 93% and the Random Forest model achieves 95%. The result (p 0.031) is statistically significant. Novel Random Forest therefore outperforms both Decision Tree and Logistic Regression, with superior accuracy and precision. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Prediction of student results using novel random forest in comparison with decision tree to improve accuracy.
- Author
- Reddy, A. Lokesh, Sathish, T., and Sangeetha, N.
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *CLASSIFICATION algorithms
- Abstract
The fundamental objective of this study is to evaluate and contrast the performance of two classification algorithms, namely Random Forest and Decision Tree. Each of the random forest and decision tree exercises had 20 participants. The academic success of a student in the past might serve as an indication of the possible future achievements the student will have. The purpose of this research is to come up with strategies that can accurately forecast how well a student will do academically. Decision trees and random forests are both now considered well-established methods for forecasting student success. Each of the research groups had a total of 72 participants. The accuracy of predictions made using this dataset ranges from 88.2 percent when utilising the time-tested Decision Tree to 91.7 percent when utilising the innovative Random Forest classifier. 0.03 is the most significant part of the fraction. The Random Forest algorithm is superior to the Decision Tree approach for this reason. The innovative Random Forest performs far better than its predecessor, the tried-and-true Decision Tree. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Medical image prediction for diagnosis of breast cancer disease comparing the machine learning algorithms: SVM, KNN, logistic regression, random forest and decision tree to measure accuracy.
- Author
- Dinesh, Paidipati, Vickram, A. S., and Kalyanasundaram, P.
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *COMPUTER-assisted image analysis (Medicine), *CANCER diagnosis, *DIAGNOSTIC imaging, *LOGISTIC regression analysis, *MACHINE learning
- Abstract
The study's primary objective is to compare the efficacy of the state-of-the-art SVM method for image prediction with that of KNN, Logistic Regression, Random Forest, and Decision Tree. The UCI Machine Learning Laboratory provides a total of 569 samples. Classifiers such as SVM, KNN, Decision Tree, Random Forest, and Logistic Regression are applied to the samples after they have been separated into benign and malignant cells so that their respective performances may be compared. G power calculation is used to determine how many samples are needed for this analysis. The maximum acceptable error is set at 0.5, and the minimum power of analysis is set at 0.8. Predictions made using Logistic Regression appear to have a higher accuracy (95%) than those made using SVM, KNN, Decision Tree, or Random Forest (92%, 90%, 85%, and 91%, respectively). This proposed system has a probability importance of 0.55. The Wisconsin dataset was used to compare Logistic Regression against SVM, KNN, Decision Tree, and Random Forest for the detection of breast cancer. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Classification of fire and smoke images using random forest algorithm in comparison with decision tree to measure accuracy, precision, recall, F-score.
- Author
- Reddy, B. Haranadh, Vickram, A. S., and Karthikeyan, P. R.
- Subjects
- *DECISION trees, *RANDOM forest algorithms, *TOBACCO smoke, *SMOKE, *FIRE detectors, *MACHINE learning, *CLASSIFICATION
- Abstract
The purpose of this research is to compare and contrast how well the random forest algorithm and the decision tree method perform in classifying fire and smoke photographs. With a g power of 0.8 and an alpha of 0.05, we find that out of a total of 4000 photos, 2000 depict fire and the remaining 2000 depict smoke. In this case, we split the dataset into a training set (n=3200, or 80%) and a validation set (n=800, or 20%). Classification tasks are executed with the help of the Sklearn machine learning package in Python. Precision, recall, f score, and accuracy numbers are some of the metrics used to measure an algorithm's effectiveness. When compared to the Decision Tree algorithm's 90.54, 90.00, 90.9, and 89.70 percent accuracy, F-score, recall, and precision, Random Forest achieved 97.42, 97.28, 97.43, and 97.15 percent values, respectively (p0.001(2-tailed)). The results of this research show that the random forest method outperforms the decision tree algorithm by a wide margin. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Financial risk prediction using decision tree with Twitter data and compare prediction accuracy with random forest.
- Author
- Sudheer Kumar Reddy, Vajrala and Priyadarsini, P. S. Uma
- Subjects
- *RANDOM forest algorithms, *DECISION trees, *FINANCIAL risk
- Abstract
The aim is to use tweets about finance to help make financial decisions. To forecast whether financial tweets are positive or negative, we use a new Decision Tree that is better at getting the right results compared to Random Forest. The proposed approach looks at 10 examples in each of two groups. We want to make sure we have enough power to detect any differences between the groups, so we aim for an 80% chance of finding a difference if one exists. We also set a significance level of 0.05, which means we want to be fairly confident in our findings. The Novel Decision Tree is more accurate (96.4%) than the Random Forest (92.7%) in predicting the sentiments of financial tweets. The Novel Decision Tree algorithm is much better than the Random Forest at guessing the feelings in financial tweets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Automatic credit card approval prediction system.
- Author
- Bhaskar, Astha, Rani, Ritu, Jaiswal, Garima, Dev, Amita, Sharma, Arun, Bansal, Poonam, and Gupta, Umesh
- Subjects
- *CREDIT cards, *SUPERVISED learning, *RANDOM forest algorithms, *BANKING industry, *DECISION trees, *MACHINE learning
- Abstract
Within the banking industry, requests for credit cards are growing tremendously, and manually reviewing each application is frequently a tiresome task that is also prone to human error. In this situation, banks and other big financial institutions can use a machine learning model to forecast whether or not to grant the customer a credit card. Banks utilize machine learning techniques to process their financial data, extract knowledge from it, and use it for risk management and decision-making. This study has created, trained, and evaluated three classification models utilizing authentic Kaggle datasets. The main research goal is to assess and contrast the models based on how accurately they project the composition of the typical class. In this work, we examine the accuracy, F1 Score, Precision, and confusion matrix of different supervised machine learning models to estimate the probability that a credit card request would be approved. After testing three classifiers, it is discovered that Random Forest outperformed Decision Tree and Logistic Regression. Random Forest's accuracy is 94.67%, precision is 0.85, recall is 0.980, and F1 Score is 2.940. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Vision-based detection of car turn signals (left/right).
- Author
- Madake, Jyoti, Wagatkar, O. M., Chaturvedi, Yashovardhan, Bhatlwande, Shripad, Shilaskar, Swati, and Vernekar, Kundan
- Subjects
- *RANDOM forest algorithms, *COMPUTER algorithms, *COMPUTER vision, *DECISION trees, *K-means clustering
- Abstract
Nowadays in India, due to the increase in the number of accidents, vehicles are being automated using a variety of computer vision algorithms. This paper focuses on the detection of tail signals of cars under different illumination circumstances. The system is implemented using the FAST and SIFT algorithms, which help to extract features from the images. The obtained features were optimized using the K-Means clustering algorithm. This huge feature vector is converted into 8 clusters. These optimized feature vectors were used to train five different classifiers: Decision Tree, SVM (RBF), Random Forest, KNN, and Voting Classifier. The training dataset contains around 9052 images. The accuracies obtained by the classifiers are as follows: Decision Tree 76.36%, SVM (RBF) 77.91%, Random Forest 87.52%, KNN 89.72%, and Voting Classifier 86.52%. It is observed that KNN gives the highest accuracy among the five classifiers. This has, to the best of the authors' knowledge, not been presented in literature before. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Machine learning-based prediction of brain cancer: A comparative analysis of supervised models.
- Author
-
Kaur, Deepinder and Singh, Jaspreet
- Subjects
- *
BRAIN cancer , *MACHINE learning , *RANDOM forest algorithms , *SCIENCE competitions , *SUPPORT vector machines , *DECISION trees - Abstract
Brain cancer is a highly lethal form of cancer with low survival rates and expensive treatment costs. Detecting tumors at an early stage is essential for improving patient outcomes, and the utilization of machine learning (ML) algorithms has demonstrated potential in augmenting the accuracy of brain tumor detection. This research aims to develop an ML model that predicts the likelihood of brain cancer by combining various factors. Four popular ML algorithms, namely Random Forest, Support Vector Machines (SVM), Gradient Boosting, and Decision Tree, are compared for their performance in this task. The dataset used in this study is obtained from Kaggle, a platform for data science competitions and projects. Among the four ML algorithms evaluated, Gradient Boosting achieved the highest accuracy of 83.9%, followed by Random Forest with 81%, SVM with 80.2%, and Decision Tree with 74.1%. To gain further insights, the research conducted feature importance analysis to identify the factors that contribute most significantly to the prediction of brain cancer likelihood. Additionally, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance and improve the model's performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
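The core idea behind SMOTE, mentioned in the abstract above, is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea, not the authors' pipeline and not a full SMOTE implementation; the function name `smote_like`, the parameter defaults, and the toy data are all assumptions made for illustration.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Made-up toy data standing in for the minority class.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like(minority, n_new=4)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.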
24. Smart health prediction using machine learning.
- Author
-
Prasad, Ch. Rajendra, Shivapriya, Pillalamarri, Bhargavi, Naragani, Ravula, Nagaraj, Sripathi, Supraja Lakshmi Devi, and Kollem, Sreedhar
- Subjects
- *
MACHINE learning , *RANDOM forest algorithms , *DECISION trees , *FORECASTING , *LOGISTIC regression analysis - Abstract
Nowadays, the health care industry plays a major role in curing the diseases that people suffer from, and the proposed system is intended as an aid to that industry. People today face many health issues related to their lifestyle and livelihood. Because of their busy schedules, they often neglect their health: they do not have time to consult doctors and find out what they are going through, which may put them at severe risk. Being aware of a condition at an early stage reduces that risk. In our proposed system we used logistic regression, a Random Forest classifier, and a Decision Tree classifier to predict the disease. Disease prediction is a supervised model used to predict diseases from the symptoms or information provided by the user. The proposed system processes the symptoms entered by the user and provides the predicted disease as output. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. Performance analysis of different machine learning algorithms for intrusion detection on KDD-CUP-99 dataset.
- Author
-
Shokeen, Arihant, Yadav, Naveen, and Sisaudia, Varsha
- Subjects
- *
MACHINE learning , *INTRUSION detection systems (Computer security) , *RANDOM forest algorithms , *COMPUTER network security , *DECISION trees , *LOGISTIC regression analysis - Abstract
Network security has emerged as an essential issue as an outcome of the Internet's broad consumption. The efficiency of anomaly-based detection systems for network intrusions (IDS) as a tool for spotting illicit traffic has risen. IDS's precision and efficiency have been enhanced with the assistance of ML methods. The performance of multiple machine learning (ML) algorithms in anomaly-based intrusion detection is compared in this paper using KDD-CUP-99 dataset. The algorithms considered include Voting, LightGBM, Decision Tree, KNN, Random Forest, AdaBoost, Naive Bayes Model, CatBoost, and Logistic Regression. The study's findings indicate that the Decision Tree, Random Forest, LightGBM, and Voting algorithms did remarkably well and exhibited high rates of accuracy, while Naive Bayes performed inadequately. This study comes to the conclusion that using ML algorithms might greatly improve the precision of knowledge-based network intrusion detection and provide a workable network security solution. [ABSTRACT FROM AUTHOR]
- Published
- 2024
26. Development of predictive model for students' final grades using machine learning techniques.
- Author
-
Rahman, Nurul Habibah Abdul, Sulaiman, Sahimel Azwal, and Ramli, Nor Azuana
- Subjects
- *
MACHINE learning , *PREDICTION models , *LITERATURE reviews , *RANDOM forest algorithms , *DECISION trees - Abstract
Predictive analytics is a new frontier sector of higher education in today's world of data science, similar to other businesses such as marketing, finance, fraud detection, and demographic trends. Predictive analytics can provide beneficial information to educators and potentially assist them in enhancing students' performance by analyzing historical data using a variety of approaches from data mining and machine learning. The e-learning practiced in today's education system unfortunately contributes to dropout rates among students. Dropouts may cause serious negative issues for the university system and its stakeholders as well. Based on the literature review, studies on machine learning and predictive analytics to improve student performance are still scarce in Malaysian higher education. Therefore, the objective of this quantitative research is to develop the best predictive model for predicting students' performance at Pahang Islamic University College using machine learning techniques such as Decision Tree, Random Forest, AdaBoost, and Gradient Boosting. Students who have taken the Business Statistics course from the years 2013 to 2021 will be the subjects of the study. Data retrieved through a Learning Management System were used. From the analysis that has been done, Random Forest is the best method to be used in the predictive model for students' final grades. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. A framework for Twitter spam detection and reporting.
- Author
-
Chandana, Anuradha, Naik, Arjun J., Kumar, Amit, and Banu, Sohara
- Subjects
- *
SOCIAL media , *SPAM email , *BOTNETS , *ONLINE social networks , *K-nearest neighbor classification , *RANDOM forest algorithms , *DECISION trees - Abstract
Twitter is an American social networking and blogging site where users can communicate through messages known as "Tweets". The number of characters is restricted to 280 for all languages except Korean, Chinese and Japanese. Users on social media platforms tend to easily believe the contents of posts related to any random events, and some of these events happen to be fake. Twitter spammers may fulfill their malicious objectives, such as spam sending, distributing malware, hosting botnet command and control (C&C) networks, and launching other illicit activities in the underground. Hence, we have proposed a system that will not only help in finding the type of spammers but also eliminate identical tweets. We have implemented multiple classifier algorithms, namely naive Bayes, K-Nearest Neighbor, Decision Tree and Random Forest, on a dataset obtained from Twitter, and their performance is compared to find the most accurate algorithm. The results of the experiment have been very positive. [ABSTRACT FROM AUTHOR]
- Published
- 2024
28. Random forests for detecting weak signals and extracting physical information: A case study of magnetic navigation.
- Author
-
Moradi, Mohammadamin, Zhai, Zheng-Meng, Nielsen, Aaron, and Lai, Ying-Cheng
- Subjects
RANDOM forest algorithms ,MACHINE learning ,DECISION trees ,ELECTRONIC systems ,INERTIAL navigation systems - Abstract
It has been recently demonstrated that two machine-learning architectures, reservoir computing and time-delayed feed-forward neural networks, can be exploited for detecting the Earth's anomaly magnetic field immersed in overwhelming complex signals for magnetic navigation in a GPS-denied environment. The accuracy of the detected anomaly field corresponds to a positioning accuracy in the range of 10–40 m. To increase the accuracy and reduce the uncertainty of weak signal detection as well as to directly obtain the position information, we exploit the machine-learning model of random forests that combines the output of multiple decision trees to give optimal values of the physical quantities of interest. In particular, from time-series data gathered from the cockpit of a flying airplane during various maneuvering stages, where strong background complex signals are caused by other elements of the Earth's magnetic field and the fields produced by the electronic systems in the cockpit, we demonstrate that the random-forest algorithm performs remarkably well in detecting the weak anomaly field and in filtering the position of the aircraft. With the aid of the conventional inertial navigation system, the positioning error can be reduced to less than 10 m. We also find that, contrary to the conventional wisdom, the classic Tolles–Lawson model for calibrating and removing the magnetic field generated by the body of the aircraft is not necessary and may even be detrimental for the success of the random-forest method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. Machine learning based university admit eligibility predictor.
- Author
-
Raman, C. J., Janani, U., Dharani, P., and Balaji, V.
- Subjects
- *
MACHINE learning , *RANDOM forest algorithms , *DECISION trees , *REGRESSION trees , *MASTER'S degree - Abstract
There are a lot of students in the modern educational system who need to pursue further education after taking an undergraduate certification course. Advanced education in the sense that some groups having an undergraduate degree in Engineering must complete their Master's degree through either GATE or CAT or any other entrance examination conducted by the individual institutes, at either the national or the international level, to gain admission. In educational institutions, the question of understudy confidentiality is crucial. In order to foresee the probability that an undergraduate would be admitted to a Master's program, we are working with AI models. This will enable students to plan ahead and determine if they will have the chance to be recognised. There are three significant machine learning models in particular: Linear regression, Decision tree regression and Random Forest regression. In this paper we will predict the admissions using the Random Forest algorithm, a well-known supervised learning model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. Occupancy detection through machine learning and environmental data.
- Author
-
Telaga, Abdi Suryadinata, 'Aisy, Salmaa Rihhadatul, and Arumi, Sekar Putri
- Subjects
- *
K-nearest neighbor classification , *SUPPORT vector machines , *MACHINE learning , *RANDOM forest algorithms , *DECISION trees , *AIR conditioning - Abstract
Occupancy detection is critical during building operations, particularly in energy efficiency, space utility, heating, ventilation control, and air conditioning (HVAC) to optimize user comfort. However, applying occupancy sensors to detect occupants can violate a person's privacy, and the resulting data could be more precise. Therefore, this study aims to detect occupancy through environmental sensors by utilizing the accuracy of machine learning-based classification results. The ultimate goal of this study is to detect whether there are people in a room based on the benchmarks studied. Benchmarks for classifying whether people are in a room include CO2, humidity, lighting, temperature, and occupancy. This paper presents a method for comparing and evaluating a set of different machine learning techniques based on a given performance measure (for example, precision, accuracy, f1 score, and algorithm recall) and then using the algorithm with the best performance. This study uses five machine learning classification algorithms: K-Nearest Neighbors, Support Vector Machine, Decision Tree, Naive Bayes and Random Forest, and the method used to evaluate the results is cross-validation. Experiments were conducted using occupancy detection datasets from the UCI Machine Learning Repository. After performing the experiment, the predictive results of the five algorithms are high. That is, each has an accuracy value of more than 90%. Support Vector Machine has the highest accuracy value compared to other algorithms, namely 98.5%, and Decision Tree has the lowest accuracy value, namely 91.9%. The results show that selecting the right features and the suitable classification model can significantly impact prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
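The abstract above evaluates its classifiers with cross-validation. As a rough sketch of how k-fold cross-validation partitions data and averages per-fold accuracy, the example below uses only the standard library. The helper names (`k_fold_indices`, `cross_val_accuracy`), the majority-class baseline "model", and the toy labels are all assumptions for illustration; they are not taken from the study.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Average accuracy over k folds for a user-supplied fit/predict pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical baseline: always predict the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


X = list(range(10))          # placeholder features (unused by the baseline)
y = [1] * 7 + [0] * 3        # made-up occupancy labels
score = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(score)
```

This sketch splits the data in its original order. Real evaluations usually shuffle or stratify first so that each fold has a similar class balance.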
31. Which algorithm is better? An implementation of normalization to predict student performance.
- Author
-
Priyambudi, Zulfikar Setyo and Nugroho, Yusuf Sulistyo
- Subjects
- *
RANDOM forest algorithms , *CLASSIFICATION algorithms , *VOCATIONAL high schools , *ALGORITHMS , *DECISION trees , *DATA mining - Abstract
This paper focuses on finding the best classification algorithm model in the case study of student performance prediction and comparing the algorithm performance before and after using the normalization method. To achieve this goal, this study uses data mining classification techniques to analyse student performance at Vocational High School in 2020-2021. The research steps include data collection, data pre-processing, building algorithm models with and without normalization, and, as the final step, comparing algorithm performance before and after normalization. The algorithms used are Random Forest, Decision Tree, Logistic Regression, SVM, Naive Bayes, and KNN. The normalization methods used are Standard Scaler, Min-Max Scaler, and Robust Scaler. The result of this research is that the normalization method is able to significantly increase the accuracy of the model. Based on the tests and evaluations carried out, normalization using the Min-Max Scaler has the biggest impact in improving overall model performance, and the algorithm with the best performance is Random Forest. This paper reviews the effect of the normalization method on algorithm performance in predicting student performance; according to the authors, previous research has not used normalization to improve model accuracy, although it has a considerable impact on it. [ABSTRACT FROM AUTHOR]
- Published
- 2024
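The three normalization methods named in the abstract above differ in which statistics they use to rescale a feature. The sketch below shows the usual textbook definitions on a single feature column using only the standard library. The function names and the sample scores are assumptions for illustration, and the quartile convention used here (`method="inclusive"`) may differ slightly from what a given library uses.

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Made-up feature column with one outlier (300.0).
scores = [50.0, 60.0, 70.0, 80.0, 300.0]
print(min_max_scale(scores))
print(standard_scale(scores))
print(robust_scale(scores))
```

Note how the outlier compresses the min-max output for the other four values, while the robust scaler, which relies on the median and quartiles, keeps them evenly spread. None of the three functions guards against a constant column, where the denominator would be zero.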
32. Performance evaluation of ensemble methods for predictive analysis: An experiment for smart cities.
- Author
-
Siddiqui, Farheen, Tariq, Aliya, and Zafar, Sherin
- Subjects
- *
SMART cities , *DECISION trees , *RANDOM forest algorithms , *EVALUATION methodology - Abstract
Ensembles of individually trained models (such as various decision trees combined together) are said to give more effective and accurate results than any of the respective individual models when tested on the same data set instances. In this paper, the authors have conducted an evaluation to test this claim, for which they trained individual classification models, namely a KNN classifier, an SVM classifier, and a decision tree classifier, and then the respective ensembles of those individual classifiers and a random forest classifier (an ensemble of various decision trees combined into a single model). For training and fitting the models, the authors used three real-world datasets and then compared the predicted results from both categories of classifier. The research indicates that the ensemble methods give more accurate predictions and outperform their individual classifier models in almost every simulation, which can support prediction-based analysis for smart cities. [ABSTRACT FROM AUTHOR]
- Published
- 2023
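One simple way to combine individually trained classifiers, as discussed in the abstract above, is hard majority voting: each model casts one vote per sample and the most common label wins. The sketch below illustrates the mechanism with made-up predictions. The three "models" and their outputs are hypothetical and were chosen so that their errors fall on different samples; they are not results from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample."""
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples where the prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 0, 1, 1, 0, 1]      # made-up ground truth
model_a = [1, 0, 0, 1, 0, 1]     # hypothetical model outputs
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
for name, pred in [("a", model_a), ("b", model_b), ("c", model_c),
                   ("ensemble", ensemble)]:
    print(name, accuracy(y_true, pred))
```

The vote helps here only because the individual mistakes do not overlap. If the models made the same errors on the same samples, voting would not correct them. With an even number of voters, ties would also need an explicit rule.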
33. Improvisation of Reddit flair detection using TF-IDF and countvectorizer.
- Author
-
Singhal, Mayank, Singhal, Nivedita, Khera, Shivansh, Upmanyu, Aakash, and Nagrath, Preeti
- Subjects
- *
APPLICATION program interfaces , *SOCIAL networks , *RANDOM forest algorithms , *DECISION trees , *SOCIAL media - Abstract
The internet has become an essential part of everyone's life. Through the internet, many tasks are made easier, like online payments. With the increase in the number of internet users, the data is also increasing. People are fond of social media platforms like Facebook, Instagram, Reddit and Twitter. Reddit, a social networking website that provides a platform for users to create posts, discussion groups, etc., generates a massive amount of data daily. This data needs to be organized and analyzed to label it and divide it into specific categories. Posts on this social networking platform are organized with the help of categories defined by Reddit and are known as "subreddits". Through this study, an effort has been made to design a model that can detect the flair (category) of a Reddit post. The dataset is collected from the Reddit Application Program Interface using the PRAW library. This requires word embedding; that is, words that have a meaning similar to each other are represented analogously. Word embedding is done using techniques like Countvectorizer and TF-IDF. For the prediction of the flairs, various algorithms are used, such as Logistic Regression, Decision Tree, Random Forest, Gaussian Naive Bayes, etc. The results of this study illustrate the importance of the proposed work. The dataset for this study is self-scraped from the Reddit API, and any instance with even a single blank column was removed. There is limited research on flair identification on the Reddit dataset, which adds to the uniqueness of this research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
34. EEG based confused mental state detection and analysis.
- Author
-
Wyawahare, Medha, Kumari, Ankita, Awale, Chinmayee, Aurangabadkar, Gayatri, and Awale, Sakshi
- Subjects
- *
BOOSTING algorithms , *MACHINE learning , *ELECTROENCEPHALOGRAPHY , *STUDENT engagement , *RANDOM forest algorithms , *DECISION trees , *AFFECTIVE neuroscience - Abstract
The state of confusion while learning can reduce the performance of students in their studies. Monitoring the engagement of students in online courses is a tedious job. This paper aims to detect the state of confusion among students watching MOOC videos based on EEG signals. EEG (electroencephalography) signals are the brain's electrical signals, which can be used to detect states such as engagement, happiness, stress and many other emotions. Machine learning is an easy and efficient method to deal with complex EEG data and analyze it. The dataset used is a combination of two datasets from Kaggle. The first is EEG data recorded from 10 students and the other consists of demographic information about the students. Various machine learning algorithms such as gradient boosting, decision tree, random forest, KNN and Naïve Bayes are used to classify the data as confused or not confused. In the comparative analysis of all the implemented algorithms, the random forest classifier proved to be the best for classification, with an accuracy of 96%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
35. A survey on the use of machine learning approaches for analysis of anemia.
- Author
-
Rane, Sakshi, Yadav, Arvind, Patel, Geetika, Gurjwar, Rajiv, Barve, Amit, and Gagan Kumar, K.
- Subjects
- *
MACHINE learning , *ANEMIA , *CHILDBEARING age , *RANDOM forest algorithms , *DECISION trees - Abstract
Anemia is the most common blood disorder, in which the blood lacks a sufficient amount of red blood cells, resulting in an insufficient amount of oxygen being supplied to the body tissues. The condition is most commonly seen in women during pregnancy and in children in the age group of 9 to 18 months, and it is not easily detected at the early stages. This survey paper describes different machine learning (ML) approaches used for the analysis of anemia. Further, it also highlights the use of ML approaches for determining the prevalence of anemia and categorizing the level/type of anemia among patients, young children, and women of reproductive age (including pregnant women). It is observed that, among the ML approaches used by researchers, the random forest (RF) and decision tree (DT) algorithms outperformed the other algorithms for the analysis of anemia. [ABSTRACT FROM AUTHOR]
- Published
- 2023
36. Human activity recognition with machine learning.
- Author
-
Kaushal, Payal, Gandhi, Vidhyotma, and Chahal, Jasmeen Kaur
- Subjects
- *
HUMAN activity recognition , *MACHINE learning , *RANDOM forest algorithms , *DECISION trees , *INTERNET of things - Abstract
The huge advancement of the Internet of Things and sensor technology encourages researchers to work on sensor-based applications such as activity recognition. Although a lot of work has been done in the last decade, there are still considerable challenges that can affect the performance of activity recognition systems in real-time scenarios. In recent decades, machine learning has confirmed its usefulness in enhancing the efficiency of such systems. Several machine learning algorithms have been investigated to address the existing issues in activity recognition systems. This study presents a real-time implementation of human activity recognition. Four machine learning algorithms (decision tree, KNN, SVM, and random forest) were applied to verify the accuracy of the system. The random forest gives the highest accuracy at 99.7%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
37. Effective comparison of logistic regression (LR) and decision tree (DT) classifier to predict enhanced employee attrition for increasing accuracy of non-numerical data.
- Author
-
Abhiraj, N. and Deepa, N.
- Subjects
- *
DECISION trees , *REGRESSION trees , *RANDOM forest algorithms , *TREE size , *SAMPLE size (Statistics) , *FORECASTING - Abstract
The aim is to predict employee attrition with improved accuracy on non-numerical data using logistic regression and a decision tree classifier. Materials and Methods: The analysis uses the Employee Attrition dataset, which contains 1470 samples. Classification of employee attrition is performed by Logistic Regression (sample size N=62) and Decision Tree (sample size N=62), with sample sizes obtained using a G-power value of 80%. Results: The accuracy rate of logistic regression is 83.26%, whereas the accuracy rate of random forest is 77.99%. The significance value for accuracy is 0.487 (p>0.05). Logistic Regression performs better in accuracy when compared to Decision Tree. [ABSTRACT FROM AUTHOR]
- Published
- 2023
38. Medical image prediction for the diagnosis of breast cancer and comparing machine learning algorithms: SVM, logistic regression, random forest and decision tree to measure accuracy of prediction.
- Author
-
Dinesh, Paidipati and Kalyanasundaram, P.
- Subjects
- *
RANDOM forest algorithms , *DECISION trees , *COMPUTER-assisted image analysis (Medicine) , *LOGISTIC regression analysis , *MACHINE learning , *CANCER diagnosis , *DIAGNOSTIC imaging - Abstract
Breast cancer is a major concern to middle-aged women across the globe, and it is now the second leading cause of cancer mortality in women. SVM, KNN, Random Forest, and Decision Trees are the primary focus of my research. Samples from the University of California, Irvine's Machine Learning Laboratory total 569. SVM, Decision Tree, Random Forest, and KNN are used to classify the samples as either benign or malignant based on the appearance of the cells. G-power calculations are used to determine the number of samples needed for this study. The analysis's minimum power is set at 0.8, while the tolerated error level is set at 0.5. Random Forest achieves a 95 percent accuracy rate, compared with 92 percent, 90 percent, and 88 percent for SVM, KNN, and Decision Tree, respectively. This system has a significance of 0.22. Random Forest outperforms SVM, Decision Tree, and KNN in the identification of breast cancer in this innovative picture prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2023
39. Development of the best personality traits for identifying the career option of students by applying different classification techniques.
- Author
-
Aggarwal, Mukul, Yadav, Neha, Sharma, Kamal Kant, Kumar, Veepin, and Pandey, Adesh Kumar
- Subjects
- *
PERSONALITY , *RANDOM forest algorithms , *PSYCHOLOGICAL typologies , *DECISION trees , *STOCHASTIC processes - Abstract
This paper addresses domains in which a large amount of data on individual behavior is available. This data can be used to classify personality traits according to the attributes collected. The proposed framework studies different types of personality attributes and groups them under five super-attributes, so that every attribute falls under one of the five resulting personality traits. The framework uses learning algorithms such as Naive Bayes, J48, Decision Tree and Random Forest, together with advanced data mining, to mine user attribute data and learn from the patterns. This learning can then be used to classify or predict a user's personality based on past classifications. In this paper we perform personality classification, and our target audience is students who want to know what kind of career path will be most appropriate for them. We collected data from students and processed it to classify their personality. This will be helpful to students who do not know which career path suits them. According to the output of this process, Random Forest achieved the highest accuracy; based on the personality traits of a particular individual, we can suggest a suitable career path. [ABSTRACT FROM AUTHOR]
- Published
- 2023
40. Machine learning based heart disease prediction system.
- Author
-
Ambika, Anju, Freeda, Adline, Venket, Krithikaa, Ram, Dinesh, Tanu, Logesh, and Kumar, Praveen
- Subjects
- *
HEART diseases , *JOB classification , *RANDOM forest algorithms , *RISK perception , *DECISION trees , *MACHINE learning - Abstract
In human life, healthcare is an unavoidable and important task. Cardiovascular disease is a set of disorders that affect the heart and blood vessels. Earlier methods of estimating the uncertainty levels of cardiovascular diseases helped in making decisions to reduce the risk in high-risk patients. This project provides a model for predicting whether a person has heart disease or not, as well as providing patient awareness or diagnosis of the risk. The prediction model is built with combinations of various features and a number of classification techniques. This is accomplished by comparing the accuracies of KNN, Decision Tree, SVM and Random Forest, and then selecting the most accurate algorithm for prediction. The objective is to improve the model's performance by deleting irrelevant and unneeded features from the dataset and keeping only those that are most beneficial for the classification task. Thus, the main focus of the system is to make use of data analytics to predict the presence and level of the disease among patients. [ABSTRACT FROM AUTHOR]
- Published
- 2023
41. Detection of ethanol quality using random forest algorithm in comparison with decision tree algorithm to measure accuracy.
- Author
-
Manikanta, Relangi and Rag, S. Adarsh
- Subjects
- *
RANDOM forest algorithms , *DECISION trees , *ETHANOL - Abstract
This article's objective is to compare the Random Forest algorithm to the Decision Tree method to assess the quality of ethanol. The Wine Quality Reds dataset used in the proposed work is taken from the UCI library and has a total sample size of 1599. The gathered samples are separated into training data (n=1199; 75% of total samples) and testing data (n=400; 25% of total samples). Alpha and power are two distinct groups that are calculated using the G power tool. Random Forest accuracy scores are computed for better ethanol quality identification. In comparison to the Decision Tree method, the accuracy of Random Forest was higher (82.21%). The model's derived significance value is 0.00 (p < 0.05), and G power is found to be 0.8. This study found that the Random Forest algorithm detects ethanol quality substantially more accurately than the Decision Tree technique. [ABSTRACT FROM AUTHOR]
- Published
- 2023
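The 75%/25% split described in the abstract above (1599 samples into 1199 for training and 400 for testing) can be reproduced with a simple shuffled partition. This is a minimal sketch under stated assumptions: the function name `train_test_split`, the fixed seed, and the use of integer row indices as stand-ins for the actual wine records are all illustrative, not taken from the paper.

```python
import random


def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and cut it into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]                 # leave the caller's list untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


rows = list(range(1599))               # placeholder for the 1599 samples
train, test = train_test_split(rows)
print(len(train), len(test))
```

Rounding the training size down gives 1199 training rows and leaves the remaining 400 for testing, matching the counts quoted in the abstract.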
42. Heart disease prediction using machine learning: A comprehensive study.
- Author
-
Odah, Suhier, Sbenaty, Batool, Ahmad, Esraa, Alhilo, Ghada, Tarawneh, Omar, and Otair, Mohammed
- Subjects
- *
HEART diseases , *MACHINE learning , *RANDOM forest algorithms , *DATA mining , *DECISION trees , *LOGISTIC regression analysis - Abstract
Heart disease is one of the diseases that cause deaths in large numbers, so predicting it is one of the most necessary tasks in the field of heart health for reducing the number of deaths by providing health care to patients; forecasting these diseases helps reduce disease-related risks and alert patients to them. In this paper, Machine Learning (ML) algorithms were applied to a set of medical data to help diagnose these diseases and predict the chances of infection, using different data mining techniques: Decision Tree (DT), Naive Bayes (NB), Random Forest (RF), and Logistic Regression (LR). The dataset comprised 615 samples of medical data from the Kaggle repository. In a comparison between these algorithms, RF had the highest accuracy at 97.40%, followed by LR with 90.24%, NB with 87.1545%, and finally DT with the lowest accuracy of 72.3577%. Other evaluation metrics, such as precision, recall, and F-measure, were also computed for each algorithm, and the accuracy findings were then compared with another paper using the same dataset and algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
43. A comparison of machine learning methods on intrusion detection systems for internet of things.
- Author
-
Widodo, Anteng, Warsito, Budi, and Wibowo, Adi
- Subjects
- *
INTERNET of things , *MACHINE learning , *RANDOM forest algorithms , *DECISION trees , *INTRUSION detection systems (Computer security) , *ALGORITHMS - Abstract
In recent years, the Internet of Things (IoT) has become prevalent and widely used. A new problem with IoT is security, which needs to be considered carefully because of the heterogeneity of the technology. Security threats can affect IoT performance; therefore, effective monitoring is necessary. This paper examines several machine learning methods for intrusion detection systems that could run on IoT. Random Forest and Decision Tree are employed in this study for performance comparison. The experimental results show that the Random Forest and Decision Tree algorithms produce good performance with a faster response time and can feasibly run on IoT. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Towards a method to predict possible obstructions in a sewage system: A case study applied in the Aburrá Valley, Colombia.
- Author
-
Laverdee, Diego Andrés Valderrama and Tabares, Marta S.
- Subjects
- *
DECISION trees , *SEWAGE , *RANDOM forest algorithms , *SUPPORT vector machines , *RAINFALL , *FEATURE selection - Abstract
This article presents a predictive model to determine possible obstructions in the sewerage system of the Valle de Aburrá metropolitan area in Colombia. The area has special characteristics: it is located in a valley surrounded by high mountains, is susceptible to rainfall almost all year round, and has a population of over 4 million inhabitants. To build the model, we apply the CRISP-DM methodology. Initially, we identified how the system determines obstructions through preventive and corrective maintenance. Subsequently, the data provided by the main assets that make up the system, especially the spillways, were understood; then, the dynamic and exogenous variables that could influence the occurrence of obstructions were selected. A case study with a dataset of 3980 records covering the years 2018 to 2021 facilitated the application of feature selection techniques to analyze the most influential variables, and the predictive model was developed to identify possible obstructions in the system. We use algorithms such as neural networks, decision trees, random forests, logistic regression, and support vector machines. The neural networks yielded an accuracy of 84%, calculated with the cross-validation method, and an F1 score of 70%. In the evaluation, the model was shown to be determined by the following variables: number of previous obstructions, density of afferent clients, previous maintenance, and the average rainfall of the last days. These results expand the decision-making capacity to define preventive maintenance and reduce corrective maintenance in the spillways that make up the sewerage system. Finally, it is concluded that the model can correctly predict the possible obstructions; however, it is necessary to deploy it in the complete system so that the prediction increases its level of accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Easy precision matrix ML based algorithm for student academic performance improvement.
- Author
-
Balamurugan, R. and Nirmala, K.
- Subjects
- *
RANDOM forest algorithms , *ACADEMIC improvement , *MACHINE learning , *ACADEMIC achievement , *VECTOR spaces , *DECISION trees - Abstract
In the study of a student's achievement, a Machine Learning method provides easy precision. If a student's semester grade is above a given level, the student has done well academically. If the student's test score is below grade 3, he or she will need some training or counselling. A machine learning algorithm is used to improve a student's performance and help them get a good mark. Five methods were used to evaluate student performance in this paper: Random Forest, SVC (Support Vector Classifier), decision tree, logistic regression, and Naive Bayes. Decision Tree has a higher level of accuracy than the other methods. It is evident from this research that pupils could be trained in the objective aspect simply by applying machine learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Mobility aid for identification of bus and auto-rikshaw for visually challenged people.
- Author
-
Bhatlawande, Shripad, Hegde, Ashish, Jain, Aryan, Jain, Naman, Shilaskar, Swati, and Madake, Jyoti
- Subjects
- *
PEOPLE with visual disabilities , *RANDOM forest algorithms , *BUSES , *PUBLIC transit , *DECISION trees , *MACHINE learning - Abstract
In this paper, a machine learning based solution for detecting public transport for visually impaired people is proposed. The proposed solution helps visually challenged people to detect cabs and buses. One of the major challenges faced by individuals with complete loss of vision is navigating around places. Lack of awareness about the public transport facilities available in their vicinity is a challenge for the visually impaired. Existing assistive aids are not directed towards solving this problem. The solution proposed in this paper employs five machine learning classifiers, namely Decision Tree, Random Forest, SVM, Gaussian Naïve Bayes, and KNN, which are trained on a custom dataset. SIFT is employed for feature detection and PCA is used for dimensionality reduction. The trained model precisely detects cabs and buses, and then provides audio and haptic feedback to the user. The distinctive points of the solution are its low power requirements and reduced latency, both of which are essential for its purpose. The maximum accuracy achieved with this approach is 98.1%, with the Random Forest classifier. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Electronic system for detection of vacant seats in public transport for visually impaired people.
- Author
-
Madake, Jyoti, Hude, Manan, Jadhav, Deepak, Bhatlawande, Shripad, and Shilaskar, Swati
- Subjects
- *
PEOPLE with visual disabilities , *PUBLIC transit , *ELECTRONIC systems , *RANDOM forest algorithms , *DECISION trees - Abstract
This paper introduces a new electronic seat detector (ESD) that detects empty seats in public transport for visually impaired persons. The main motive behind this paper is to make public transportation a convenient and suitable choice for visually impaired persons. A mundane task like vacant seat detection is challenging for a visually impaired person. They will feel empowered and comfortable if they can detect a vacant seat by themselves. In this work, a machine learning-based system is proposed to identify the vacant seat and give an audio cue of its position to the visually impaired person. Implementation is carried out with a dataset of 6000 images for training the model. The algorithm uses K-Means to form clusters of features and PCA for dimensionality reduction. Six classifiers are used in this work: Logistic Regression, KNN, Naive Bayes, SVC, Decision Tree, and Random Forest. Detection accuracy was highest for the Random Forest classifier. The observed accuracies are KNN 71.94%, Logistic Regression 61.55%, Naive Bayes 61.63%, SVC 62.54%, Decision Tree 75.74%, and Random Forest 90.09%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. An expert system for detection of household furniture for safe mobility of blind people.
- Author
-
Bhatlawande, Shripad, Katkalambekar, Varad, Singh, Khushi, Kulkarni, Chinmay, Madake, Jyoti, and Shilaskar, Swati
- Subjects
- *
HOME furnishings , *RANDOM forest algorithms , *DECISION trees , *PEOPLE with visual disabilities , *INLAND navigation , *ASSISTIVE technology , *EXPERT systems - Abstract
The goal of this paper is to develop an assistive device for the visually impaired that will allow them to move more freely in unfamiliar indoor situations by recognizing household objects that may obstruct their route. This solution focuses on safe navigation in unfamiliar interior environments. Human-assisted mobility and assistive aids are expensive, especially for people residing in developing countries like India. They are not affordable for a population with a per capita GDP of $1500. This solution aims to provide real-time movement cues that can be triggered by simple obstacle detection. Obstacle detection is intended to give a better experience while remaining pocket-friendly. The system implements KNN, Random Forest, Decision Tree, and SVM classifiers. Analysis showed that the Random Forest classifier gave the best accuracy for indoor obstacle detection. The system implements a simple guidance system that gives a tactile impulse to the user for moving in the right direction. The real-time assistive solution demonstrated 98.3% accuracy for object detection and reliable execution on a low-power device. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Staircase detection for safe mobility of visually impaired people.
- Author
-
Madake, Jyoti, Bhatlawande, Shripad, Shilaskar, Swati, Deshpande, Piyush, and Deshpande, Viraj
- Subjects
- *
PEOPLE with visual disabilities , *STAIRCASES , *RANDOM forest algorithms , *FEATURE extraction , *DECISION trees , *VISION disorders - Abstract
Vision is extremely important to human beings because it lets them perceive and interpret their surroundings simply by looking at them and their visual features. Some individuals, however, have a visual impairment and face many difficulties in their day-to-day life. They need assistance for navigation. This paper proposes a cane-based system to detect the presence of a staircase and alert the user. For the detection of stairs, a machine learning-based model is developed. In this work, SIFT is used for feature extraction and PCA is used for dimensionality reduction. Five classifier algorithms are compared for classifying the staircase. The classification accuracies are KNN 79.89%, Random Forest 95.54%, Logistic Regression 61.62%, SVM 72.29%, and Decision Tree 82.99%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Detection and classification of cracking in nonlocal nanobeams using modal data and Haar wavelets.
- Author
-
Hein, Helle and Jaanuska, Ljubov
- Subjects
- *
RANDOM forest algorithms , *K-nearest neighbor classification , *DECISION trees , *INVERSE problems , *STRAIN energy , *CLASSIFICATION - Abstract
In this paper, classification and parameter prediction problems in vibrating cracked nanobeams are considered. The size-dependent nonlocal elasticity theory developed by Eringen is applied. The crack is modelled as a massless rotational and longitudinal spring that connects the adjacent segments of the nanobeam. The spring is introduced to take into account the additional strain energy caused by the presence of a crack. The databases of modal data for inverse problems are calculated numerically. The classification and regression problems are studied using k-nearest neighbors, decision trees, and random forest algorithms. The importance of features in patterns is studied as well. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library