Descriptor: "Machine learning" / Journal: ieee access - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Machine learning"' showing total 7,063 results

1. Comprehensive Review of Privacy, Utility, and Fairness Offered by Synthetic Data

Author: A. Kiran, P. Rubini, and S. Saravana Kumar
Subjects: Artificial intelligence, machine learning, synthetic data, statistical disclosure control, differential privacy, privacy enhancing technology, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Automation is the core transformation strategy that every industry wants to get on its roadmap today. Artificial Intelligence (AI) and Machine Learning (ML) are the key components of automation. They are increasingly used in both data analysis and building predictive models from the data. Growing privacy concerns, data confidentiality, and disclosure risks have posed a challenge to the accessibility of right and meaningful data. Several privacy-preserving and disclosure-limiting techniques have come up through research. One such disclosure-limiting technique is synthetic data. Early research efforts have shown that synthetic data is an effective substitute for real data and can be used to train AI and ML models. However, this needs a comprehensive evaluation before the data user can be confident that it is indeed a good substitute for real data. In this paper, we look at three main parameters of synthetic data which should provide a holistic assessment of its quality: first, how well synthetic data preserves privacy and controls disclosure; second, how good its utility is; and third, whether it gives fair, unbiased results when used in machine learning. We review the existing literature to understand various disclosure control methods, synthetic data generators, and the associated validation methodologies and evaluation techniques. We examine how the privacy, utility, and fairness of synthetic data interact with each other and identify areas for future work.
Published: 2025

2. Metabolomics Biomarkers in Prediction of Sudden Infant Death Syndrome: The Role of Short Chain Fatty Acids

Author: Maria Aslam, Omer Riaz, Jawaria Aslam, Dost Muhammad Khan, Mustafa Hameed, Muhammad Suleman, Rizwan Shahid, Turke Althobaiti, and Naeem Ramzan
Subjects: Sudden infant death syndrome, biomarkers, machine learning, fatty acids, short chain fatty acids, healthcare, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Sudden Infant Death Syndrome (SIDS) presents a significant challenge, necessitating ongoing research and preventive measures. The intricate landscape of lipid metabolism plays a crucial role in SIDS, with disruptions in key lipid components like Short Chain Fatty Acids (SCFA), alongside other lipids such as triglycerides (TG) and phospholipids (PL), being significant. In this context, SCFA, essential products of the fermentation process by gut microbiota, hold particular interest. SCFA are integral to energy regulation and metabolism, influencing overall well-being. Their unique characteristics, such as chain length and saturation level, provide insights into their potential effects. Alterations in SCFA metabolism can disrupt energy balance, adding to the complexity of SIDS. Leveraging machine learning (ML) presents a promising avenue for unraveling the intricate profiles of SCFA and decoding patterns indicative of heightened SIDS risk. Ensuring interpretability in healthcare is essential for building trust and developing effective prevention strategies. This research delves into understanding SIDS, with a specific focus on SCFA and their role in metabolic health. The application of ML, particularly the Artificial Neural Network (ANN) and Stacking model, demonstrated exceptional accuracy of 94% and 96.15% with a recall of 100% and 92.31%, respectively. The models also demonstrated strong classification capabilities, as indicated by a high True Positive Rate (TPR) in the AUC, a low Root Mean Square Error (RMSE) of 0.20, Mean Absolute Error (MAE) of 0.04 and Standard deviation (SD) of 0.10, emphasizing the robustness and precision of the approach. These results underscore the potential of ML in the early assessment of SIDS risk, highlighting the critical role of SCFA and advancing the prospects for preventative healthcare.
Published: 2025

3. Ad Click Fraud Detection Using Machine Learning and Deep Learning Algorithms

Author: Reem A. Alzahrani, Malak Aljabri, and Rami A. Mustafa Mohammad
Subjects: Click fraud, machine learning, deep learning, online-advertising, bot detection, pay-per-click, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In online advertising, click fraud poses a significant challenge, draining budgets and threatening the industry’s integrity by redirecting funds away from legitimate advertisers. Despite ongoing efforts to combat these fraudulent practices, recent data emphasizes their widespread and persistent nature. Toward detecting click fraud effectively, this study employed a comprehensive feature engineering and extraction approach to identify subtle differences in click behavior that could be used to distinguish fraudulent from legitimate clicks. Subsequently, a thorough evaluation was conducted involving nine diverse machine learning (ML) and Deep Learning (DL) models. After Recursive Feature Elimination (RFE), the ML models consistently demonstrated robust performance. DT and RF surpassed 98.99% accuracy, while GB, LightGBM, and XGBoost achieved 98.90% or higher. Precision scores, measuring accurate identification of fraudulent clicks, exceeded 98% for models like ANN. In parallel, deep learning (DL) models, including Convolutional Neural Network (CNN), Deep Neural Network (DNN), and Recurrent Neural Network (RNN), showcased strong performance. RNN, in particular, achieved 97.34% accuracy, emphasizing its efficacy. The study underscores the prowess of tree-based methods and advanced algorithms in detecting click fraud, as evidenced by high accuracy, precision, and recall scores. These findings contribute valuable insights to combat click fraud and establish the groundwork for the strategic development of anti-fraud measures in online advertising.
Published: 2025

4. Machine Learning-Integrated Usability Evaluation and Monitoring of Human Activities for Individuals With Special Needs During Hajj and Umrah

Author: Ghadah Naif Alwakid
Subjects: Machine learning, usability evaluation, monitoring, human activities, anomaly detection, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Human activity recognition (HAR) is an important aspect of the safety and accessibility of individuals with disabilities. This is especially essential during large-scale events like Hajj and Umrah, which attract millions of participants each year. These gatherings pose significant challenges for people with disabilities and affect both their mobility and security. To address these issues, this study introduces a new approach to improve usability and tracking for disabled pilgrims performing the two holy pilgrimages: Hajj and Umrah, which are performed annually by millions of Muslims. These acts of worship are still very challenging for the mobility, security, and accessibility of people with special needs. Using clustering, anomaly detection, and predictive modeling, it was intended to enhance the safety and security of sensitive participants. Using the K-Means algorithm and the Elbow Method as the initial indicators to classify the clusters, we reveal four clusters that relate to different human activities that are further visualized through principal component analysis (PCA). Activities are clustered based on the behavioral patterns observed among participants, followed by the performance of anomaly detection. The analysis reveals that the string ‘WALKING_DOWNSTAIRS’ represents a sensitive and rare event in the training data set with a count of 36, which indicates possible walking disabilities that human beings may experience. The proposed study used two machine learning models, i.e., random forest and sequential neural networks, both with 93% accuracy. The performance was fairly accurate, with RF scoring 1 for ‘LAYING’ and SNN scoring 0.99 for ‘WALKING_UPSTAIRS’. The usefulness of this study is to improve the safety of the disabled participants in performing Hajj and Umrah. Besides extending the knowledge base and technical development of HAR and machine learning, this work has implications for specific real-life problems of accessibility and security in large-scale religious gatherings.
Published: 2025

5. Model-Oriented Training of Coordinators of the Decentralized Control System of Technological Facilities With Resource Interaction

Author: Volodymyr M. Dubovoi, Maria S. Yukhimchuk, Viacheslav V. Kovtun, and Krzysztof R. Grochla
Subjects: Machine learning, simulation model, distributed control system, decentralized coordination, model-based learning, collaborative federated learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The control process of technological facilities with resource interaction in a decentralized system requires coordination of local systems for control of the state of objects. For the implementation of coordination methods, learning systems have an advantage since they can flexibly adapt to the specifics of each facility control. However, the coordinators’ training process is complicated by the lack of labelled datasets for technological facilities. In decentralized control systems, the problem is complicated by the need to train all coordinators, with the outcome depending on the coordinator’s position within the structure of the distributed control system. This article explores the prospects of model-based learning for solving the problem of missing datasets used for coordinators’ training. An approach to determining the optimal statistics of the training dataset for the coordination control of nonlinear technological facilities with resource interaction is proposed. A combined three-stage process of coordinator training for the decentralized system is proposed. In the first stage, one coordinator is trained on the basis of a distributed system simulation. In the second stage, the settings of the trained coordinator are applied to other coordinators, which are retrained in parallel on the basis of simulation models of local control systems of the relevant parts of the technological facilities. In the third stage, coordinators are fine-tuned to real conditions using Bayesian random search. Conducted experimental studies of the proposed method of training neural network coordinators, implemented on Python TensorFlow, showed greater effectiveness of Collaborative Federated Learning compared to independent training of coordinators or direct transfer of learning outcomes between coordinators.
Published: 2025

6. Imbalanced Data Problem in Machine Learning: A Review

Author: Manahel Altalhan, Abdulmohsen Algarni, and Monia Turki-Hadj Alouane
Subjects: Imbalanced data, machine learning, balance techniques, evaluation methods, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: One of the prominent challenges encountered in real-world data is an imbalance, characterized by unequal distribution of observations across different target classes, which complicates achieving accurate model classifications. This survey delves into various machine learning techniques developed to address the difficulties posed by imbalanced data. It discusses data-level methods such as oversampling and undersampling, algorithm-level solutions including ensemble learning and specific algorithm adjustments, cost-sensitive algorithms, and hybrid strategies that combine multiple approaches. Moreover, this paper emphasizes the crucial role of evaluation methods like Precision, F1 Score, Recall, G-mean, and AUC in measuring the effectiveness of these strategies under imbalanced conditions. A detailed review of recent research articles helps pinpoint persistent gaps in generalizability, scalability, and robustness across these methods, underscoring the necessity for ongoing improvements. The survey seeks to offer an extensive overview of current approaches that improve the efficiency and effectiveness of machine learning models dealing with imbalanced datasets, thus equipping researchers with the insights needed to develop robust and effective models ready for real-world application.
Published: 2025

7. Enhancing Indoor mmWave Communication With ML-Based Propagation Models

Author: Gustavo Adulfo Lopez-Ramirez and Alejandro Aragon-Zavala
Subjects: 5G, mmWave, path loss, wireless communications, indoor propagation modeling, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: With the advancement of 5G and emerging wireless communication technologies, accurate modeling of wave propagation in indoor environments has become increasingly crucial. This study focuses on demonstrating how machine learning (ML) techniques can be applied to predict path loss within the millimeter wave (mmWave) spectrum in a specific indoor environment. We address high-frequency challenges like path loss and complex building layouts that impact signal propagation. We employ various ML models, including Artificial Neural Networks (ANNs), hybrid models integrating linear regression, ANNs, and Gaussian Processes, and Extreme Gradient Boosting (XGBoost), to predict and analyze the propagation loss in a controlled indoor setting. The models were trained and validated using data collected from a comprehensive measurement campaign at 28 GHz, which involved high precision radio equipment in a complex indoor environment. Our results demonstrate that while traditional models provide a baseline for understanding path loss, advanced ML models, particularly hybrid approaches, significantly enhance prediction accuracy and provide a deeper understanding of indoor propagation dynamics within this specific environment. The study highlights the potential of ML in overcoming the limitations of empirical models and showcases methodologies that can be adapted for similar indoor scenarios. This research advances our understanding of mmWave propagation indoors and sets a framework for utilizing ML in telecommunication system design and optimization in specific environments.
Published: 2025

8. Lithium-Ion Battery State of Health Degradation Prediction Using Deep Learning Approaches

Author: Talal Alharbi, Muhammad Umair, and Abdulelah Alharbi
Subjects: Lithium-ion batteries, deep learning, State of Health (SoH), electric vehicle, battery state, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Timely prediction of the State of Health (SoH) of lithium-ion batteries is important for battery management and longevity. Traditional centralized deep learning models have shown promising results, but they raise concerns related to data privacy, as data must be collected and used for training on a single node. This study addresses this challenge by utilizing both centralized (i.e., deep learning) and decentralized (i.e., federated learning) approaches for SoH prediction. The NASA battery dataset, containing charging and discharging cycles, is used for model training and evaluation. Three deep learning architectures are used in the centralized approach: 1D Convolutional Neural Networks (CNN), CNN plus Long Short-Term Memory (LSTM), and CNN plus Gated Recurrent Units (GRU). The 1D CNN model outperforms the others, demonstrating strong predictive capabilities; thus, for decentralized learning (i.e., federated learning), the 1D CNN model is utilized with the federated averaging technique across five clients, allowing for local training without sharing raw data. The obtained results show that the highest testing RMSE (0.666) and MAPE (0.980) are observed during decentralized learning, while the centralized approach shows varying performance across different batteries. The decentralized approach effectively balances performance and privacy, highlighting the reliability of federated learning in SoH prediction for lithium-ion batteries.
Published: 2025

9. Targeted Discrepancy Attacks: Crafting Selective Adversarial Examples in Graph Neural Networks

Author: Hyun Kwon and Jang-Woon Baek
Subjects: Graph neural network, adversarial example, evasion attack, node classification, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In this study, we present a novel approach to adversarial attacks for graph neural networks (GNNs), specifically addressing the unique challenges posed by graphical data. Unlike traditional adversarial attacks, which aim to perturb the input data to induce misclassifications in the target model, our approach strategically crafts adversarial examples to exploit discrepancies in model behavior. We introduce the concept of selective adversarial examples, which are instances that are correctly classified by a “friendly” model but misclassified by an “adversary” model. To achieve this, we propose a novel loss function formulation that simultaneously maximizes the probability of correct classification using a friendly model and minimizes the probability of correct classification using an adversary model. This approach facilitates the generation of adversarial examples that are both subtle and effective, necessitating minimal perturbations in the input graph. We systematically explain the principles and structure of our method and evaluate its performance through experiments conducted on a GNN using the Reddit, ogbn-product, and Citeseer datasets. Our results demonstrate the effectiveness of the proposed approach in generating selective adversarial examples, highlighting its potential applications in military environments, where the ability to selectively target adversary models is crucial. In addition, we provide visualizations of graph adversarial examples to aid in understanding the nature of the attacks. Overall, our contributions are threefold: First, we pioneer the concept of selective adversarial examples within the graph domain. Second, we provide comprehensive insights into the systematic generation and evaluation of these examples. Third, we furnish empirical evidence demonstrating their effectiveness in compromising the robustness of models.
Published: 2025

10. Dust Storm Attenuation Prediction Using a Hybrid Machine Learning Model Based on Measurements in Sudan

Author: Elfatih A. A. Elsheikh, E. I. Eltahir, Abdulkadir Tasdelen, Mosab Hamdan, Md Rafiqul Islam, Mohamed Hadi Habaebi, and Aisha H. Abdullah Hashim
Subjects: Dust storm attenuation, microwave propagation, meteorological parameters, terrestrial communication, machine learning, XGBoost, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Sand and dust storms significantly challenge microwave and millimeter-wave communications, particularly in arid and semi-arid regions. Various models have been developed to predict attenuation caused by these storms theoretically and empirically based on two meteorological parameters, namely visibility and humidity. However, these models are found unable to predict most of the attenuation measurements. This study presents a hybrid Machine Learning (ML) model that predicts dust storm attenuation for 22 GHz terrestrial links using meteorological data. The received signal levels were measured for a 22 GHz link over a month in Khartoum, Sudan. The visibility, humidity, atmospheric pressure, temperature and wind speed were also monitored simultaneously by an Automatic Weather Station (AWS). The proposed model incorporates XGBoost for feature selection and combines Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers to capture both short-term and long-term dependencies in meteorological data. The results demonstrate a strong correlation between meteorological parameters and dust storm attenuation. The model’s performance is validated against the measured data at 22 GHz, outperforming existing empirical and theoretical models. The RMSE for the proposed model is 0.07, while the RMSE of every existing theoretical and empirical model is higher than 0.25. Furthermore, the proposed model demonstrates significant enhancements over the available ML model for dust attenuation prediction. This hybrid ML approach offers a more accurate and robust solution for predicting microwave and millimeter-wave attenuation during dust storms, enhancing the reliability of communication systems in affected regions.
Published: 2025

11. XAI-Enhanced Machine Learning for Obesity Risk Classification: A Stacking Approach With LIME Explanations

Author: Mohammad Azad, Md Faraz Kabir Khan, and Sameh Abd El-Ghany
Subjects: Obesity, machine learning, stacking, explainable AI, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Obesity remains a critical global health challenge, necessitating early risk assessment to guide preventive measures and mitigate potential complications. While various research endeavors have explored obesity classification, many existing approaches lack reliability due to limited integration with explainable artificial intelligence (XAI) methodologies. In this study, we propose a robust machine learning framework that incorporates Explainable AI (XAI) principles to accurately estimate obesity levels and provide insights into the factors influencing the predictions. We utilize the publicly available dataset from Palechor and Manotas available in the UCI ML repository which contains relevant information on individuals’ physical characteristics and behaviors. Our proposed model employs an ensemble approach, specifically a stacking algorithm, where the base estimators include the Light Gradient Boosting Machine (LGBM) classifier, the Logistic Regression (LR) classifier, and the Random Forest (RF) Classifier, and the Stochastic Gradient Descent (SGD) classifier is selected as the final estimator. To enhance model interpretability and reliability, we integrate a widely accepted XAI method, Local Interpretable Model-agnostic Explanations (LIME). Our proposed framework achieves a peak accuracy of 98.82%, surpassing most existing techniques. By incorporating LIME, we not only enhance model trustworthiness but also provide deeper insights into the factors driving obesity risk. Overall, our approach contributes to advancing personalized interventions and bridging the gap between model complexity and human understanding.
Published: 2025

12. Path Planning for Fully Autonomous UAVs-A Taxonomic Review and Future Perspectives

Author: Geeta Sharma, Sanjeev Jain, and Radhe Shyam Sharma
Subjects: Path planning, deep learning, deep reinforcement learning, heuristic approaches, machine learning, unmanned aerial vehicle (UAV), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Autonomous Unmanned Aerial Vehicles (UAVs) rely on advanced path planning to operate independently, especially in unfamiliar settings without human intervention. The process typically involves localization, mapping, optimal path selection, motion planning, and control. Achieving autonomous navigation from one point to another requires balancing various factors, such as energy efficiency, speed, cost, path length, and computation time. In this study, a novel taxonomy that categorizes UAV path planning into four distinct classifications is introduced. A systematic review of the technological advancements in path planning methodologies, tracking the evolution from classical techniques to cutting-edge solutions, with a particular focus on dynamic environments, is also presented. This review includes a comparative analysis of various methods based on factors including approach utilized, testing platform, environment type, time efficiency, etc. Various relevant parameters and benchmark datasets that are crucial for UAV path planning are also explored. Despite widespread use, current methodologies still face significant challenges, such as handling unknown threats in dynamic environments, effective obstacle avoidance, limited payload capacity, real-time responsiveness, and energy consumption. These issues limit their overall usefulness. This study aims to highlight these challenges and suggest potential directions for future research, ultimately contributing to the advancement of real-time UAV path planning.
Published: 2025

13. Enhancing Credit Risk Decision-Making in Supply Chain Finance With Interpretable Machine Learning Model

Author: Guanglan Zhou and Shiru Wang
Subjects: Supply chain finance, sustainable, credit risk, XGBoost, SHAP, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The increasing complexity of supply chain finance poses significant challenges to effective credit risk assessment. Traditional black-box models often fail to provide insights into the factors driving credit risk, which is essential for stakeholders when making informed decisions. By conducting analysis of interpretable machine learning models, the study evaluated their performance in assessing credit risks. Specifically, we applied Extreme Gradient Boosting (XGBoost), Random Forest (RF), Least Squares Support Vector Machine (LSSVM) and Convolutional Neural Network (CNN) models for risk assessment. Our methodology included an ablation experiment along with utilizing Shapley Additive Explanation (SHAP) to elucidate the contribution and significance of specific risk factors. The results indicated that the asset-liability ratio, cash ratio, and quick ratio notably influence credit risk. This study clarified the applicability and limitations of various models, highlighting the superior performance and interpretability of XGBoost through the SHAP algorithm. Ultimately, the insights from this study provided valuable guidance for companies and financial institutions, fostering more sustainable allocation of financial resources.
Published: 2025

14. A Novel Pattern Recognition Method for Self-Powered TENG Sensor Embedded to the Robotic Hand

Author: Azat Balapan, Rauan Yeralkhan, Alikhan Aryslanov, Gulnur Kalimuldina, and Azamat Yeshmukhametov
Subjects: Dataset collection, machine learning, robot hand design, signal processing, TENG sensor, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This paper presents the development and implementation of a human-like robotic hand integrated with advanced triboelectric nanogenerator (TENG) based tactile sensors for shape and material recognition. Traditional piezo sensors have limited effectiveness, are sensitive to temperature, and are costly to manufacture. TENG sensors offer a self-powered alternative with simplified circuitry, cost-effective fabrication, and enhanced durability. To capitalize on these benefits, we propose a novel machine learning approach that represents time-series data as two-dimensional images processed using a two-dimensional convolutional neural network (2D CNN). This method is compared against the traditional one-dimensional convolutional neural network (1D CNN) method. The research methodology encompasses TENG sensor preparation, noise cancellation, robotic hand design, and control electronics. Experimental results demonstrate that the proposed 2D CNN method significantly improves shape and material recognition accuracy, achieving 98% and 99%, respectively, compared to 94% and 98% with the 1D CNN method. Real-time evaluation further validates the robustness and adaptability of the proposed model in unstructured environments. These findings underscore the potential of integrating TENG sensors with advanced neural network architectures for autonomous dexterous manipulation in various industrial applications, paving the way for future advancements in robotic tactile sensing.
Published: 2025

15. RFMVDA: An Enhanced Deep Learning Approach for Customer Behavior Classification in E-Commerce Environments

Author: Kwanhee Kim, Mingyu Jo, Ilkyeun Ra, and Sangoh Park
Subjects: Customer segmentation, customer classification, machine learning, deep neural network (DNN), customer data platform (CDP), customer relationship management (CRM), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Customer Relationship Management (CRM) systems, widely used in enterprises, have evolved into Software-as-a-Service (SaaS) platforms. With the advent of Customer Data Platforms (CDP), these systems continuously store customer behavior data for purposes such as creating single customer profiles, analyzing, tracking, and managing customer interactions from various perspectives. With the global expansion of the e-commerce market, research on customer analysis and classification optimized for the e-commerce environment has been actively conducted. The RFM (Recency, Frequency, Monetary) model is a straightforward method for classifying customers and is applied across various industries. However, in the e-commerce environment, where customers can access services at any time, there are limitations in collecting, storing, and reflecting customer behavior data for classification. To resolve these limitations, this paper proposes the RFMVDA (Recency, Frequency, Monetary, Visits, Durations, Actions) model. This model is designed to capture customer data, sessions, and behavior units suitable for the e-commerce environment. By utilizing the RFMVDA model for customer behavior-based segmentation and classification, we constructed a Deep Neural Network (DNN) to predict customer behavior-based classifications. As a result, the proposed model demonstrated a segmentation prediction accuracy of 92.98% for customers in the e-commerce environment.
Published: 2025

16. Machine Learning-Based Detection of Anomalies, Intrusions, and Threats in Industrial Control Systems

Author: Denis Benka, Dusan Horvath, Lukas Spendla, Gabriel Gaspar, and Maximilian Stremy
Subjects: Anomaly detection, intrusion detection systems, machine learning, threat detection, programmable logic controllers, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Industrial Control Systems (ICS) are critical to the efficient operation of essential sectors such as manufacturing, energy, and water management. However, their increasing integration with IT systems exposes them to sophisticated cyberattacks, particularly lateral attacks targeting Programmable Logic Controllers (PLCs). Advanced preventive measures are necessary because, despite their significance, many ICS continue to rely on outdated technologies with few security features. This paper proposes a machine learning (ML)-based approach to anomaly detection in ICS communication networks, focusing on techniques such as 1D Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, Support Vector Machines (SVMs), and Isolation Forest (iForest) algorithms. We generated a dataset by capturing both normal and manipulated ICS communication patterns, including TCP/IP traffic. Simulated lateral attacks provided realistic data for training and testing the ML models. The results demonstrate that the 1D CNN model achieved the highest accuracy (0.92) and F1 score (0.91) with minimal processing time, making it ideal for real-time intrusion detection. This research highlights the potential of ML techniques to fortify ICS cybersecurity and lays the groundwork for future advancements in critical infrastructure resilience.
Published: 2025
Full Text: View/download PDF
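
For readers unfamiliar with the F1 score reported in the abstract above, the following minimal sketch computes it from true positive, false positive, and false negative counts. The function name and the sample counts are hypothetical; they do not come from the paper's dataset.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 attacks caught, 2 false alarms, 2 attacks missed.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```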

17. Comprehensive Analysis of Masking Techniques in Molecular Graph Representation Learning

Author: Bonyou Koo and Sunyoung Kwon
Subjects: Graph neural network, masking, molecular graph, representation learning, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Molecule representation learning is a primary area of focus in drug discovery and molecular property prediction. In previous studies, molecules have been modeled as graphs, enabling graph neural networks (GNNs) to capture essential structural information. Recent approaches have enhanced molecular representations by introducing advanced masking strategies, such as extending granularity from nodes to subgraphs, shifting masking locations, and applying masking during downstream tasks. However, comprehensive analyses of these strategies remain limited. In this study, we systematically evaluate masking techniques across various phases, granularities, locations, feature types, and ratios. Our findings reveal that node feature masking during pre-training achieves high performance, while rich features may reduce gains, and the commonly used 25% masking ratio is not universally optimal, with alternative ratios performing better depending on the dataset. Our study provides deeper insights into the benefits of masking techniques in molecular graphs and highlights their potential to improve semantic understanding and predictive accuracy in graph-based learning.
Published: 2025
Full Text: View/download PDF

18. Smell-ML: A Machine Learning Framework for Detecting Rarely Studied Code Smells

Author: Esraa Hamouda, Abeer El-Korany, and Soha Makady
Subjects: Code smell detection, machine learning, data balancing, ensemble learning, multi-level classification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Code smells are design flaws that reduce software quality and maintainability. Machine learning classification models have been used to detect different code smells. However, such studies examined some code smells in depth while leaving others under-explored, even though those smells have a significant impact on source code quality. Recent surveys have highlighted a group of code smells that has rarely been studied by researchers. Furthermore, some machine learning classification models were evaluated on a subset of the source code features while ignoring significant features during classification. This paper proposes a novel approach, called Smell-ML, for detecting five rarely studied code smells: Middle Man (MM), Class Data Should Be Private (CDSBP), Inappropriate Intimacy (II), Refused Bequest (RB), and Speculative Generality (SG). The novelty of this approach stems from the improvement in both the data preparation and classification phases. During data preparation, Smell-ML relies on data balancing and an extended source code feature list to improve accuracy. In the classification phase, different classifiers were assessed, including traditional, ensemble, and multi-level classifiers. We evaluated Smell-ML on a dataset composed of 13 open source Java projects with 125 versions per project. The results show that Smell-ML’s detection F1-score values surpass those of previous studies with significant improvements across various code smells. The F1-score measure of the 11 machine learning classifiers improved after using the extended feature list. Data balancing and multi-level classification notably boosted accuracy.
Published: 2025
Full Text: View/download PDF

19. Analyzing the Efficacy of Computer-Aided Detection in Cerebral Aneurysm Diagnosis Using MRI Modality: A Review

Author: Keerthi A. S. Pillai, Preena K. P., and Madhu S. Nair
Subjects: Cerebral aneurysms, computer aided detection, magnetic resonance imaging, machine learning, deep learning, convolutional neural networks, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Computer-aided detection (CAD) models play a critical role in the clinical diagnosis of cerebral aneurysms, significantly contributing to the reduction of mortality rates associated with this condition. This article provides a comprehensive overview of the evolution of CAD models for aneurysm detection, with a particular focus on MRI modalities. It explores the motivations behind CAD systems, the methodologies employed, and their respective advantages and limitations, offering valuable insights into the current state-of-the-art (SOTA) CAD systems. The research papers selected for this review focus on research utilizing TOF MRA as the imaging modality and emphasize computer-aided detection through both traditional and deep learning techniques, with a particular emphasis on Convolutional Neural Networks (CNNs). CNNs have proven to be a crucial component in improving the accuracy and efficiency of aneurysm detection by automatically learning features from raw imaging data, bypassing the need for manual feature extraction. The article also presents a detailed experimental analysis of deep learning models, benchmarked using TOF MRA datasets. Key research gaps are identified, including the need for large training samples, challenges in Maximum Intensity Projection (MIP) imaging, limitations of 2D architectures, and issues related to overfitting and computational complexity. The review also observes that shallow networks and pretrained models are effective in addressing these challenges. In addition to identifying these gaps, the review outlines future directions for the development of CAD systems, aiming to further advance CAD models for aneurysm detection.
Published: 2025
Full Text: View/download PDF

20. Depression Detection in Social Media: A Comprehensive Review of Machine Learning and Deep Learning Techniques

Author: Waleed Bin Tahir, Shah Khalid, Sulaiman Almutairi, Mohammed Abohashrh, Sufyan Ali Memon, and Jawad Khan
Subjects: Deep learning, depression detection, machine learning, natural language processing, sentiment analysis, social media, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Depression is a widespread mental health disorder that may remain undiagnosed by conventional clinical methods. The rapidly growing world of social media sites such as Twitter, Reddit, Facebook, Instagram, and Weibo has provided new avenues for depression detection using Machine Learning (ML) as well as Deep Learning (DL), which analyze user behavior patterns and linguistic cues for more accurate detection of depression. Many techniques have been developed for this aim over the years. Identifying relevant publications on this topic using current academic search systems is challenging due to the rapid growth of research publications, unclear or limited search terms, and the complexity of citation networks. Several review papers have been published to ease this task by summarizing the methodologies, key findings, and recommendations for future research. However, most current reviews often do not provide a clear overview of the evolution, latest techniques, and challenges. This paper aims to address that gap by providing a comprehensive review of ML and DL methodologies for detecting depression on social media. We propose a generic architecture for these systems and present a detailed analysis of methodologies and datasets used for evaluation in this field. In addition, we highlight key open research areas, providing a useful starting point for further research and development. By narrowing our focus to social media, this review contributes to advancing the understanding and application of cutting-edge methods for depression detection. While this review highlights advancements in social media-based depression detection, it excludes alternative approaches like graph-based systems and reinforcement learning, and its focus on social media may limit its applicability to other domains.
Published: 2025
Full Text: View/download PDF

21. A Comprehensive Review of AI’s Current Impact and Future Prospects in Cybersecurity

Author: Abdullah Al Siam, Moutaz Alazab, Albara Awajan, and Nuruzzaman Faruqui
Subjects: Artificial intelligence, cyber security, cyberattack, machine learning, deep learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The use of artificial intelligence (AI) technology signifies a significant milestone in the swiftly evolving domain of cybersecurity. This study offers a comprehensive literature review on the role, effect, and future prospects of AI across five critical areas of cybersecurity: threat detection, endpoint security, phishing and fraud detection, network security, and adaptive authentication. The study examines contemporary developments in AI for cybersecurity, highlighting the use of these technologies to enhance security protocols. We examine cutting-edge AI methodologies and principal models across many domains, including machine learning algorithms, deep learning architectures, natural language processing techniques, and anomaly detection algorithms, emphasizing their distinct contributions to enhancing security. Essential comparisons of AI models are presented for each area, outlining their main applications, advantages, and drawbacks. The article examines the assessment criteria and performance outcomes of AI-driven cybersecurity solutions. This report synthesizes previous research while identifying gaps and future prospects, including the integration of emerging AI approaches, the enhancement of real-time threat detection capabilities, and the addressing of changing attack vectors. By providing a holistic view of the current state and future potential of AI in cybersecurity, this paper aims to serve as a foundational reference for researchers and practitioners seeking to leverage AI for robust and adaptive security solutions.
Published: 2025
Full Text: View/download PDF

22. Low-Cost Driver Monitoring System Using Deep Learning

Author: Hady A. Khalil, Sherif A. Hammad, Hossam E. Abd El Munim, and Shady A. Maged
Subjects: Deep learning, machine learning, AI, Raspberry Pi, driver monitoring system, tinyML, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Driver monitoring systems are becoming an essential part of Advanced Driver Assistance Systems (ADAS) safety features in modern vehicles. The U.S. National Highway Traffic Safety Administration reports that drowsy/fatigued driving results in almost 100,000 road accidents per year. Driver fatigue can have different causes, such as lack of sleep, long journeys, restlessness, mental pressure and alcohol consumption. Early monitoring systems relied on data from vehicle sensors, and modern systems commonly use driver eye tracking. Recently, there has been growing interest in utilizing machine vision and deep learning for driver monitoring. Using machine vision can create more advanced driver monitoring systems capable of detecting driver attention state as well as other features like smartphone usage while driving and seat belt use. Machine vision systems usually require extensive processing power, which raises the cost of such systems. In this paper, we present a low-cost driver monitoring system using a $15 Raspberry Pi Zero 2 W board and a deep learning CNN to deliver a system capable of monitoring and identifying different states of the driver, such as safe driving, distracted, drowsy, and smartphone usage. The system achieves an inference rate of 10 Frames Per Second (FPS) and above 90% accuracy on the testing dataset. In addition to the deep learning CNN, which runs on the Raspberry Pi CPU, we utilize the Raspberry Pi GPU to run a head pose estimation algorithm to boost the system’s accuracy.
Published: 2025
Full Text: View/download PDF

23. Leveraging Simplex Gradient Variance and Bias Reduction for Black-Box Optimization of Noisy and Costly Functions

Author: Mircea-Bogdan Radac and Titus Nicolae
Subjects: Black-box, control systems, deep learning, finite differences, machine learning, optimization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Gradient variance errors in gradient-based search methods are largely mitigated using momentum; however, gradient bias errors may prevent numerical search methods from reaching the true optimum. We investigate the reduction in both bias and variance errors attributed to the simplex gradient estimated from noisy function measurements, as compared with the finite-difference gradient, when both are used for black-box optimization methods. Regardless of the simplex orientation, while reducing the gradient bias error owing to several factors such as truncation, numerical or measurement noise, we claim and verify that, under relaxed assumptions about the underlying function’s differentiability, the gradient estimated by the simplex method has at most half the variance of the finite-difference gradient. The findings are validated with two comprehensive and representative case studies, one related to the minimization of a nonlinear feedback control system cost function and the second related to a deep machine learning classification problem whose hyperparameters are tuned. We conclude that in up to medium-size practical black-box optimization problems with unknown variable domains and where the noisy function measurements are expensive, a simplex gradient-based search is an attractive option.
Published: 2025
Full Text: View/download PDF

24. A Cloud-Based Optimized Ensemble Model for Risk Prediction of Diabetic Progression—An Azure Machine Learning Perspective

Author: V. K. Daliya and T. K. Ramesh
Subjects: Diabetic prediction, ensemble learning, KNN, LightGBM, Machine learning, voting classifier, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The application of Machine Learning for predictive analysis in healthcare, particularly for diseases like diabetes, has proven highly beneficial. This study introduces an optimized Light Gradient-Boosting Machine (LightGBM) and K-Nearest Neighbour (KNN) based ensemble algorithm for predicting diabetic progression of Type 2 Diabetes, classifying it as high or low risk, using patient health parameters and serum measurements. Our model uses LightGBM, a rapid and efficient gradient boosting framework, coupled with KNN, which uses proximity to classify data points. The proposed model uses various optimization techniques, such as 10-fold cross-validation and grid search, to get the best results out of the ensemble model. Because the model combines optimized versions of LightGBM and KNN through a voting classifier that uses soft voting to determine the final class, it utilizes the predictive capabilities of both methods effectively. The experiment is implemented in Microsoft’s Azure cloud using the Azure Machine Learning service, which leverages the advantages of cloud computing with respect to scalability, security, and potential integration into IoT-based smart healthcare systems. This aspect highlights its versatility and impact with respect to remote monitoring of patients as well. The ensemble achieves an 83.2% Area Under the Curve (AUC) of Receiver Operating Characteristics (ROC) score, indicating good classification efficiency. It also produced 75% accuracy. The proposed model is compared with other classification and ensemble models, showcasing its superiority over them. The ensemble is also tested with some metaheuristic optimization methods, which produced comparable scores. The method’s effectiveness is validated against another risk prediction dataset, proving its reliability. The model’s accurate predictions can aid individuals in understanding disease progression risks and guide medical professionals in intervention strategies.
Published: 2025
Full Text: View/download PDF
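
The soft voting step mentioned in the abstract above can be sketched as follows: each model outputs class probabilities, the probabilities are averaged, and the class with the highest average wins. This is a generic illustration only; the `soft_vote` helper, the optional weights, and the sample probabilities are assumptions and do not reproduce the paper's tuned LightGBM and KNN models.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities from several models.

    prob_lists holds one probability vector per model for a single sample.
    Returns (winning_class_index, averaged_probabilities).
    """
    weights = weights or [1.0] * len(prob_lists)
    total_weight = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[k] for w, probs in zip(weights, prob_lists)) / total_weight
        for k in range(n_classes)
    ]
    winner = max(range(n_classes), key=averaged.__getitem__)
    return winner, averaged


# Made-up outputs of two models for classes [low risk, high risk].
label, averaged = soft_vote([[0.6, 0.4], [0.2, 0.8]])
```

In this made-up case the first model leans toward low risk and the second toward high risk; averaging lets the more confident model decide, which is the main difference from hard (majority) voting.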

25. Integrating Multilayer Perceptron and Support Vector Regression for Enhanced State of Health Estimation in Lithium-Ion Batteries

Author: Sadiqa Jafari, Jisoo Kim, Wonil Choi, and Yung-Cheol Byun
Subjects: Lithium-ion battery, SOH, optimization algorithms, ensemble learning, machine learning, battery performance, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Accurately evaluating the State of Health (SOH) of batteries is crucial for guaranteeing the secure and dependable functioning of Electric Vehicles (EVs). This paper presents a novel strategy for tackling the difficulties associated with intricate preprocessing and the demand for extensive data in conventional approaches to SOH measurement. Using sophisticated machine learning algorithms, we suggest an all-encompassing methodology for predicting the SOH. Our approach includes meticulous data preparation, which includes analyzing crucial operating elements such as voltage, current, and temperature. We utilized Support Vector Regression (SVR) and Multilayer Perceptron (MLP) models, which were fine-tuned using hyperparameter optimization. The models were assessed using evaluation metrics such as Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and R-squared ($R^{2}$). In order to improve the accuracy of our predictions, we combined these models into a stacked ensemble using a Random Forest (RF) meta-model. This resulted in an $R^{2}$ value of 0.987, MAE of 0.02559, MSE of 0.0013, and RMSE of 0.00624. The results indicate that the ensemble outperforms individual models in predicting SOH. This research highlights the capacity of ensemble learning in predictive maintenance and battery management.
Published: 2025
Full Text: View/download PDF
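
The evaluation metrics named in the abstract above (MSE, RMSE, MAE, and $R^{2}$) follow standard definitions, sketched below in plain Python. The function name and the sample values are hypothetical and unrelated to the paper's battery data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, and R-squared for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),  # RMSE is the square root of MSE by definition
        "mae": mae,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Made-up values for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```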

26. Literature Review of Machine Learning and Threat Intelligence in Cloud Security

Author: Rrezearta Thaqi, Bujar Krasniqi, Artan Mazrekaj, and Blerim Rexha
Subjects: Cloud computing, security, threat intelligence, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Cloud computing has transformed IT services by making them more scalable and cost-effective. However, this shift has also introduced new security challenges that traditional methods are finding hard to tackle. This review paper looks at how combining machine learning (ML) with threat intelligence can improve cloud security, an approach that hasn’t been widely explored yet. By reviewing recent studies, we show that ML and threat intelligence do more than detect known threats. They can also adapt to new and evolving ones, making cloud systems more secure against cyberattacks. Our analysis highlights how this combined approach provides better protection and flexibility. We also identify some important gaps in the current research and suggest areas for future study to make these security systems even more effective. This review aims to provide useful insights for researchers, helping to build more proactive cloud security strategies.
Published: 2025
Full Text: View/download PDF

27. Emotion-Aware Ensemble Learning (EAEL): Revolutionizing Mental Health Diagnosis of Corporate Professionals via Intelligent Integration of Multi-Modal Data Sources and Ensemble Techniques

Author: Gaurav Yadav, Mohammad Ubaidullah Bokhari, Saleh I. Alzahrani, Shadab Alam, and Mohammed Shuaib
Subjects: Mental health diagnosis, machine learning, ensemble learning, deep learning, corporate well-being, predictive analytics, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In this contemporary landscape of corporate environments, the increasing prevalence of mental health challenges necessitates the development of innovative diagnostic methodologies. This research introduces the Emotion-Aware Ensemble Learning (EAEL) framework, a cutting-edge approach designed to revolutionize early mental health diagnosis among corporate professionals. EAEL integrates machine learning and deep learning paradigms to process multimodal data, including facial expression analysis and typing pattern recognition, offering a holistic evaluation of emotional well-being. Our investigation methodically trains base classifiers, such as Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and Random Forests (RF), on distinct and combined datasets derived from facial expressions and typing patterns. The EAEL framework demonstrates robust performance, achieving an accuracy of 0.95, precision of 0.96, recall of 0.94, and F1-Score of 0.95 when applied to the integrated dataset. These findings underscore EAEL’s transformative potential as a proactive tool for mental health interventions in corporate settings. Future iterations could enhance the framework by incorporating physiological signals, such as heart rate variability and EEG data, further improving diagnostic accuracy. EAEL’s ability to seamlessly integrate diverse data modalities not only sets a new standard for technology-driven mental health assessments but also promises substantial benefits for employee welfare and organizational effectiveness, with the potential for adaptation in clinical environments as well.
Published: 2025
Full Text: View/download PDF

28. Classification Based on the Support Vector Machine for Determining Operational Targets for Controlling Electricity Usage With Conventional Meters: A Case Study of Industrial and Business Tariff Customers From PT PLN (Persero) Indonesia

Author: Galih Arisona, Alief Pascal Taruna, Dwi Irwanto, Arif Bijak Bestari, and Wildan Juniawan
Subjects: Classification, conventional meters, electricity theft, machine learning, support vector machine, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Electricity theft remains a significant challenge for PT PLN (Persero), Indonesia’s primary electricity provider, serving over 89 million customers as of 2023. The study focuses on industrial and business tariff customers, using a dataset from 2019 to 2023, which includes monthly consumption data from PLN’s postpaid customers across thirty operational units with the highest Electricity Use Control (P2TL) levels, covering customers with a maximum power of 6,600 VA. This approach differs from previous studies that rely on open or smart meter data, as this study uses conventional meters for data collection. In the dataset used for this research, losses from confirmed electricity theft amounted to approximately IDR 19 billion. This research aims to improve the detection of electricity theft through a machine learning-based model utilizing the Support Vector Machine (SVM) classification technique. The goal is to enhance the P2TL mechanism by accurately identifying potential targets for field verification. Various SVM kernels were tested, including Radial Basis Function (RBF), Linear, Polynomial (Poly), and Sigmoid, alongside classifiers such as SVM, Logistic Regression, Decision Tree, and Naïve Bayes. Results show that the SVM model, particularly with the RBF kernel, achieves optimal performance, with balanced precision and recall, especially with 30 months of historical data. This optimized model contributes to improving PLN’s operational efficiency, offering more accurate identification of electricity theft cases, leading to substantial financial savings by reducing losses from unpaid consumption. The findings offer practical benefits for reducing electricity theft and improving PLN’s monitoring system, especially in industrial and business sectors.
Published: 2025
Full Text: View/download PDF

29. Monitoring Bone Healing: Integrating RF Sensing With AI

Author: Ahmad Aldelemy, Ebenezer Adjei, Prince O. Siaw, Ali Al-Dulaimi, Viktor Doychinov, Nazar T. Ali, Rami Qahwaji, John G. Buckley, Pete Twigg, and Raed A. Abd-Alhameed
Subjects: RF sensing, artificial intelligence, bone fracture monitoring, machine learning, non-invasive assessment, healing process, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This study presents the development of an advanced machine learning model based on a two-dimensional (2D) Radio Frequency (RF) sensing framework for refined monitoring of femoral bone fractures. Utilising MATLAB simulations, we created a comprehensive dataset enhanced with variations in bone diameter, muscle thickness, fat thickness, and hematoma size, augmented with multiple sensor configurations (two, four, six, and eight sensors). The model aims to provide a frequent, non-invasive assessment of the fracture healing process compared to conventional imaging methods. Our approach leverages data from six RF sensors, achieving a high overall accuracy of 99.2% in classifying different fracture stages, including “no fracture” and varying degrees of hematoma sizes. The findings indicate that increasing the number of sensors up to six significantly enhances detection accuracy and sensitivity across all fracture stages. However, the marginal improvement from six to eight sensors was not statistically significant, suggesting that a six-sensor configuration offers an optimal balance between performance and system complexity. The results demonstrate significant potential for this technology to revolutionise orthopaedic treatment and recovery management by offering continuous, real-time monitoring without radiation exposure. The proposed system enhances personalised patient care by integrating RF sensing with artificial intelligence, enabling timely interventions and more informed, data-driven treatment strategies. This research lays a robust foundation for future advancements, including three-dimensional modelling and clinical validations, toward the practical implementation of non-invasive fracture monitoring systems.
Published: 2025
Full Text: View/download PDF

30. FArSS: Fast and Efficient Semantic Question Similarity in Arabic

Author: Mohamed Alkaoud
Subjects: Arabic NLP, efficient machine learning, fastText, machine learning, natural language processing, neural networks, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This paper addresses the challenge of efficient semantic question similarity in Arabic by leveraging fastText embeddings and a simple neural network architecture. Our model (FArSS) avoids the complexities of recurrent connections and attention mechanisms, resulting in a streamlined and efficient approach. With strategic data augmentation, our model achieves an F1-score of 0.928, closely competing with state-of-the-art models that rely on advanced architectures employing self-attention mechanisms. Additionally, our model outperforms both GPT-4o and GPT-4 in semantic question similarity in Arabic, underscoring the potential of specialized, efficient models to surpass large language models in specific tasks. This work demonstrates that our method not only maintains high performance but also ensures fast training and inference times. The practical advantages of our approach make it especially suitable for real-time applications, contributing to the development of more effective and efficient natural language processing systems. Our findings highlight the continued importance of efficient tailored models in addressing specific natural language processing challenges.
Published: 2025
Full Text: View/download PDF

31. The Role of Big Data Analytics in Revolutionizing Diabetes Management and Healthcare Decision-Making

Author: Muhammad Nauman, Ahmad S. Almadhor, Mohammed Albekairi, Ali R. Ansari, Muhammad A. B. Fayyaz, and Raheel Nawaz
Subjects: Big data analytics, healthcare decision-making, diabetes management, data analytics, machine learning, apache spark, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The evolving healthcare domain necessitates an upgrade through digitization, integrating patient data, and advanced medical results. In the last couple of decades, advances in information and storage technologies in healthcare have produced vast amounts of data. The remarkable increases in data volumes, along with the enticing prospects and potential inherent in data analysis, have contributed to the concept of Big Data. There is a pressing need within the research community to analyze these large volumes of Big Data. To address this challenge, Big Data Analytics (BDA), the systematic process of examining large and complex datasets to uncover hidden patterns, correlations, and insights for informed decision-making, has emerged. It employs various methodologies and techniques to enable informed decision-making. This study delves into using Machine Learning (ML) in big data environments, explicitly utilizing the MLlib library in Apache Spark to derive meaningful insights from a diabetic healthcare dataset. The CDC’s Behavioral Risk Factor Surveillance System (BRFSS) was used to empirically demonstrate the advantages of integrating BDA with ML for medical decision-making in Big Data environments. The research findings highlighted the superior performance of Logistic Regression (LR) models compared to other models like Naive Bayes (NB), providing valuable insights for healthcare applications.
Published: 2025
Full Text: View/download PDF

32. Degrade or Super-Resolve to Recognize? Bridging the Domain Gap for Cross-Resolution Face Recognition

Author: Klemen Grm, Berk Kemal Ozata, Alperen Kantarci, Vitomir Struc, and Hazim Kemal Ekenel
Subjects: Biometrics, image processing, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In this work, we address the problem of cross-resolution face recognition, where a low-resolution probe face is compared against high-resolution gallery faces. To address this challenging problem, we investigate two approaches for bridging the quality gap between low-quality probe faces and high-quality gallery faces. The first approach focuses on degrading the quality of high-resolution gallery images to bring them closer to the quality of the probe images. The second approach involves enhancing the resolution of the probe images using face hallucination. Our experiments on the SCFace and DroneSURF datasets reveal that the success of face hallucination is highly dependent on the quality of the original images, since poor image quality can severely limit the effectiveness of the hallucination technique. Therefore, the selection of the appropriate face recognition method should consider the quality of the images. Additionally, our experiments also suggest that combining gallery degradation and face hallucination in a hybrid recognition scheme provides the best overall results for cross-resolution face recognition with relatively high-quality probe images, while the degradation process on its own is the more suitable option for low-quality probe images. Our results show that the combination of standard computer vision approaches such as degradation, super-resolution, feature fusion, and score fusion can be used to substantially improve performance on the task of low resolution face recognition using off-the-shelf face recognition models without re-training on the target domain.
Published: 2025
Full Text: View/download PDF

33. Machine Learning-Driven Optimization for Solution Space Reduction in the Quadratic Multiple Knapsack Problem

Author: Diego Yanez-Oyarce, Carlos Contreras-Bolton, Fredy Troncoso-Espinosa, and Carlos Rey
Subjects: Machine learning, combinatorial optimization, knapsack problem, quadratic multiple knapsack problem, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The quadratic multiple knapsack problem (QMKP) is a well-studied problem in operations research. This problem involves selecting a subset of items that maximizes the linear and quadratic profit without exceeding a set of capacities for each knapsack. While its solution using metaheuristics has been explored, exact approaches have recently been investigated. One way to improve the performance of these exact approaches is by reducing the solution space in different instances, considering the properties of the items in the context of QMKP. In this paper, machine learning (ML) models are employed to support an exact optimization solver by predicting the inclusion of items with a certain level of confidence and classifying them. This approach reduces the solution space for exact solvers, allowing them to tackle more manageable problems. The methodological process is detailed, in which ML models are generated and the best one is selected to be used as a preprocessing approach. Finally, we conduct comparison experiments, demonstrating that using an ML model is highly beneficial for reducing computing times and achieving rapid convergence.
Published: 2025
Full Text: View/download PDF

34. Machine Learning Approach to Predict the DC Bias for Adaptive OFDM Transmission in Indoor Li-Fi Applications

Author: Marwah T. Salman, David R. Siddle, and Amadi G. Udu
Subjects: Adaptive transmission, clipping distortion, DCO-OFDM scheme, DC bias optimization, indoor Li-Fi applications, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Multilevel quadrature amplitude modulation (M-QAM) combined with DC-bias in optical orthogonal frequency division multiplexing (DCO-OFDM) offers a spectrally efficient solution and adaptive transmission rates for indoor light-fidelity (Li-Fi) systems. However, a significant challenge posed by the DCO-OFDM scheme is the additional power of the DC bias required to ensure that the amplitudes of the transmitted signals are nonnegative. These biased signals are clipped according to optical power constraints, imposing clipping noise that affects the transmission bit error rate (BER). This performance degradation is conditioned by the adjustments made to the DC bias, which requires continuous modification to support adaptive transmission. Therefore, simultaneously addressing DC bias optimization and clipping mitigation is essential to provide reliable and power-efficient transmissions. This paper proposes a machine learning (ML) approach to predict the optimum DC bias based on the statistical properties of the OFDM signal and system characteristics. A robust ML regressor selection process using LazyPredict algorithm (LPA) was employed to identify the optimal regressors for developing the predictive model. The model demonstrated significant prediction accuracy for DC bias across a wide range of transmission settings. In particular, the models built on variants of gradient boosting regressor (GBR) and support vector regressor (SVR) demonstrated superior performance, with R-squared evaluation scores of 0.9792 and 0.9225, respectively, for two different sets of features. Furthermore, the BER performance of our adaptive DC bias approach was compared to a fixed DC bias in adaptive DCO-OFDM transmission, demonstrating the superiority of our approach in effectively mitigating clipping noise at high transmission rates while maintaining power efficiency at lower rates. These results provide a promising solution for the future practical deployment of Li-Fi systems in indoor applications.
Published: 2025
Full Text: View/download PDF

35. Enhancing Image-Based JPEG Compression: ML-Driven Quantization via DCT Feature Clustering

Author: Shahrzad Sabzavi and Reza Ghaderi
Subjects: Genetic algorithm, JPEG compression, machine learning, quantization table, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: JPEG compression is a widely used technique for reducing the file size of digital images, but it often compromises visual quality. The purpose of this research is to explore a novel approach that combines machine learning, discrete cosine transform (DCT) feature clustering, and genetic algorithms to customize image compression methods. The goal is to enhance visual quality while maintaining an appropriate bit-rate. In this study, an auto-encoder neural network is utilized to extract DCT features from images. These features are then clustered, and optimized quantization tables for each cluster center are generated using a genetic algorithm. The resulting tables are assigned to their respective clusters, enabling the preservation of visual quality during compression. Experimental evaluations were conducted on 1800 random images using this machine learning-based approach. The results demonstrate superior visual quality compared to traditional JPEG compression, while maintaining comparable bit-rates. The research shows significant improvements in peak signal-to-noise ratio (PSNR) by 2.34 dB and structural similarity index (SSIM) by 1.26%, indicating enhanced image quality. The findings of this research highlight the potential of combining machine learning, DCT feature clustering, and genetic algorithms to customize image compression techniques. The proposed approach enables effective image compression with improved visual quality preservation and maintained bit-rates. This research contributes to the advancement of image-based methods in achieving optimized image compression.
Published: 2025
Full Text: View/download PDF
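
The abstract above reports its gain as a PSNR improvement in dB but does not state the formula. The following is a minimal sketch assuming the conventional definition, PSNR = 10 log10(MAX^2 / MSE), on 8-bit pixel values; the function name `psnr` and the flat-list image representation are illustrative assumptions, not taken from the paper.

```python
import math

def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumes the conventional definition 10 * log10(MAX^2 / MSE);
    max_value=255 assumes 8-bit samples.
    """
    if len(original) != len(compressed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic, a fixed dB gain corresponds to a fixed ratio of mean squared errors rather than a fixed absolute difference.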

36. Temporal Forecasting of Distributed Temperature Sensing in a Thermal Hydraulic System With Machine Learning and Statistical Models

Author: Stella Pantopoulou, Matthew Weathered, Darius Lisowski, Lefteri H. Tsoukalas, and Alexander Heifetz
Subjects: Distributed temperature sensing, Rayleigh scattering fiber optic sensors, thermal hydraulic temperature sensing, machine learning, statistical methods, ARIMA, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: We benchmark the performance of a long short-term memory (LSTM) network machine learning model and an autoregressive integrated moving average (ARIMA) statistical model in temporal forecasting of distributed temperature sensing (DTS). The data in this study consist of a fluid temperature transient measured with two co-located Rayleigh scattering fiber optic sensors (FOS) in a forced convection mixing zone of a thermal tee. We treat each gauge of a FOS as an independent temperature sensor. We first study prediction of DTS time series using Vanilla LSTM and ARIMA models trained on prior history of the same FOS that is used for testing. The results yield maximum absolute percentage error (MaxAPE) and root mean squared percentage error (RMSPE) of 1.58% and 0.06% for ARIMA, and 3.14% and 0.44% for LSTM, respectively. Next, we investigate zero-shot forecasting (ZSF) with LSTM and ARIMA trained on history of the co-located FOS only, which is advantageous when limited training data is available. The ZSF MaxAPE and RMSPE values for ARIMA are comparable to those of the Vanilla use case, while the error values for LSTM increase. We show that in ZSF, the performance of the LSTM network can be improved by training on the most correlated gauges between the two FOS, which are identified by calculating the Pearson correlation coefficient. The improved ZSF MaxAPE and RMSPE for LSTM are 4.4% and 0.33%, respectively. Performance of ZSF LSTM can be further enhanced through transfer learning (TL), where LSTM is re-trained on a subset of the FOS that is the target of forecasting. We show that LSTM pre-trained on the correlated dataset and re-trained on 30% of the testing target dataset achieves MaxAPE and RMSPE values of 2.32% and 0.28%, respectively.
Published: 2025
Full Text: View/download PDF
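
The abstract above names MaxAPE, RMSPE, and the Pearson correlation coefficient without giving formulas. The sketch below assumes the usual textbook definitions of all three; the function names and the sample values in the usage note are illustrative and do not come from the paper.

```python
import math

def max_ape(actual, predicted):
    """Maximum absolute percentage error, in percent (assumes no zero actuals)."""
    return 100.0 * max(abs(a - p) / abs(a) for a, p in zip(actual, predicted))

def rmspe(actual, predicted):
    """Root mean squared percentage error, in percent (assumes no zero actuals)."""
    squared = [((a - p) / a) ** 2 for a, p in zip(actual, predicted)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)
```

For example, with hypothetical temperatures `actual = [100.0, 200.0]` and forecasts `predicted = [99.0, 204.0]`, `max_ape` is driven by the single worst point while `rmspe` averages over the series, which is why the two metrics can differ by an order of magnitude.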

37. Design a Robust DDoS Attack Detection and Mitigation Scheme in SDN-Edge-IoT by Leveraging Machine Learning

Author: Habtamu Molla Belachew, Mulatu Yirga Beyene, Abinet Bizuayehu Desta, Behaylu Tadele Alemu, Salahadin Seid Musa, and Alemu Jorgi Muhammed
Subjects: Distributed denial of service, edge computing, Internet of Things, machine learning, software defined networking, SDN-Edge-IoT, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The Internet of Things (IoT) has rapidly expanded, providing significant benefits across various fields. However, the complexity of IoT networks, with their resource-constrained devices, presents substantial security challenges, particularly Distributed Denial of Service (DDoS) attacks. Integrating Software Defined Networking (SDN) with IoT has emerged as a promising solution to enhance security. Despite this, DDoS attacks through IoT botnets remain a significant threat. Existing studies on DDoS detection in SDN-IoT networks often suffer from inefficient detection accuracy due to poor algorithm design and latency issues arising from deploying models in the control plane. This study aims to improve DDoS detection accuracy by training a robust Machine Learning (ML) model using effective hyper-parameter tuning and Cross-Validation (CV) techniques. To mitigate latency issues, we deploy the model at the edge of the SDN-IoT network, enforcing mitigation rules through the SDN controller. We evaluated four popular classifiers (K-Nearest Neighbor (K-NN), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and FeedForward Neural Network (FFNN)) on benchmark datasets CICIDS2017 and Edge-IIoTset, conducting both binary and multi-class classifications. Our implementation using the Mininet-WiFi emulation tool revealed that XGBoost outperformed others in binary DDoS detection, achieving accuracy, precision, recall, and F1-score all above 99.997%, with a testing time of 3.559 seconds on the Edge-IIoTset dataset. Compared to recent studies, the proposed approach demonstrates XGBoost’s clear superiority. Consequently, XGBoost was deployed at the edge of the SDN-IoT for live traffic classification, showing improved performance by classifying live traffic within 3.946 ms and using only 8.80% of memory with a 0.5-second window size.
Published: 2025
Full Text: View/download PDF

38. Signal Processing-Free Intelligent Model for Power Quality Disturbances Identification

Author: Mohammed F. Al-Mashdali, Asif Islam, Abdulbasit Hassan, Md Shafiullah, Mujahed Al-Dhaifallah, and Khalid Al Fuwail
Subjects: Convolutional neural networks (CNN), power quality (PQ), renewable energy, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Integrating different types of renewable energy sources in the power system substantially challenges the power quality (PQ), directly affecting the system’s stability and service life span. The rise of power quality disturbances (PQD) generates irregularities in voltage and current waveforms, harming smart grid networks and linked devices. Traditional methods for PQD classification use complicated feature extraction techniques, which can be computationally expensive and lack scalability. This research proposes applying basic convolutional neural network (CNN) models for automated PQD detection and categorization as a prospective solution to these issues. By directly examining PQD images generated from signal data, these models reduce the requirement for human-crafted features. The study analyzes alternative CNN setups, training datasets, and disturbance types to measure model performance. The results demonstrate that these simple CNN models maintain stable accuracy in normal and noisy environments. Even with increasing numbers of classes and noise levels, the models maintained a high performance level, reaching up to 99.39% accuracy for 17 classes when the Adam optimizer was used instead of RMSprop. The models could also deal with noise-related disturbances, still achieving accuracy as high as 96.42% when trained on just 50% of the dataset under 30 dB SNR (signal-to-noise ratio) conditions. Moreover, comparing the two frequencies on 50 Hz and 60 Hz performance does not show the equivalent models’ robustness over different operating levels. This study highlights the potential of CNNs in boosting power quality disturbance categorization and presents paths for further inquiry in model refining and optimization. The study focuses on CNN-based models applied in power quality disturbance detection and classification research.
Published: 2025
Full Text: View/download PDF

39. Study on Finger Gesture Interface Using One-Channel EMG

Author: Hee-Yeong Yang, Young-Shin Han, and Choon-Sung Nam
Subjects: Data preprocessing, EMG, finger gesture recognition, machine learning, one channel data user interface, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Electromyography (EMG) is used to recognize user finger gestures for applications in real-time interfaces. Finger movements are classified by preprocessing to extract the features from the collected EMG data, which are then used for machine learning. The data were extracted using the overlapped segmentation method to ensure sufficient training data. The preprocessing of EMG data uses standard formulae, such as integrated EMG (IEMG) and mean absolute value (MAV). Furthermore, preprocessing involves using original data, simple moving average (SMA), and Fast Fourier transform (FFT) for feature extraction. Subsequently, these preprocessed data sets are used to train machine learning models, facilitating a comparative analysis. Four machine learning models were used: eXtreme Gradient Boost, Random Forest, k-Nearest Neighbors, and Logistic Regression. The experimental results revealed the best accuracy from preprocessing using a simple moving average followed by a Fourier transform, but classification was not possible using all nine finger movements. However, accuracy exceeded 90% when the model was trained on a reduced set of specific finger gestures. Rest movements, index finger taps, and force-taps movements achieved the highest accuracy, approximately 95%.
Published: 2025
Full Text: View/download PDF
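
The preprocessing steps named in the abstract above (overlapped segmentation, IEMG, MAV, and a simple moving average) can be sketched as follows. This assumes the standard definitions IEMG = sum of |x_i| and MAV = mean of |x_i|; the window size, step, and function names are hypothetical, since the abstract does not specify them.

```python
def overlapped_windows(signal, size, step):
    """Split a 1-D signal into windows of `size` samples, advancing by `step`.

    A step smaller than size makes consecutive windows overlap, which
    yields more training segments from the same recording.
    """
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def iemg(window):
    """Integrated EMG: sum of absolute sample values in the window."""
    return sum(abs(x) for x in window)

def mav(window):
    """Mean absolute value: IEMG divided by the window length."""
    return iemg(window) / len(window)

def sma(signal, k):
    """Simple moving average over k consecutive samples."""
    return [sum(signal[i:i + k]) / k for i in range(len(signal) - k + 1)]

# Hypothetical one-channel samples; real EMG data would be much longer.
samples = [1, -2, 3, -4, 5, -6]
features = [(iemg(w), mav(w)) for w in overlapped_windows(samples, size=4, step=2)]
```

With `size=4` and `step=2`, each window shares half of its samples with the next one, so a six-sample recording yields two feature vectors instead of one.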

40. Residential Electrical Load Forecasting Based on a Real-Time Evidential Time Series Prediction Method

Author: M. Mroueh, M. Doumiati, C. Francis, and M. Machmoum
Subjects: Load forecasting, belief functions theory, machine learning, electrical microgrid, energy management, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Load forecasting is essential for efficient microgrid management, providing key advantages in operational efficiency, cost control, and grid reliability. As microgrids become increasingly critical in the global transition toward decentralized renewable energy systems, accurately predicting load demand is vital for optimizing performance and ensuring a stable, resilient, and sustainable power supply. This study introduces a novel short-term load forecasting approach based on Belief Functions Theory (BFT). The proposed method employs information fusion techniques to combine multiple predictors, each with its own forecasting mechanism. Using lagged power values and weather data, the predictors generate estimated power values along with corresponding uncertainty or error levels. A mass function is assigned to each predictor, taking into account both prediction and error data, even when some information is missing. These mass functions are then merged to produce a final, reliable prediction. Application of this method to publicly available load datasets demonstrates its effectiveness, achieving a 12% reduction in forecasting error compared to state-of-the-art methods and delivering substantial improvements in computational efficiency.
Published: 2025
Full Text: View/download PDF

41. Electricity Theft Detection Using Machine Learning in Traditional Meter Postpaid Residential Customers: A Case Study on State Electricity Company (PLN) Indonesia

Author: Alief Pascal Taruna, Galih Arisona, Dwi Irwanto, Arif Bijak Bestari, and Wildan Juniawan
Subjects: Electricity theft detection, machine learning, PLN, traditional meter, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Electricity theft is a major challenge for PT PLN (Persero), particularly in managing 27 million postpaid customers, most of whom still use traditional meters. Detecting and addressing electricity theft has become increasingly complex, requiring more efficient approaches. Unlike smart meters, traditional meters lack communication capabilities, making detection reliant on manual processes. This research develops a machine learning model to optimize the Target Operation (TO) process. TO is a list of customers targeted for on-site verification due to suspected electricity theft. This study focuses on optimizing the formation of TO by analyzing monthly electricity usage, particularly in the 450 VA household segment receiving government subsidies. The model aims to reduce reliance on subjective manual observations while ensuring proper subsidy allocation. Various classification models, including Decision Tree, Naive Bayes, Random Forest, K-Nearest Neighbors, Logistic Regression, and Deep Neural Network, were evaluated, with Random Forest achieving the best performance across simulations. A sequential evaluation method is introduced to enhance accuracy through layered filtering, where detection results from the three-theft model are further filtered using the two-theft and one-theft models, resulting in a more precise TO. The combination of Random Forest and K-Nearest Neighbors achieved the highest performance, with an accuracy of 0.89, precision of 0.83, recall of 0.98, F1-Score of 0.90, and AUC of 0.89. These findings demonstrate the model’s effectiveness in delivering reliable TO recommendations, supporting PLN’s operational strategies, and offering practical benefits through a more objective, standardized TO process that minimizes human error and improves efficiency.
Published: 2025
Full Text: View/download PDF
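
The F1-score reported in the abstract above can be checked against the reported precision and recall, using the standard definition of F1 as their harmonic mean. The sketch below is only an arithmetic consistency check on the published figures; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision 0.83 and recall 0.98 as reported in the abstract.
reported_f1 = round(f1_score(0.83, 0.98), 2)
print(reported_f1)
```

The unrounded value is about 0.899, which agrees with the reported F1-Score of 0.90 to two decimal places.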

42. Machine Learning in Money Laundering Detection Over Blockchain Technology

Author: Algimantas Venckauskas, Sarunas Grigaliunas, Linas Pocius, Rasa Bruzgiene, and Andrejs Romanovs
Subjects: Machine learning, blockchain, cybercrime, cryptocurrency, money laundering, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Layering through cryptocurrency transactions represents a sophisticated mechanism for laundering money within cybercrime circles. This process methodically merges illegal funds into the legitimate financial system. Blockchain technology plays a crucial role in this integration by facilitating the quick and automated dispersal of assets across various digital wallets and exchanges. Machine learning emerges as a powerful tool for analyzing and identifying illicit transactions within Blockchain networks; however, a significant challenge remains in the form of a gap in advanced pattern recognition algorithms. This paper introduces a novel machine learning-based approach called Value-driven-Transactional tracking Analytics for Crypto compliance (VTAC) for the detection of illegal crypto transactions via Blockchain. The approach combines machine learning algorithms with a pre-training process, normalization, model training, and a de-anonymization process to analyze and identify illicit transactions effectively. Experimental evaluations show VTAC’s capability to detect illegal transactions with 97.5% accuracy using the XGBoost model, outperforming existing methods, which reach accuracies of up to 95.9%. Key performance metrics, including precision, recall, and F1-score, consistently exceeded 95%, highlighting VTAC’s enhanced precision and reliability. The proposed solution will serve as an advisory framework to help financial crime investigators enhance the detection and reporting of suspicious cryptocurrency transactions in cyberspace.
Published: 2025
Full Text: View/download PDF

43. Fuzzy Enhanced Kidney Tumor Detection: Integrating Machine Learning Operations for a Fusion of Twin Transferable Network and Weighted Ensemble Machine Learning Classifier

Author: Ananya Ghosh and Jyotismita Chaki
Subjects: Deep neural network, ensemble learning, kidney tumor, machine learning, transfer learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Kidney tumors, often asymptomatic, can lead to serious health problems if left undiagnosed. This study tackles the crucial issue of kidney tumor detection using CT scans. The proposed approach leverages the power of image enhancement using fuzzy systems, deep learning, and machine learning for automated kidney tumor detection in CT images. The study proposes a fuzzy inference system to enhance kidney CT image contrast. This system analyzes image data and uses fuzzy logic to adjust pixel intensities, aiming to improve the distinction between features in the image without creating over-enhancement. Two pre-trained deep convolutional neural networks (PT-DCNNs), DenseNet121 and ResNet101, are used to extract features from the enhanced CT images. These features capture essential characteristics that differentiate between normal and tumor-containing scans. Combining features from twin PT-DCNNs (ensemble approach) creates a richer representation of the image content. The informative features are fed into a combined classifier where Support Vector Machines and Random Forests are combined using a weighted average to achieve the final and potentially more robust classification of kidney tumors. To improve training, we amplified the original dataset by creating variations with added noise and artificial modifications to simulate real-world image imperfections. The integration of Machine Learning Operations practices ensures the scalability, reproducibility, and clinical deployment of the system. The model achieved an impressive accuracy of 99.2% on high-quality images and 98.5% on noisy images, surpassing traditional methods. This automated approach can assist urologists in confirming the presence of kidney tumors, minimizing human error during physical inspection and potentially leading to improved patient outcomes.
Published: 2025
Full Text: View/download PDF
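
The abstract above describes combining Support Vector Machines and Random Forests through a weighted average. A minimal sketch of that idea follows, assuming each classifier outputs a tumor probability per scan; the weights, threshold, and function names are hypothetical, as the abstract does not state how the weights are chosen.

```python
def weighted_ensemble(p_svm, p_rf, w_svm=0.5, w_rf=0.5):
    """Weighted average of two classifiers' per-sample probabilities.

    Weights are normalized by their sum, so they need not add up to 1.
    """
    total = w_svm + w_rf
    return [(w_svm * a + w_rf * b) / total for a, b in zip(p_svm, p_rf)]

def classify(probabilities, threshold=0.5):
    """Label a sample 1 (tumor) when the combined probability reaches the threshold."""
    return [int(p >= threshold) for p in probabilities]

# Hypothetical probabilities for two scans from each classifier.
combined = weighted_ensemble([0.9, 0.2], [0.5, 0.4], w_svm=0.6, w_rf=0.4)
labels = classify(combined)
```

Giving the stronger classifier the larger weight lets it dominate when the two disagree, while the weaker one can still tip borderline cases.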

44. Hyperparameter Optimization in Generative Adversarial Networks (GANs) Using Gaussian AHP

Author: Thiago Serafim Rodrigues and Placido Rogerio Pinheiro
Subjects: Generative adversarial networks, hyperparameter optimization, gaussian analytical hierarchy process, multicriteria decision-making, machine learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This study explores optimizing hyperparameters in Generative Adversarial Networks (GANs) using the Gaussian Analytical Hierarchy Process (Gaussian AHP). By integrating machine learning techniques and multi-criteria decision methods, the aim is to enhance the performance and efficiency of GAN models. It trains GAN models using the Fashion MNIST dataset. It applies Gaussian AHP to optimize hyperparameters based on multiple performance criteria, such as the quality of generated images, training stability, and training time. Iterative experiments validate the methodology by automatically adjusting hyperparameters based on the obtained scores, thereby maximizing the model’s efficiency and quality. Results indicate significant improvements in image generation quality and computational efficiency. The study highlights the effectiveness of combining Gaussian AHP with GANs for systematic hyperparameter optimization, providing insights into achieving higher performance in image generation tasks. Future research could extend this approach to other neural network architectures and diverse datasets, further demonstrating the versatility of this optimization technique. This method’s potential applications extend across various domains, including data augmentation and anomaly detection, indicating its broad applicability and impact.
Published: 2025
Full Text: View/download PDF

45. Transmission Failure Prediction Using AI and Structural Modeling Informed by Distribution Outages

Author: Sita Nyame, William O. Taylor, William Hughes, Mingguo Hong, Marika Koukoula, Feifei Yang, Aaron Spaulding, Xiaochuan Luo, Slava Maslennikov, and Diego Cerrai
Subjects: Failure prediction, machine learning, transmission system, structural modeling, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Understanding and quantifying the impact of severe weather events on the electric transmission and distribution system is crucial for ensuring its resilience in the context of the increasing frequency and intensity of extreme weather events caused by climate change. While weather impact models for the distribution system have been widely developed during the past decade, transmission system impact models lagged behind because of the scarcity of data. This study demonstrates a weather impact model for predicting the probability of failure of transmission lines. It builds upon a recently developed model and focuses on reducing model bias, through multi-model integration, feature engineering, and the development of a storm index that leverages distribution system data to aid the prediction of transmission risk. We explored three methods for integrating machine learning with mechanistic models. They consist of: (a) creating a linear combination of the outputs of the two modeling approaches, (b) including fragility curves as additional inputs to machine learning models, and (c) developing a new machine learning model that uses the outputs of the weather-based machine learning model, fragility curve estimates, and wind data to make new predictions. Moreover, due to the limited number of historical failures in transmission networks, a storm index was developed leveraging a dataset of distribution outages to learn about storm behavior to improve model skills. In the current version of the model, we substantially reduced the overestimation in the sum of predicted values of transmission line probability of failure that was present in the previously published model by a factor of 10. This has led to a reduction of model bias from 3352% to 14.46–15.43%. The model with the integrated approach and storm index demonstrates substantial improvements in the estimation of the probability of failure of transmission lines and their ranking by risk level. The improved model is able to capture 60% of the failures within the top 22.5% of the ranked power lines, compared to a value of 34.9% for the previous model. With an estimate of the probability of failure of transmission lines ahead of storms, power system planning and maintenance engineers will have critical information to make informed decisions, to create better mitigation plans and minimize power disruptions. Long term, this model can assist with resilience investments as it highlights areas of the system more susceptible to damage.
Published: 2025
Full Text: View/download PDF
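
Two ideas from the abstract above lend themselves to a short sketch: integration method (a), a linear combination of the two model outputs, and the "share of failures captured within the top fraction of ranked lines" metric. The code below is an illustration under stated assumptions; the blending weight `alpha`, the function names, and the rounding-up of the cutoff are hypothetical and not taken from the paper.

```python
import math

def blend(p_ml, p_fragility, alpha=0.5):
    """Linear combination of ML and fragility-curve failure probabilities.

    alpha weights the ML output and (1 - alpha) the fragility-curve estimate.
    """
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_ml, p_fragility)]

def capture_rate(probabilities, failed, top_fraction):
    """Share of observed failures found in the highest-ranked fraction of lines.

    `failed` holds 1 for lines that actually failed and 0 otherwise.
    """
    # Rank line indices from highest to lowest predicted failure probability.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    cutoff = math.ceil(top_fraction * len(ranked))
    captured = sum(failed[i] for i in ranked[:cutoff])
    total = sum(failed)
    return captured / total if total else 0.0
```

A call such as `capture_rate(probs, failed, 0.225)` would correspond to the "top 22.5% of ranked lines" figure quoted in the abstract, given per-line probabilities and observed outcomes.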

46. Comprehensive Bibliographic Survey and Forward-Looking Recommendations for Software Defect Prediction: Datasets, Validation Methodologies, Prediction Approaches, and Tools

Author: Mohd Mustaqeem, Mahfooz Alam, Suhel Mustajab, Faisal Alshanketi, Shadab Alam, and Mohammed Shuaib
Subjects: Software defect prediction, classification, artificial intelligence, machine learning, statistical validation, bibliographic survey, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The development of reliable software depends heavily on the effective collaboration between teams responsible for development and testing. Despite ongoing efforts, many software programs still contain bugs that can lead to financial losses and business risks. Therefore, detecting and fixing software defects after release is crucial. While binary classification methods have been commonly used for this purpose, recent Artificial Intelligence (AI) advancements offer new opportunities for software teams to create more robust software. To address challenges in Software Defect Prediction (SDP), we conducted a thorough bibliographic survey of 79 research articles from the year 2011 to 2023 that examined previous models, datasets, data validation techniques, defect detection, prediction methods, and SDP tools. The survey revealed that previous research often lacked appropriate datasets with the necessary characteristics and data validation methods. Additionally, many standard datasets suffer from a lack of labels, which hinders effective defect detection. Systematic literature reviews on SDP are scarce, further emphasizing the importance of this study. Based on the findings, we provide crucial recommendations for designing effective SDP models and tools. The proposed survey outlines an architecture for constructing SDP datasets with the appropriate characteristics, as well as multi-label classification and data validation methodologies for software defects. This approach aims to enhance SDP research and contribute to the development of high-quality software products by improving defect prediction accuracy.
Published: 2025
Full Text: View/download PDF

47. Noise Suppression Method With Low-Complexity Noise Estimation Model and Heuristic Noise-Masking Algorithm for Real-Time Processing of Robot Vacuum Cleaners

Author: Seunghyeon Shin, Minhan Kim, Inkoo Jeon, Ju-Man Song, Yongjin Park, Jungkwan Son, and Seokjin Lee
Subjects: Source separation, low-complexity, low-SNR, machine learning, mask estimation, mono channel, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Noise suppression in a high-level noise environment using a low-complexity method is challenging. This study proposes a low-complexity noise suppression algorithm for robot vacuum cleaner processors. We collected working noise from a robot vacuum cleaner along with speech signals and developed a method to extract the desired speech signal while estimating the noise. Our approach estimates the noise in the existing signal and converts it into the desired signal. In addition, we designed a low-complexity neural network capable of operating on mobile processors. The evaluation results demonstrate that our method achieves performance comparable to that of computationally intensive methods. Notably, our method maintains superior performance when the intensity of the desired signal is low, exhibiting less degradation than existing methods, and, in contrast to other neural networks, it avoids generating incorrect signals. Furthermore, we simplified the neural network architecture, reducing its size by approximately 25% with minimal performance loss.
Published: 2025
Full Text: View/download PDF

48. PermGuard: A Scalable Framework for Android Malware Detection Using Permission-to-Exploitation Mapping

Author: Arvind Prasad, Shalini Chandra, Mueen Uddin, Taher Al-Shehari, Nasser A. Alsadhan, and Syed Sajid Ullah
Subjects: Android malware detection, machine learning, permissions exploitation, cybersecurity, mobile security, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Android, the world’s most widely used mobile operating system, is increasingly targeted by malware due to its open-source nature, high customizability, and integration with Google services. The increasing reliance on mobile devices significantly raises the risk of malware attacks, especially for non-technical users who often grant permissions without thorough evaluation, leading to potentially devastating effects. This paper introduces PermGuard, a scalable framework for Android malware detection that maps permissions into exploitation techniques and employs incremental learning to detect malicious apps. It presents a novel technique for constructing the PermGuard dataset by mapping Android permissions to exploitation techniques, providing a comprehensive understanding of how permissions can be misused by malware. The dataset consists of 55,911 benign and 55,911 malware apps, providing a balanced and comprehensive foundation for analysis. Additionally, a new strategy using similarity-based selective training reduces the amount of data required for the training of an incremental learning-based model, focusing on the most relevant data to improve efficiency. To ensure robustness and accuracy, the model adopts a test-then-train approach, initially testing on application data to identify weaknesses and refine the training process. The framework’s resilience is tested against adversarial attacks, demonstrating its ability to withstand attempts to bypass or deceive detection mechanisms and enhance overall security. Designed for scalability, PermGuard can handle large and continuously growing datasets, making it suitable for real-world applications. Empirical results indicate that the model achieved an accuracy of 0.9933 on real datasets and 0.9828 on synthetic datasets, demonstrating strong resilience against both real and adversarial attacks.
Published: 2025
Full Text: View/download PDF

49. Devanagari Character Recognition: A Comprehensive Literature Review

Author: Sandhya Arora, Latesh Malik, Sonakshi Goyal, Debotosh Bhattacharjee, Mita Nasipuri, and Ondrej Krejcar
Subjects: OCR, handwritten Devanagari character recognition, machine learning, deep learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The Devanagari script originated from the ancient Brahmi script and is a widely used Indic script for writing different languages, like Sanskrit, Hindi, Marathi, Nepali, and Konkani. Recognizing handwritten Devanagari characters poses significant challenges due to their complexity and handwriting variability. This literature review examines the evolution of handwritten Devanagari character recognition (HDCR), exploring early template matching and feature extraction methods that struggled with the script’s intricacy. Advances introduced structural and statistical techniques, improving accuracy by analyzing geometric properties and patterns. The advent of machine learning, particularly deep learning, revolutionized HDCR with convolutional neural networks (CNNs) and recurrent neural networks (RNNs), significantly enhancing performance. Hybrid approaches that combine multiple techniques have shown promising results, balancing accuracy and computational complexity. Challenges remain, including handwriting variability, noise, and the need for real-time performance. The lack of large, diverse datasets for training and evaluation is a significant hurdle. This review highlights efforts to create annotated datasets and benchmarks, providing a comprehensive overview of HDCR methodologies, strengths, limitations, and future research directions. These insights aim to advance HDCR, contributing to more accurate and efficient recognition systems and enhancing digital text processing for linguistic, educational, and archival purposes.
Published: 2025
Full Text: View/download PDF

50. A Hybrid Machine Learning Model for Efficient XML Parsing

Author: Muhammad Ali, Minhaj Ahmad Khan, and Raihan Ur Rasool
Subjects: Artificial intelligence, artificial neural network, eXtensible markup language, framework, machine learning, model, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The Extensible Markup Language (XML) files are extensively used for representing structured data on the web for file configuration, exchanging data between distinct applications, web development, and many other applications. Consequently, effective parsing techniques are necessary for XML files to enhance the performance of applications. The existing parsing techniques have their strengths and weaknesses affecting the performance of applications. Researchers point out that the selection of an efficient and appropriate parser is the most challenging issue regarding a particular condition. This paper proposes a framework XML Parsing Optimization using Hybrid Machine Learning (XPOHML) that makes use of Artificial Neural Network (ANN) and Support Vector Machine (SVM) machine learning techniques for efficient XML parsing. The newly developed framework performs analysis and prediction of different XML parsers using profiling, classification, performance evaluation, and finally generates code for efficient parsing. The XML profiling phase of the XPOHML framework generates a dataset by evaluating the performance of PXTG, SAX, StAX, DOM, and JDOM parsing models on separate cores by applying numerous file sizes. The Classification phase produces the classification model by applying ANN and SVM techniques to identify the appropriate parsing model. The performance evaluation phase of XPOHML assesses the performance of both parsing models through classification metrics (accuracy). Additionally, based on evaluation outcomes, the code generation phase produces an efficient parsing model of XML files. The newly designed and developed XPOHML framework has shown a meaningful improvement in the performance of parsing XML files.
Published: 2025
Full Text: View/download PDF
