6,700 results
Search Results
2. A review on over-sampling techniques in classification of multi-class imbalanced datasets: insights for medical problems.
- Author
-
Yang, Yuxuan, Khorshidi, Hadi Akbarzadeh, and Aickelin, Uwe
- Subjects
DATABASE management ,PREDICTION models ,MEDICAL informatics ,STATISTICAL sampling ,ARTIFICIAL intelligence ,RESEARCH bias ,MACHINE learning ,ALGORITHMS - Abstract
There has been growing attention to multi-class classification problems, particularly the challenges posed by imbalanced class distributions. To address these challenges, various strategies, including data-level re-sampling treatments and ensemble methods, have been introduced to bolster the performance of predictive models and Artificial Intelligence (AI) algorithms in scenarios where an excessive level of imbalance is present. While most research and algorithm development have focused on binary classification problems, in health informatics there is increased interest in addressing the problem of multi-class classification in imbalanced datasets. Multi-class imbalance problems bring forth more complex challenges, as a delicate approach is required to generate synthetic data while simultaneously maintaining the relationships between the multiple classes. The aim of this review paper is to examine over-sampling methods tailored for medical and other datasets with multi-class imbalance. Out of 2,076 peer-reviewed papers identified through searches, 197 eligible papers were thoroughly reviewed for inclusion, narrowing the pool to 37 studies selected for in-depth analysis. These studies are grouped into four categories: metric, adaptive, structure-based, and hybrid approaches. The most significant finding is the emerging trend toward hybrid resampling methods that combine the strengths of various techniques to effectively address the problem of imbalanced data. This paper provides an extensive analysis of each selected study, discusses their findings, and outlines directions for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
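The simplest data-level treatment surveyed above, random over-sampling of minority classes, can be sketched in a few lines (a minimal numpy illustration, not any reviewed paper's code; the toy dataset is invented):

```python
import numpy as np

def random_oversample(X, y, rng=None):
    """Balance a multi-class dataset by duplicating minority-class rows
    (with replacement) until every class matches the majority count."""
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, cnt in zip(classes, counts):
        deficit = target - cnt
        if deficit > 0:
            idx = np.flatnonzero(y == cls)
            extra = rng.choice(idx, size=deficit, replace=True)
            X_parts.append(X[extra])
            y_parts.append(y[extra])
    return np.vstack(X_parts), np.concatenate(y_parts)

X = np.arange(14, dtype=float).reshape(7, 2)
y = np.array([0, 0, 0, 0, 1, 1, 2])          # imbalanced: 4 / 2 / 1
X_bal, y_bal = random_oversample(X, y, rng=0)
print(np.bincount(y_bal))                     # every class now has 4 samples
```

More elaborate over-samplers (SMOTE-style interpolation, the adaptive and structure-based families in the review) differ only in how the extra rows are synthesized.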
3. Deep Learning Algorithms for Traffic Forecasting: A Comprehensive Review and Comparison with Classical Ones.
- Author
-
Afandizadeh, Shahriar, Abdolahi, Saeid, Mirzahossein, Hamid, and Li, Ruimin
- Subjects
MACHINE learning ,TRAFFIC estimation ,TRANSPORTATION management system ,DEEP learning ,INTELLIGENT transportation systems ,ALGORITHMS ,FORECASTING ,TRAFFIC safety - Abstract
Accurate and timely forecasting of critical components is pivotal in intelligent transportation systems and traffic management, crucially mitigating congestion and enhancing safety. This paper aims to comprehensively review deep learning algorithms and classical models employed in traffic forecasting. Spanning diverse traffic datasets, the study encompasses various scenarios, offering a nuanced understanding of traffic forecasting methods. Reviewing 111 seminal research works since the 1980s, encompassing both deep learning and classical models, the paper begins by detailing the data sources utilized in transportation systems. Subsequently, it delves into the theoretical underpinnings of the deep learning algorithms and classical models prevalent in traffic forecasting. Furthermore, it investigates the application of these algorithms and models in forecasting key traffic characteristics, informed by their utility in transport and traffic analyses. Finally, the study elucidates the merits and drawbacks of proposed models through applied research in traffic forecasting. Findings indicate that while deep learning algorithms and classical models serve as valuable tools, their suitability varies across contexts, necessitating careful consideration in future studies. The study underscores research opportunities in road traffic forecasting, providing a comprehensive guide for future endeavors in this domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Reinforcement Machine Learning for Sparse Array Antenna Optimization with PPO.
- Author
-
Mohammad-Ali-Nezhad, Sajad and Kassem, Mohammad H.
- Subjects
ANTENNA arrays ,ANTENNAS (Electronics) ,TELECOMMUNICATION systems ,MACHINE learning ,ALGORITHMS - Abstract
This paper focuses on optimizing the radiation pattern of sparse array antennas using reinforcement learning. The paper leverages Proximal Policy Optimization’s (PPO’s) advantages in optimization and its effectiveness in handling stochastic transitions and rewards to reduce the number of elements while maintaining desired signal performance and minimizing unnecessary side lobe signals. By removing a subset of the antennas using reinforcement learning with PPO optimization, the same results as with a complete array have been obtained. The anticipated outcomes of this research hold the promise of significantly enhancing the effectiveness and utility of sparse array antennas in communication systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
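For context, the objective such a PPO agent optimizes can be evaluated directly: the radiation pattern of a thinned, half-wavelength-spaced linear array. The sketch below (plain numpy, with an arbitrary 16-element layout, not the paper's setup) computes the normalized array factor that a reward function could score for side-lobe level:

```python
import numpy as np

def array_factor_db(on, d=0.5, n_angles=1801):
    """Normalized array factor (dB) of a thinned uniform linear array.
    `on` is a 0/1 mask saying which half-wavelength-spaced elements are active."""
    on = np.asarray(on, dtype=float)
    theta = np.linspace(-np.pi / 2, np.pi / 2, n_angles)
    k = 2 * np.pi                              # wavenumber for wavelength = 1
    positions = d * np.arange(on.size)
    phase = k * np.outer(np.sin(theta), positions)
    af = np.abs(np.exp(1j * phase) @ on)
    af /= af.max()                             # normalize main beam to 0 dB
    return theta, 20 * np.log10(np.maximum(af, 1e-12))

# Full 16-element array vs. a hypothetical thinned layout with 12 active elements
full = np.ones(16, dtype=int)
thinned = full.copy()
thinned[[3, 7, 11, 14]] = 0
_, af_full = array_factor_db(full)
_, af_thin = array_factor_db(thinned)
print(af_full.max(), af_thin.max())   # both normalized to 0 dB at the main beam
```

An RL agent's action would be the on/off mask, and its reward a function of the resulting side-lobe peaks and element count.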
5. Applying Machine Learning in Marketing: An Analysis Using the NMF and k-Means Algorithms.
- Author
-
Gallego, Victor, Lingan, Jessica, Freixes, Alfons, Juan, Angel A., and Osorio, Celia
- Subjects
K-means clustering ,MACHINE learning ,ARTIFICIAL intelligence ,ADVERTISING effectiveness ,DATABASES - Abstract
The integration of machine learning (ML) techniques into marketing strategies has become increasingly relevant in modern business. Utilizing scientific manuscripts indexed in the Scopus database, this article explores how this integration is being carried out. Initially, a focused search is undertaken for academic articles containing both the terms "machine learning" and "marketing" in their titles, which yields a pool of papers. These papers have been processed using the Supabase platform. The process has included steps like text refinement and feature extraction. In addition, our study uses two key ML methodologies: topic modeling through NMF and a comparative analysis utilizing the k-means clustering algorithm. Through this analysis, three distinct clusters emerged, thus clarifying how ML techniques are influencing marketing strategies, from enhancing customer segmentation practices to optimizing the effectiveness of advertising campaigns. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
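The clustering half of the pipeline described above can be illustrated with a minimal Lloyd's k-means loop (numpy only; the two-dimensional "paper" vectors are invented stand-ins for the real NMF topic features):

```python
import numpy as np

def kmeans(X, k, iters=50, rng=0):
    """Minimal Lloyd's k-means: assign points to nearest centroid, recompute."""
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Toy "paper" feature vectors: two well-separated topic groups
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]])
labels, _ = kmeans(X, k=2)
print(labels)   # first three papers share one cluster, last three the other
```

In the study's actual pipeline the inputs would be NMF topic loadings per paper rather than raw coordinates.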
6. Research on User Default Prediction Algorithm Based on Adjusted Homogenous and Heterogeneous Ensemble Learning.
- Author
-
Lu, Yao, Wang, Kui, Sun, Hui, Qu, Hanwen, Chen, Jiajia, Liu, Wei, and Chang, Chenjie
- Subjects
DEFAULT (Finance) ,FORECASTING ,FEATURE selection ,ALGORITHMS ,CREDIT risk ,ECONOMETRIC models ,MACHINE learning ,GREEN technology - Abstract
In the field of risk assessment, traditional econometric models are generally used to assess credit risk, and with the introduction of the "dual-carbon" goals to promote the development of a low-carbon economy, the scale of green credit in China has rapidly expanded. In the big data era, however, a traditional single machine learning model offers poor interpretability, struggles to capture nonlinear relationships, and falls short in prediction accuracy and robustness. This paper selects an adjusted ensemble learning model based on homogeneous and heterogeneous factors for user default prediction, which can efficiently process large quantities of high-dimensional data. This article adjusts each model to adapt to the task and systematically compares various models. In this paper, the missing-value filling method, feature selection, and ensemble model are studied and discussed, and the optimal ensemble model is obtained. When comparing the predictions of single models and ensemble models, the accuracy, sensitivity, specificity, F1-Score, Kappa, and MCC of Categorical Features Gradient Boosting (CatBoost) and Random Undersampling Boosting (RUSBoost) all reach 100%. The experimental results show that the algorithm based on adjusted homogeneous and heterogeneous ensemble learning can predict user default efficiently and accurately. This paper also provides some references for establishing a risk assessment index system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
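A heterogeneous ensemble of the kind compared above reduces, in its simplest hard-voting form, to the sketch below (the three base learners' predictions are invented; the paper's actual models, such as CatBoost and RUSBoost, are trained classifiers):

```python
import numpy as np

def majority_vote(predictions):
    """Heterogeneous ensemble by hard voting: each row of `predictions`
    holds one base model's class labels for all samples."""
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    votes = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return votes.argmax(axis=0)

# Three hypothetical base learners predicting default (1) / no default (0)
# for five loan applicants
preds = [[1, 0, 1, 0, 1],
         [1, 0, 0, 0, 1],
         [0, 0, 1, 1, 1]]
print(majority_vote(preds))   # → [1 0 1 0 1]
```

Homogeneous ensembles (boosting, bagging) instead combine many instances of one learner type, usually with weighted rather than plain votes.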
7. Privacy-Preserving Federated Deep Learning Diagnostic Method for Multi-Stage Diseases.
- Author
-
Yang, Jinbo, Huang, Hai, Yin, Lailai, Qu, Jiaxing, and Xie, Wanjuan
- Subjects
ARTIFICIAL neural networks ,MACHINE learning ,INTEGRATED circuits ,DATA privacy ,ALGORITHMS ,NATURAL languages ,DEEP learning - Abstract
Diagnosing multi-stage diseases typically requires doctors to consider multiple data sources, including clinical symptoms, physical signs, biochemical test results, imaging findings, pathological examination data, and even genetic data. When applying machine learning modeling to predict and diagnose multi-stage diseases, several challenges need to be addressed. Firstly, the model needs to handle multimodal data, as the data used by doctors for diagnosis includes image data, natural language data, and structured data. Secondly, privacy of patients' data needs to be protected, as these data contain the most sensitive and private information. Lastly, considering the practicality of the model, the computational requirements should not be too high. To address these challenges, this paper proposes a privacy-preserving federated deep learning diagnostic method for multi-stage diseases. This method improves the forward and backward propagation processes of deep neural network modeling algorithms and introduces a homomorphic encryption step to design a federated modeling algorithm without the need for an arbiter. It also utilizes dedicated integrated circuits to implement the hardware Paillier algorithm, providing accelerated support for homomorphic encryption in modeling. Finally, this paper designs and conducts experiments to evaluate the proposed solution. The experimental results show that in privacy-preserving federated deep learning diagnostic modeling, the method in this paper achieves the same modeling performance as ordinary modeling without privacy protection, and has higher modeling speed compared to similar algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
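The additive homomorphism such federated schemes rely on can be demonstrated with a toy Paillier cryptosystem (deliberately small, insecure primes for illustration only; the paper's hardware implementation obviously differs): multiplying two ciphertexts decrypts to the sum of the plaintexts.

```python
import math
import random

def paillier_keygen(p=1_000_003, q=1_000_033):
    """Toy Paillier keypair from two small (insecure) primes, with g = n + 1."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)          # valid because we fix g = n + 1
    return n, (n, lam, mu)

def encrypt(n, m, rng=random.Random(0)):
    n2 = n * n
    r = rng.randrange(1, n)       # gcd(r, n) == 1 with overwhelming probability
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(priv, c):
    n, lam, mu = priv
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n   # L(c^lam mod n^2) * mu mod n

n, priv = paillier_keygen()
c1, c2 = encrypt(n, 1234), encrypt(n, 5678)
# Additive homomorphism: multiplying ciphertexts adds the plaintexts
print(decrypt(priv, c1 * c2 % (n * n)))   # → 6912
```

This is what lets parties aggregate encrypted gradient contributions without an arbiter ever seeing individual values.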
8. RF-KELM indoor positioning algorithm based on WiFi RSS fingerprint.
- Author
-
Hou, Bingnan and Wang, Yanchun
- Subjects
HUMAN fingerprints ,MACHINE learning ,ALGORITHMS ,FINGERPRINT databases ,SIGNAL processing ,ELECTRONIC data processing - Abstract
WiFi-based fingerprint indoor positioning technology has attracted wide attention, but it faces the challenge of poor robustness to signal changes, and the positioning service requires fast and accurate position estimation. Therefore, a random forest–kernel extreme learning machine (RF-KELM) positioning algorithm with good overall performance is proposed in this paper. The algorithm comprises an offline phase and an online phase. In the offline phase, the original WiFi fingerprint data are first transformed into a form more suitable for positioning. Then, access point (AP) selection is performed on the fingerprint database, which contains many useless APs, using an RF that can evaluate the importance of features. Finally, the KELM is trained with the sub-database that has undergone data transformation and AP selection. In the online phase, the received signal is first processed, and the trained KELM is then used to predict the position from the processed signal. In this paper, the performance of the proposed RF-KELM positioning algorithm is thoroughly tested on a publicly available dataset, and the experimental results demonstrate that the proposed algorithm not only achieves high positioning accuracy and robustness but also takes only 0.08 s for online positioning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
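The KELM half of the method admits a compact closed-form sketch: with kernel matrix K over the training fingerprints, the output weights are beta = (K + I/C)^-1 T (one common kernel-ELM formulation; the four-fingerprint RSS dataset below is invented, not from the paper's benchmark):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix between the row vectors of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def kelm_train(X, T, C=1e6, gamma=1.0):
    """Kernel ELM: closed-form output weights beta = (K + I/C)^-1 T."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, T)

def kelm_predict(X_train, beta, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ beta

# Toy RSS fingerprints (dBm readings from 3 APs) and one-hot location labels
X = np.array([[-40., -70., -80.], [-42., -68., -79.],
              [-80., -45., -60.], [-78., -43., -62.]])
T = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])   # two reference locations
beta = kelm_train(X, T, gamma=0.01)
pred = kelm_predict(X, beta, X, gamma=0.01)
print(pred.argmax(axis=1))   # recovers the training locations: [0 0 1 1]
```

The RF stage of the paper would run before this, pruning uninformative AP columns from X by feature importance.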
9. Machine Learning Models to Predict Readmission Risk of Patients with Schizophrenia in a Spanish Region.
- Author
-
Góngora Alonso, Susel, Herrera Montano, Isabel, Ayala, Juan Luis Martín, Rodrigues, Joel J. P. C., Franco-Martín, Manuel, and de la Torre Díez, Isabel
- Subjects
MACHINE learning ,MENTAL health services ,PATIENT readmissions ,PEOPLE with schizophrenia ,PUBLIC hospitals - Abstract
Currently, high hospital readmission rates have become a problem for mental health services, because they are directly associated with the quality of patient care. The development of predictive models with machine learning algorithms allows the assessment of readmission risk in hospitals. The main objective of this paper is to predict the readmission risk of patients with schizophrenia in a region of Spain using machine learning algorithms. In this study, we used a dataset of 6089 electronic admission records corresponding to 3065 patients with schizophrenia disorders. Data were collected in the period 2005–2015 from acute units of 11 public hospitals in a region of Spain. The Random Forest classifier obtained the best results in predicting readmission risk, with accuracy = 0.817, recall = 0.887, F1-score = 0.877, and AUC = 0.879. This paper identifies the algorithm with the highest accuracy and determines the factors associated with readmission risk of patients with schizophrenia in this population. It also shows that the development of predictive models with a machine learning approach can help improve the quality of patient care and develop preventive treatments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Ensemble Learning Improves the Efficiency of Microseismic Signal Classification in Landslide Seismic Monitoring.
- Author
-
Xin, Bingyu, Huang, Zhiyong, Huang, Shijie, and Feng, Liang
- Subjects
SIGNAL classification ,DATABASES ,RANDOM forest algorithms ,DECISION trees ,ALGORITHMS ,LANDSLIDES - Abstract
A deep-seated landslide can release numerous microseismic signals from creep-slip movement, which includes rock-soil slip from the slope surface and rock-soil shear rupture in the subsurface. Machine learning can effectively enhance the classification of microseismic signals in landslide seismic monitoring and help interpret the mechanical processes of landslide motion. In this paper, eight sets of triaxial seismic sensors were deployed inside the deep-seated landslide, Jiuxianping, China, and a large number of microseismic signals related to the slope movement were obtained through one year of continuous monitoring. All the data were passed through a seismic event identification step based on the ratio of the short-time average to the long-time average (STA/LTA). We selected 11 days of data, manually classified 4131 events into eight categories, and created a microseismic event database. Classical machine learning algorithms and ensemble learning algorithms were tested in this paper. To evaluate the seismic event classification performance of each algorithmic model, we assessed the proposed algorithms along the dimensions of accuracy, precision, and recall. The validation results demonstrated that the best-performing classical machine learning algorithm, the decision tree, had an accuracy of 88.75%, while the ensemble algorithms, including random forest, Gradient Boosting Trees, Extreme Gradient Boosting, and Light Gradient Boosting Machine, had accuracies ranging from 93.5% to 94.2% and also achieved better results in the combined evaluation of precision, recall, and F1 score. The classification tests for each individual microseismic event category showed the same pattern. The results suggest that ensemble learning algorithms perform better than classical machine learning algorithms on this task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
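The STA/LTA event-identification step mentioned above is a classic detector and fits in a few lines (numpy sketch with a synthetic burst, not the Jiuxianping data):

```python
import numpy as np

def sta_lta(x, n_sta=20, n_lta=200):
    """Classic short-time-average / long-time-average detector on |x|."""
    energy = np.abs(x)
    sta = np.convolve(energy, np.ones(n_sta) / n_sta, mode="same")
    lta = np.convolve(energy, np.ones(n_lta) / n_lta, mode="same")
    return sta / np.maximum(lta, 1e-12)   # guard against division by zero

rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(2000)
signal = noise.copy()
signal[1000:1100] += np.sin(np.linspace(0, 40 * np.pi, 100))  # synthetic burst
ratio = sta_lta(signal)
print(int(np.argmax(ratio)))   # the peak ratio falls near the injected event
```

Samples where the ratio exceeds a chosen threshold are flagged as candidate events and passed on to the classifiers.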
11. A Method for Reducing Training Time of ML-Based Cascade Scheme for Large-Volume Data Analysis.
- Author
-
Izonin, Ivan, Muzyka, Roman, Tkachenko, Roman, Dronyuk, Ivanna, Yemets, Kyrylo, and Mitoulis, Stergios-Aristoteles
- Subjects
PRINCIPAL components analysis ,FEATURE extraction ,DATA analysis ,TRAINING needs ,ALGORITHMS - Abstract
We live in the era of large data analysis, where processing vast datasets has become essential for uncovering valuable insights across various domains of our lives. Machine learning (ML) algorithms offer powerful tools for processing and analyzing this abundance of information. However, the considerable time and computational resources needed for training ML models pose significant challenges, especially within cascade schemes, due to the iterative nature of training algorithms, the complexity of feature extraction and transformation processes, and the large sizes of the datasets involved. This paper proposes a modification to the existing ML-based cascade scheme for analyzing large biomedical datasets by incorporating principal component analysis (PCA) at each level of the cascade. We selected the number of principal components to replace the initial inputs so as to ensure 95% variance retention. Furthermore, we enhanced the training and application algorithms and demonstrated the effectiveness of the modified cascade scheme through comparative analysis, which showcased a significant reduction in training time while improving the generalization properties of the method and the accuracy of the large data analysis. The enhanced generalization properties of the scheme stem from the reduction in nonsignificant independent attributes in the dataset, which further improves its performance in intelligent large data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
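Choosing the number of principal components to retain 95% of the variance, as described above, reduces to a cumulative sum over squared singular values (numpy sketch on synthetic data with two informative directions; not the paper's biomedical datasets):

```python
import numpy as np

def pca_95(X, var_keep=0.95):
    """Project X onto the fewest principal components retaining `var_keep`
    of the total variance, via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / (s**2).sum()
    k = int(np.searchsorted(np.cumsum(explained), var_keep) + 1)
    return Xc @ Vt[:k].T, k

rng = np.random.default_rng(0)
# 200 samples, 2 informative directions buried in 10 noisy attributes
latent = rng.standard_normal((200, 2)) * [10.0, 5.0]
X = latent @ rng.standard_normal((2, 10)) + 0.01 * rng.standard_normal((200, 10))
Z, k = pca_95(X)
print(k)   # far fewer than 10 inputs are needed to keep 95% of the variance
```

In the cascade scheme, this replacement of inputs happens at every level, which is what shrinks training time.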
12. A High-Performance Anti-Noise Algorithm for Arrhythmia Recognition.
- Author
-
Feng, Jianchao, Si, Yujuan, Zhang, Yu, Sun, Meiqi, and Yang, Wenke
- Subjects
BLIND source separation ,INDEPENDENT component analysis ,ARRHYTHMIA ,SIGNAL separation ,PRINCIPAL components analysis ,ALGORITHMS - Abstract
In recent years, the incidence of cardiac arrhythmias has been on the rise because of changes in lifestyle and the aging population. Electrocardiograms (ECGs) are widely used for the automated diagnosis of cardiac arrhythmias. However, existing models possess poor noise robustness and complex structures, limiting their effectiveness. To solve these problems, this paper proposes an arrhythmia recognition system with excellent anti-noise performance: a convolutionally optimized broad learning system (COBLS). In the proposed COBLS method, blind source separation is performed on the signal using a signal analysis method based on higher-order-statistics independent component analysis (ICA). The constructed feature matrix is further feature-extracted and dimensionally reduced using principal component analysis (PCA), which reveals the essence of the signal. The linear feature correlation between the data can be effectively reduced, and redundant attributes can be eliminated to obtain a low-dimensional feature matrix that retains the essential features for the classification model. Arrhythmia recognition is then realized by combining this matrix with the broad learning system (BLS). The model was evaluated using the MIT-BIH arrhythmia database and the MIT-BIH noise stress test database. The experiments demonstrate exceptional performance, reaching 99.11% overall accuracy, 96.95% overall precision, 89.71% overall sensitivity, and 93.01% overall F1-score across all four classification experiments. The proposed model also maintains excellent performance at signal-to-noise ratios of 24 dB, 18 dB, and 12 dB. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. An effective video inpainting technique using morphological Haar wavelet transform with krill herd based criminisi algorithm.
- Author
-
Srinivasan, M. Nuthal, Chinnadurai, M., Senthilkumar, S., and Dinesh, E.
- Subjects
WAVELET transforms ,MACHINE learning ,INPAINTING ,ANIMAL herds ,ALGORITHMS ,SIGNAL-to-noise ratio - Abstract
In recent times, video inpainting techniques have aimed to fill missing areas or gaps in a video by utilizing known pixels. Variations in brightness and differences between patches cause state-of-the-art video inpainting techniques to exhibit high computational complexity and create seams in the target areas. To resolve these issues, this paper introduces a novel video inpainting technique that employs the Morphological Haar Wavelet Transform combined with the Krill Herd based Criminisi algorithm (MHWT-KHCA) to address the challenges of high computational demand and visible seam artifacts in current inpainting practices. The proposed MHWT-KHCA algorithm strategically reduces computation times and enhances the seamlessness of the inpainting process in videos. Through a series of experiments, the technique is validated against standard metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), where it demonstrates superior performance compared to existing methods. Additionally, the paper outlines potential real-world applications ranging from video restoration to real-time surveillance enhancement, highlighting the technique's versatility and effectiveness. Future research directions include optimizing the algorithm for diverse video formats and integrating machine learning models to advance its capabilities further. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
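For orientation, the plain (non-morphological) one-level 2-D Haar transform underlying the method's wavelet stage looks as follows; the morphological variant in the paper replaces the averaging/differencing with min/max-style operators, which this sketch does not attempt:

```python
import numpy as np

def haar2d(img):
    """One level of the plain 2-D Haar transform: returns the approximation
    band and the horizontal, vertical, and diagonal detail bands."""
    a = img[0::2, :] + img[1::2, :]       # row-pair sums
    d = img[0::2, :] - img[1::2, :]       # row-pair differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 4.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 4.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 4.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 4.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    h, w = LL.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = LL + LH + HL + HH
    out[0::2, 1::2] = LL - LH + HL - HH
    out[1::2, 0::2] = LL + LH - HL - HH
    out[1::2, 1::2] = LL - LH - HL + HH
    return out

img = np.arange(64, dtype=float).reshape(8, 8)
bands = haar2d(img)
print(np.allclose(ihaar2d(*bands), img))   # → True (perfect reconstruction)
```

Inpainting pipelines typically fill the coarse LL band first, then propagate detail, which is where the Criminisi-style priority search enters.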
14. Probabilistic Confusion Matrix: A Novel Method for Machine Learning Algorithm Generalized Performance Analysis.
- Author
-
Markoulidakis, Ioannis and Markoulidakis, Georgios
- Subjects
MACHINE learning ,MATRICES (Mathematics) ,MACHINE performance ,ALGORITHMS ,CLASSIFICATION - Abstract
The paper addresses the issue of classification machine learning algorithm performance based on a novel probabilistic confusion matrix concept. The paper develops a theoretical framework which associates the proposed confusion matrix and the resulting performance metrics with the regular confusion matrix. The theoretical results are verified based on a wide variety of real-world classification problems and state-of-the-art machine learning algorithms. Based on the properties of the probabilistic confusion matrix, the paper then highlights the benefits of using the proposed concept both during the training phase and the application phase of a classification machine learning algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
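The paper's exact definition is not reproduced in the abstract; one plausible reading of a probabilistic confusion matrix, accumulating predicted probability mass per true class instead of hard argmax counts, can be sketched as:

```python
import numpy as np

def probabilistic_confusion(y_true, proba):
    """Accumulate predicted class probabilities per true class, rather than
    hard argmax counts as in the regular confusion matrix."""
    n_classes = proba.shape[1]
    M = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, proba):
        M[t] += p          # row = true class, columns = probability mass
    return M

y_true = np.array([0, 0, 1, 1])
proba = np.array([[0.9, 0.1],
                  [0.6, 0.4],
                  [0.2, 0.8],
                  [0.5, 0.5]])
M = probabilistic_confusion(y_true, proba)
print(M)              # off-diagonal mass reveals borderline predictions
print(M.sum(axis=1))  # rows still sum to the per-class sample counts
```

Unlike the hard-count matrix, this version distinguishes a confident correct prediction (0.9) from a marginal one (0.5), which is the kind of extra signal a probabilistic formulation can exploit.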
15. Unmanned Ground Vehicle Path Planning Based on Improved DRL Algorithm.
- Author
-
Liu, Lisang, Chen, Jionghui, Zhang, Youyuan, Chen, Jiayu, Liang, Jingrun, and He, Dongwei
- Subjects
DEEP reinforcement learning ,MACHINE learning ,AUTONOMOUS vehicles ,REMOTELY piloted vehicles ,ALGORITHMS ,SUCCESSIVE approximation analog-to-digital converters ,REINFORCEMENT learning - Abstract
Path planning and obstacle avoidance are fundamental problems in unmanned ground vehicle navigation. Aiming at the limitations of Deep Reinforcement Learning (DRL) algorithms in unmanned ground vehicle path planning, such as low sampling efficiency, insufficient exploration, and unstable training, this paper proposes an improved algorithm called Dual Priority Experience and Ornstein–Uhlenbeck Soft Actor-Critic (DPEOU-SAC), based on Ornstein–Uhlenbeck (OU) noise and double-factor prioritized experience replay (DPE) with the introduction of expert experience, which helps the agent achieve faster and better path planning and obstacle avoidance. Firstly, OU noise enhances the quality of the agent's action selection through temporal correlation, thereby improving the agent's exploration performance in complex unknown environments. Meanwhile, the experience replay is based on double-factor preferential sampling, which offers better sample continuity and sample utilization. Then, the introduced expert experience helps the agent find the optimal path with a faster training speed and avoid falling into a local optimum, thus achieving stable training. Finally, the proposed DPEOU-SAC algorithm is tested against other deep reinforcement learning algorithms in four different simulation environments. The experimental results show that the convergence speed of DPEOU-SAC is 88.99% higher than that of the traditional SAC algorithm, and the shortest path length of DPEOU-SAC is 27.24, which is shorter than that of SAC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
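The OU noise component named above is a standard mean-reverting stochastic process; a minimal sketch with typical DDPG/SAC-style parameters (the values are conventional defaults, not the paper's):

```python
import numpy as np

def ou_noise(steps, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, x0=1.0, rng=0):
    """Ornstein–Uhlenbeck process: temporally correlated, mean-reverting noise
    often added to continuous actions for exploration."""
    rng = np.random.default_rng(rng)
    x = np.empty(steps)
    x[0] = x0
    for t in range(1, steps):
        dx = theta * (mu - x[t - 1]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        x[t] = x[t - 1] + dx
    return x

path = ou_noise(1000)                               # correlated exploration noise
print(round(float(ou_noise(1000, sigma=0.0)[-1]), 3))  # → 0.223 (pure decay toward mu)
```

The temporal correlation (each step depends on the last) is what makes consecutive perturbed actions coherent, unlike independent Gaussian noise.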
16. A Machine Learning Model to Predict Citation Counts of Scientific Papers in Otology Field.
- Author
-
Alohali, Yousef A., Fayed, Mahmoud S., Mesallam, Tamer, Abdelsamad, Yassin, Almuhawas, Fida, and Hagr, Abdulrahman
- Subjects
DECISION trees ,SERIAL publications ,NATURAL language processing ,BIBLIOMETRICS ,MACHINE learning ,REGRESSION analysis ,RANDOM forest algorithms ,CITATION analysis ,DESCRIPTIVE statistics ,PREDICTION models ,ARTIFICIAL neural networks ,MEDICAL research ,MEDICAL specialties & specialists ,ALGORITHMS - Abstract
One of the most widely used measures of scientific impact is the number of citations. However, due to their heavy-tailed distribution, citation counts are fundamentally difficult to predict, although predictions can be improved. This study investigated the factors influencing the citation count of a scientific paper in the otology field. To that end, this work proposes a new solution that utilizes machine learning and natural language processing to process English text and output a predicted citation count. Different algorithms are implemented in this solution, such as linear regression, boosted decision trees, decision forests, and neural networks. The application of neural network regression revealed that papers' abstracts have the greatest influence on the citation counts of otological articles. This solution was developed in visual programming, using Microsoft Azure Machine Learning at the back end and Programming Without Coding Technology at the front end. We recommend using machine learning models to improve the abstracts of research articles to attract more citations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. A term extraction algorithm based on machine learning and comprehensive feature strategy.
- Author
-
Gong, Xiuliang, Cheng, Bo, Hu, Xiaomei, and Bo, Wen
- Subjects
MACHINE learning ,NATURAL language processing ,ALGORITHMS ,RANDOM fields ,ONTOLOGIES (Information retrieval) ,DATABASES ,MACHINE translating - Abstract
Manual term extraction resembles its literal meaning: a translator browses text, classifies words, and prepares for translation. Terminology, as a concentrated carrier of expertise, dynamically reflects, through its creation, popularization, and disappearance, the development and evolution of an industry. The automatic extraction of terminology is a key technology for creating a professional terminology database, and it is also a key topic in the field of natural language processing. The purpose of this paper is to study a term extraction algorithm based on machine learning and a comprehensive feature strategy. Addressing the poor generality and single statistical features of current term extraction algorithms, this paper proposes an improved domain-ontology term extraction algorithm based on a comprehensive feature strategy. Moreover, automatic term extraction experiments based on a word-based maximum entropy model and a word-based conditional random field model are conducted in this paper; the conditional random field model outperforms the maximum entropy model. The experimental results show that the algorithm based on the comprehensive feature strategy improves accuracy by 8.6% compared with the TF-IDF algorithm and the C-value term extraction algorithm. This algorithm can effectively extract the terms in a text and has good generality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
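The TF-IDF baseline that the proposed method is compared against is easy to state exactly (pure-stdlib sketch on an invented three-document toy corpus):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document TF-IDF: term frequency times log(N / document frequency).
    This is the single-statistic baseline; the paper's comprehensive feature
    strategy layers additional signals on top of scores like these."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{term: cnt / len(doc) * math.log(n / df[term])
             for term, cnt in Counter(doc).items()} for doc in docs]

docs = [["machine", "learning", "term", "extraction"],
        ["machine", "translation", "model"],
        ["term", "extraction", "algorithm", "term"]]
scores = tf_idf(docs)
# A term unique to one document gets the highest idf weight in that document
print(max(scores[2], key=scores[2].get))   # → algorithm
```

Terms appearing in every document (here "machine" appears in two of three) score low, which is exactly the weakness a richer feature strategy aims to compensate for.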
18. What do algorithms explain? The issue of the goals and capabilities of Explainable Artificial Intelligence (XAI).
- Author
-
Renftle, Moritz, Trittenbach, Holger, Poznic, Michael, and Heil, Reinhard
- Subjects
ARTIFICIAL intelligence ,MACHINE learning ,ALGORITHMS - Abstract
The increasing ubiquity of machine learning (ML) motivates research on algorithms to "explain" models and their predictions—so-called Explainable Artificial Intelligence (XAI). Despite many publications and discussions, the goals and capabilities of such algorithms are far from being well understood. We argue that this is because of a problematic reasoning scheme in the literature: Such algorithms are said to complement machine learning models with desired capabilities, such as interpretability or explainability. These capabilities are in turn assumed to contribute to a goal, such as trust in a system. But most capabilities lack precise definitions and their relationship to such goals is far from obvious. The result is a reasoning scheme that obfuscates research results and leaves an important question unanswered: What can one expect from XAI algorithms? In this paper, we clarify the modest capabilities of these algorithms from a concrete perspective: that of their users. We show that current algorithms can only answer user questions that can be traced back to the question: "How can one represent an ML model as a simple function that uses interpreted attributes?". Answering this core question can be trivial, difficult or even impossible, depending on the application. The result of the paper is the identification of two key challenges for XAI research: the approximation and the translation of ML models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
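The paper's core question, representing an ML model as a simple function of interpreted attributes, has a degenerate but instructive case: fitting a global linear surrogate by least squares. In the sketch below the black box happens to be linear, so the surrogate recovers it exactly (an illustration of the idea, not an XAI method from the paper):

```python
import numpy as np

def linear_surrogate(black_box, X):
    """Fit a simple, interpretable stand-in for an opaque model: least-squares
    regression of the model's outputs on the input attributes."""
    y = black_box(X)
    A = np.hstack([X, np.ones((len(X), 1))])      # append an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

# An "opaque" model that happens to be linear, so the surrogate is exact
black_box = lambda X: 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
weights, bias = linear_surrogate(black_box, X)
print(weights, float(bias))   # recovers the coefficients 3, -2 and intercept 0.5
```

For a genuinely nonlinear model the same fit only approximates, which is precisely the "approximation challenge" the paper identifies: the surrogate's simplicity trades against its faithfulness.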
19. Algorithmic Exploitation in Social Media Human Trafficking and Strategies for Regulation.
- Author
-
Moore, Derek M.
- Subjects
SOCIAL media ,TRAFFIC regulations ,HUMAN trafficking ,THEMATIC analysis ,MACHINE learning ,RESEARCH personnel ,EXPLOITATION of humans - Abstract
Human trafficking thrives in the shadows, and the rise of social media has provided traffickers with a powerful and unregulated tool. This paper delves into how these criminals exploit online platforms to target and manipulate vulnerable populations. A thematic analysis of existing research explores the tactics used by traffickers on social media, revealing how algorithms can be manipulated to facilitate exploitation. Furthermore, the paper examines the limitations of current regulations in tackling this online threat. The research underscores the urgent need for collaboration between governments and researchers to combat algorithmic exploitation. By harnessing data analysis and machine learning, proactive strategies can be developed to disrupt trafficking networks and protect those most at risk. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Intelligent Stroke Disease Prediction Model Using Deep Learning Approaches.
- Author
-
Gao, Chunhua, Wang, Hui, and Mezzapesa, Domenico Maria
- Subjects
STROKE diagnosis ,RISK assessment ,RANDOM forest algorithms ,PREDICTION models ,DATABASE management ,RESEARCH funding ,SYMPTOMS ,SUPPORT vector machines ,DEEP learning ,ARTIFICIAL neural networks ,STROKE ,COMPARATIVE studies ,MACHINE learning ,DECISION trees ,REGRESSION analysis ,ALGORITHMS ,DISEASE risk factors - Abstract
Stroke is a disease with high morbidity and mortality that poses a serious threat to people's health. Early recognition of the various warning signs of stroke is necessary so that timely clinical intervention can help reduce its severity. Deep neural networks have powerful feature representation capabilities and can automatically learn discriminative features from large amounts of data. This paper uses a range of physiological characteristic parameters together with deep neural networks, namely a Wasserstein generative adversarial network with gradient penalty (WGAN-GP) and a regression network, to construct a stroke prediction model. Firstly, to address the imbalance between positive and negative samples in the public stroke data set, we performed positive-sample data augmentation, using WGAN-GP to generate stroke data with high fidelity for training the prediction network. Then, the relationship between observable physiological characteristic parameters and the predicted risk of suffering a stroke was modeled as a nonlinear mapping, and a stroke prediction model based on a deep regression network was designed. Finally, the proposed method is compared with commonly used machine learning-based classification algorithms such as decision trees, random forests, support vector machines, and artificial neural networks. The prediction results of the proposed method are optimal in the comprehensive measurement index F. Further ablation experiments also show that the designed prediction model is robust and can effectively predict stroke. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Bio-Inspired Intelligent Swarm Confrontation Algorithm for a Complex Urban Scenario.
- Author
-
Cai, He, Luo, Yaoguo, Gao, Huanli, and Wang, Guangbin
- Subjects
BIOLOGICALLY inspired computing ,MACHINE learning ,WILDLIFE films ,REINFORCEMENT learning ,ALGORITHMS - Abstract
This paper considers the confrontation problem for two tank swarms of equal size and capability in a complex urban scenario. Based on the Unity platform (2022.3.20f1c1), a confrontation scenario featuring multiple crossing roads is constructed. Through the analysis of a substantial amount of biological data and wildlife footage on animal behavioral strategies during confrontations for hunting or food competition, two strategies are utilized to design a novel bio-inspired intelligent swarm confrontation algorithm. The first is the "fire concentration" strategy, which assigns a target to each tank such that an isolated opponent is preferentially attacked with concentrated firepower. The second is the "back and forth maneuver" strategy, which makes a tank tactically retreat after firing to avoid being hit while its shell is reloading. Two state-of-the-art swarm confrontation algorithms, a reinforcement learning algorithm and an assign-nearest algorithm, are chosen as opponents for the bio-inspired swarm confrontation algorithm proposed in this paper. Data from comprehensive confrontation tests show that the bio-inspired swarm confrontation algorithm has significant advantages over its opponents in both win rate and efficiency. Moreover, we discuss how vital algorithm parameters influence the performance indices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
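The "fire concentration" strategy described above can be sketched as a simple target-assignment rule. This is a hypothetical illustration, not the paper's implementation: the 2D positions, the isolation radius, and the scoring function are all assumptions; the idea is only that every tank targets the opponent with the fewest nearby teammates.

```python
import math

def isolation(opponent, opponents, radius=10.0):
    """Isolation score: the fewer teammates an opponent has within
    `radius`, the higher the score (compared by value, not identity)."""
    x, y = opponent
    nearby = sum(1 for ox, oy in opponents
                 if (ox, oy) != (x, y) and math.hypot(ox - x, oy - y) <= radius)
    return -nearby

def assign_targets(team, opponents, radius=10.0):
    """Fire concentration: every tank on `team` targets the single most
    isolated opponent, concentrating firepower on it."""
    target = max(opponents, key=lambda o: isolation(o, opponents, radius))
    return {tank: target for tank in team}
```

In a real engine the scoring would be recomputed each tick as opponents are destroyed, so firepower re-concentrates on the next most isolated target.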
22. Design and Optimization of Power Shift Tractor Starting Control Strategy Based on PSO-ELM Algorithm.
- Author
-
Qian, Yu, Wang, Lin, and Lu, Zhixiong
- Subjects
CLUTCHES (Machinery) ,FARM tractors ,PARTICLE swarm optimization ,MACHINE learning ,FUZZY algorithms ,ALGORITHMS ,TRACTORS - Abstract
Power shift tractors have been widely used in agriculture in recent years because of their uninterrupted power during shifting, high transmission efficiency and high stability. As one of the indispensable driving states of a power shift tractor, the starting process requires a small impact while meeting the driver's desired starting speed. Aiming at these contradictory requirements, this paper formulates a starting control strategy for a power shift tractor around starting quality and the driver's intention. Firstly, the identification characteristics of the driver under three starting intentions are obtained by a real-vehicle test. An Extreme Learning Machine (ELM), with its fast identification speed and short training time, is used to establish the basic driver-intention identification model. To address the instability of the ELM's identification results, the particle swarm optimization (PSO) algorithm is used to optimize the ELM. The optimized extreme learning machine model achieves an accuracy of 96.891% for driver-intention identification. The wet clutch is an important part of the power shift gearbox. In this paper, a knowledge base of starting control strategies for the starting clutch is established through a combination of bench tests and simulation tests. Through a fuzzy algorithm, the driver's intention is combined with the starting control strategy. Different driver intentions affect the comprehensive evaluation model of the clutch (whose individual evaluation indices are maximum sliding friction power, sliding friction power, speed stabilization time, and impact degree), and thus the final choice of the starting clutch control strategy. On this basis, this paper establishes an MPC starting controller for the power shift gearbox. Compared with a linear control strategy, the PSO-ELM-fuzzy weight starting strategy proposed in this paper reduces the maximum sliding friction power by 45%, the sliding friction power by 69.45%, and the speed stabilization time by 0.11 s. This verifies that the proposed starting control strategy, which considers the driver's intention, improves the starting quality of the power shift tractor. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
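The particle swarm optimization step used to tune the ELM can be illustrated with a minimal, generic PSO. This is a sketch of the textbook algorithm, not the paper's PSO-ELM pipeline: the inertia and acceleration constants and the search range are assumed values, and a toy objective stands in for the ELM's identification error.

```python
import random

def pso(f, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization: minimize f over [-5, 5]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best position
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + cognitive pull (pbest) + social pull (gbest)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In a PSO-ELM setting, each particle position would encode the ELM's input weights and biases, and `f` would be a validation error.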
23. VIS-SLAM: A Real-Time Dynamic SLAM Algorithm Based on the Fusion of Visual, Inertial, and Semantic Information.
- Author
-
Wang, Yinglong, Liu, Xiaoxiong, Zhao, Minkun, and Xu, Xinlong
- Subjects
MOBILE robots ,MACHINE learning ,MOBILE learning ,DEEP learning ,ALGORITHMS ,INFORMATION measurement ,PROBABILITY theory ,GEOMETRY - Abstract
To ensure accurate autonomous localization of mobile robots in environments with dynamic objects, and to address the limited real-time performance of deep learning algorithms and the poor robustness of purely visual-geometry algorithms, this paper presents a deep learning-based Visual Inertial SLAM technique. Firstly, a non-blocking model is designed to extract semantic information from images. Then, a motion probability hierarchy model is proposed to obtain prior motion probabilities of feature points. For image frames without semantic information, a motion probability propagation model determines the prior motion probabilities of feature points. Furthermore, since inertial measurements are unaffected by dynamic objects, this paper integrates inertial measurement information to improve the estimation accuracy of feature point motion probabilities. An adaptive threshold-based motion probability estimation method is proposed, and finally, positioning accuracy is enhanced by eliminating feature points with excessively high motion probabilities. Experimental results demonstrate that the proposed algorithm achieves accurate localization in dynamic environments while maintaining real-time performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. An Algorithm for Distracted Driving Recognition Based on Pose Features and an Improved KNN.
- Author
-
Gong, Yingjie and Shen, Xizhong
- Subjects
DISTRACTED driving ,MACHINE learning ,K-nearest neighbor classification ,ALGORITHMS ,DEEP learning ,TRAFFIC safety ,MOTOR vehicle driving - Abstract
To reduce safety accidents caused by distracted driving, and to address the low recognition accuracy and deployment difficulties of current algorithms for distracted-behavior detection, this paper proposes an algorithm that uses an improved KNN to classify driver posture features and predict distracted driving behavior. Firstly, the number of channels in the Lightweight OpenPose network is pruned to predict and output the coordinates of key points in the driver's upper body. Secondly, based on ergonomic principles, driving behavior features are modeled, and a set of five-dimensional feature values is obtained through geometric calculations. Finally, considering the relationship between inter-sample distance and sample count, this paper proposes an adjustable distance-weighted KNN algorithm (ADW-KNN) for classification and prediction. The experimental results show that the proposed algorithm achieves a recognition rate of 94.04% for distracted driving behavior on the public dataset SFD3, at speeds of up to 50 FPS, outperforming mainstream deep learning algorithms in both accuracy and speed. The superiority of ADW-KNN was further verified through experiments on other public datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
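The abstract does not give ADW-KNN's exact weighting rule, but the distance-weighted KNN idea it builds on can be sketched as follows; the inverse-distance weight 1/(d + eps) is an assumption standing in for the paper's adjustable weight.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Distance-weighted KNN: the k nearest neighbors vote for their label
    with weight 1/(distance + eps), so closer samples count more.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    votes = defaultdict(float)
    for d, y in nearest:
        votes[y] += 1.0 / (d + eps)
    return max(votes, key=votes.get)
```

An "adjustable" variant would expose the weighting function (e.g. an exponent on the distance) as a tunable parameter.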
25. Explainable Rules and Heuristics in AI Algorithm Recommendation Approaches--A Systematic Literature Review and Mapping Study.
- Author
-
García-Peñalvo, Francisco José, Vázquez-Ingelmo, Andrea, and García-Holgado, Alicia
- Subjects
ARTIFICIAL intelligence ,LITERATURE reviews ,SOFTWARE engineering ,ALGORITHMS ,HEURISTIC ,SOFTWARE engineers - Abstract
The exponential use of artificial intelligence (AI) to solve and automate complex tasks has catapulted its popularity, generating challenges that need to be addressed. While AI is a powerful means to discover interesting patterns and obtain predictive models, using these algorithms comes with great responsibility, as an incomplete or unbalanced training set or an improper interpretation of a model's outcomes could lead to misleading and ultimately dangerous conclusions. For these reasons, it is important to rely on expert knowledge when applying these methods. However, not every user can count on such specific expertise; non-expert users could also benefit from applying these powerful algorithms to their domain problems, but they need basic guidelines to get the most out of AI models. The goal of this work is to present a systematic review of the literature analyzing studies whose outcomes are explainable rules and heuristics for selecting suitable AI algorithms given a set of input features. The systematic review follows the methodology proposed by Kitchenham and other authors in the field of software engineering. As a result, 9 papers that tackle AI algorithm recommendation through tangible and traceable rules and heuristics were collected. The small number of retrieved papers suggests a lack of explicit reporting of rules and heuristics when testing the suitability and performance of AI algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review.
- Author
-
Moassefi, Mana, Rouzrokh, Pouria, Conte, Gian Marco, Vahdati, Sanaz, Fu, Tianyuan, Tahmasebi, Aylin, Younis, Mira, Farahani, Keyvan, Gentili, Amilcare, Kline, Timothy, Kitamura, Felipe C., Huo, Yuankai, Kuanar, Shiba, Younis, Khaled, Erickson, Bradley J., and Faghani, Shahriar
- Subjects
DEEP learning ,RESEARCH evaluation ,SYSTEMATIC reviews ,ARTIFICIAL intelligence ,DIAGNOSTIC imaging ,DESCRIPTIVE statistics ,ALGORITHMS ,WORLD Wide Web - Abstract
Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, a lack of detail about methods and algorithm code undercuts the scientific value of such work. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results and contributing to the rise in retractions of scientific papers. For the same reasons, conducting research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published DL papers to evaluate whether the descriptions of their methodology would allow their findings to be reproduced. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword "Deep Learning" and collected the articles published between January 2020 and January 2022. We screened all the articles and included those that reported the development of a DL tool in medical imaging. We extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics of each included article. We found 148 articles, of which 80 were included after screening for articles that reported developing a DL model for medical image analysis. Five studies made their code publicly available, and 35 utilized publicly available datasets. We provide figures showing the ratio and absolute count of reported items from the included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements needed to make their studies reproducible. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. Regional 3D geological modeling along metro lines based on stacking ensemble model.
- Author
-
Xia Bian, Zhuyi Fan, Jiaxing Liu, Xiaozhao Li, and Peng Zhao
- Subjects
GEOLOGICAL modeling ,BOREHOLES ,MACHINE learning ,STACKING machines ,ALGORITHMS - Abstract
This paper presents a regional 3D geological modeling method based on the stacking ensemble technique to overcome the challenge of sparse borehole data in large-scale linear underground projects. The proposed method transforms the 3D geological modeling problem into a stratigraphic property classification problem within a subsurface space grid cell framework. Borehole data are pre-processed and trained using the stacking method with five different machine learning algorithms. The trained model then classifies each regional cell, forming a regional 3D grid geological model. A case study of a 324 km² area along the Xuzhou metro lines demonstrates the effectiveness of the proposed model, with an overall prediction accuracy of 85.4%. However, the accuracy for key stratigraphic layers influencing construction risk, such as karst cave strata, is only 4.3% due to the limited borehole data. To address this issue, an oversampling technique based on the synthetic minority oversampling technique (SMOTE) algorithm is proposed. This technique effectively increases the number of sparse stratigraphic samples and significantly improves the prediction accuracy for karst caves to 65.4%. Additionally, this study analyzes the impact of sampling distance on model accuracy. A smaller sampling interval yields higher prediction accuracy but also increases computational resources and time costs; an optimal sampling distance of 1 m is therefore chosen to balance prediction accuracy and computation cost. Furthermore, the number of geological strata is found to have a negative effect on prediction accuracy. To mitigate this, it is recommended to merge less significant stratigraphic layers, reducing computation time. For key strata, such as karst caves, which have a significant impact on construction risk, further on-site sampling or oversampling with the SMOTE technique is recommended. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
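The SMOTE oversampling step used for the sparse karst-cave stratum can be illustrated with a minimal version of the algorithm: each synthetic sample interpolates between a minority-class point and one of its k nearest minority neighbors. The parameter values here are illustrative only.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: create n_new synthetic points, each lying on
    the segment between a random minority point and one of its k nearest
    minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(p, x),
        )[:k]
        nb = rng.choice(neighbors)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + lam * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the minority class's local geometry rather than being drawn at random.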
28. Discriminative shapelet learning via temporal clustering and matrix factorization.
- Author
-
Chen, Bo, Fang, Min, and Wang, GuiZhi
- Subjects
MACHINE learning ,MATRIX decomposition ,TIME series analysis ,CLASSIFICATION ,ALGORITHMS - Abstract
Identifying discriminative patterns, known as shapelets, within time series is a critical step in many time series classification tasks. A major limitation of existing shapelet learning is its unsupervised nature: shapelet learning is treated as an unsupervised subsequence clustering process followed by discovery based on a pre-defined metric, performed sequentially. This sequential procedure is problematic, as it fails to establish a direct connection between shapelets and samples and lacks the capacity to explicitly incorporate label information. In this paper, we propose a novel shapelet learning algorithm called Discriminative Shapelet Learning via Temporal Clustering and Matrix Factorization (DSLMF). DSLMF introduces a joint framework that combines matrix factorization and coherent temporal clustering to discover salient and coherent feature subsets. To further enhance discriminability and prevent arbitrary shapelet shapes, DSLMF integrates a label-specific shapelet regularization as a guiding mechanism, enabling the learning of shapelets optimized for higher classification performance. The proposed algorithm is shown to be effective at capturing temporal cluster structure while retaining the interpretability of shapelet-based methods. The experimental results presented in this paper highlight DSLMF's effectiveness in capturing temporal cluster structures and learning meaningful shapelets, ultimately leading to promising performance on benchmark datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
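A standard building block that shapelet methods like the one above presuppose is the distance between a shapelet and a full series: the minimum distance over all sliding windows. A minimal sketch follows (the Euclidean window distance and the absence of normalization are assumptions, not details from the paper):

```python
import math

def shapelet_distance(series, shapelet):
    """Distance between a time series and a shapelet: the minimum Euclidean
    distance over all length-len(shapelet) sliding windows of the series."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )
```

Computing this distance for each learned shapelet turns a variable-length series into a fixed-length feature vector usable by any classifier.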
29. CyMac: Diving Deep into the Application of Machine Learning Algorithms in Cyber Security.
- Author
-
Das, Bishwajit, Yadav, Nikita, Chauhan, Deepa, and Gupta, Sanju
- Subjects
INTERNET security ,ALGORITHMS ,MACHINE learning ,PHISHING prevention ,JURISDICTION - Abstract
Machine learning has emerged as a key technology in contemporary and prospective cyber threat intelligence systems, with numerous organizations seamlessly integrating it into their operations. However, machine learning in cyber defence is still in its early stages, leaving noticeable unexplored research territory and gaps in practical implementation. This paper is an initial endeavour to offer a comprehensive understanding of machine learning across the entire spectrum of cybersecurity domains, catering to potential end users with enthusiasm for this field of study. This paper aims to serve as a source of inspiration for significant advancements in ML within the cyber defence zone, laying the groundwork for broader adoption of ML mitigations to safeguard present and future systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. Non-intrusive residential load identification based on load feature matrix and CBAM-BiLSTM algorithm.
- Author
-
Shunfu Lin, Bing Zhao, Yinfeng Zhan, Junsu Yu, Xiaoyan Bian, Dongdong Li, and Yongxin Xiong
- Subjects
ARTIFICIAL neural networks ,IDENTIFICATION ,OPTIMIZATION algorithms ,ALGORITHMS ,MACHINE learning ,MATRICES (Mathematics) - Abstract
With the increasing demand for refined management of residential loads, non-intrusive load monitoring (NILM) technologies have attracted much attention in recent years. This paper proposes a novel method of residential load identification based on a load feature matrix and improved neural networks. Firstly, it constructs a unified-scale bitmap-format grayscale image consisting of multiple load features, including the V-I characteristic curve, the 1st-16th harmonic currents, the one-cycle steady-state current waveform, maximum and minimum current values, and active and reactive power. Secondly, it adopts a convolutional layer to extract image features and performs further feature extraction through a convolutional block attention module (CBAM). Thirdly, the feature matrix is converted and input to a bidirectional long short-term memory (BiLSTM) network for training and identification. Furthermore, the identification results are optimized with dynamic time warping (DTW). The effectiveness of the proposed method is verified on the commonly used PLAID database. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
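The dynamic time warping (DTW) step used to optimize the identification results can be illustrated by the classic dynamic-programming formulation of DTW; the absolute-difference local cost is an assumption, since the abstract does not specify one.

```python
def dtw(a, b):
    """Dynamic time warping distance between sequences a and b:
    classic O(len(a) * len(b)) DP with absolute-difference local cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

DTW's value for load identification is that two current waveforms of the same appliance remain close even when one is stretched or shifted in time.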
31. Research on ELoran Demodulation Algorithm Based on Multiclass Support Vector Machine.
- Author
-
Liu, Shiyao, Yan, Baorong, Guo, Wei, Hua, Yu, Zhang, Shougang, Lu, Jun, Xu, Lu, and Yang, Dong
- Subjects
SUPPORT vector machines ,PULSE modulation ,DEMODULATION ,ALGORITHMS ,EXHIBITIONS - Abstract
Demodulation and decoding are pivotal to the eLoran system's timing and information transmission capabilities. This paper proposes a novel demodulation algorithm leveraging a multiclass support vector machine (MSVM) for the pulse position modulation (PPM) of eLoran signals. Firstly, the existing demodulation method based on envelope phase detection (EPD) technology is reviewed and its limitations are highlighted. Secondly, a detailed exposition of the MSVM algorithm is presented, covering its theoretical foundations and its comparative advantages over the traditional method and several other methods considered in this study. Subsequently, through comprehensive experiments, the algorithm parameters are optimized, and different demodulation methods are compared in parallel in various complex environments. The test results show that the MSVM algorithm is significantly superior to traditional methods and other machine learning algorithms in demodulation accuracy and stability, particularly in high-noise and high-interference scenarios. This algorithm not only broadens the design approach for eLoran receivers but also fully meets the high-precision timing service requirements of the eLoran system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. NICE: an algorithm for nearest instance counterfactual explanations.
- Author
-
Brughmans, Dieter, Leyman, Pieter, and Martens, David
- Subjects
MACHINE learning ,ALGORITHMS ,EXPLANATION ,CLASSIFICATION - Abstract
In this paper we propose a new algorithm, named NICE, to generate counterfactual explanations for tabular data that specifically takes into account algorithmic requirements that often emerge in real-life deployments: (1) the ability to provide an explanation for all predictions, (2) the ability to handle any classification model (including non-differentiable ones), (3) efficiency in run time, and (4) the provision of multiple counterfactual explanations with different characteristics. More specifically, our approach exploits information from a nearest unlike neighbor to speed up the search process, by iteratively introducing feature values from this neighbor into the instance to be explained. We propose four versions of NICE: one without optimization and three that optimize the explanations for one of the following properties: sparsity, proximity, or plausibility. An extensive empirical comparison on 40 datasets shows that our algorithm outperforms the current state of the art in terms of these criteria. Our analyses show a trade-off between plausibility on the one hand and proximity or sparsity on the other, with our different optimization methods offering users the choice of the types of counterfactuals they prefer. An open-source implementation of NICE can be found at https://github.com/ADMAntwerp/NICE. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
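The core NICE loop, iteratively copying feature values from the nearest unlike neighbor into the instance until the model's prediction flips, can be sketched as below. This simplification substitutes features in index order rather than scoring candidate swaps as the optimized NICE variants do; note that `predict` can be any black-box callable, matching requirement (2).

```python
def nice_counterfactual(predict, instance, candidates):
    """Sketch of NICE's core idea: find the nearest 'unlike' neighbor (a known
    instance the model classifies differently), then copy its feature values
    into the instance one at a time until the prediction flips."""
    orig_cls = predict(instance)
    unlike = [c for c in candidates if predict(c) != orig_cls]
    # nearest unlike neighbor by number of differing features
    nun = min(unlike, key=lambda c: sum(a != b for a, b in zip(c, instance)))
    current = list(instance)
    for i, v in enumerate(nun):
        if current[i] != v:
            current[i] = v
            if predict(tuple(current)) != orig_cls:
                return tuple(current)  # prediction flipped: counterfactual found
    return tuple(current)
```

Because the search only ever moves toward an observed data point, the resulting counterfactual stays anchored to the data distribution, which is the source of NICE's plausibility advantage.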
33. DeepEPhishNet: a deep learning framework for email phishing detection using word embedding algorithms.
- Author
-
Somesha, M and Pais, Alwyn Roshan
- Subjects
DEEP learning ,PHISHING ,SOCIAL engineering (Fraud) ,ARTIFICIAL neural networks ,EMAIL ,ALGORITHMS ,MACHINE learning - Abstract
Email phishing is a social engineering scheme that uses spoofed emails to trick users into disclosing legitimate business and personal credentials. Many phishing email detection techniques exist based on machine learning, deep learning, and word embeddings. In this paper, we propose a new technique for detecting phishing emails using word embeddings (Word2Vec, FastText, and TF-IDF) and deep learning models (a DNN and a BiLSTM network). Our proposed technique uses only four header-based features of the emails (From, Return-Path, Subject, Message-ID) for classification. We applied several word embeddings in the evaluation of our models. From the experimental evaluation, we observed that the DNN model with FastText-SkipGram achieved an accuracy of 99.52% and the BiLSTM model with FastText-SkipGram achieved 99.42%; of the two, the DNN outperformed the BiLSTM with the same word embedding. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
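Of the embeddings mentioned, TF-IDF is simple enough to sketch from scratch; the smoothed idf formula log((1+N)/(1+df)) used here is one common variant, not necessarily the one the paper used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Minimal TF-IDF: `docs` is a list of token lists; returns one
    {term: weight} dict per document, weight = tf * log((1+N)/(1+df))."""
    n = len(docs)
    # document frequency: number of docs each term appears in
    df = Counter(t for doc in docs for t in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        result.append({
            t: (count / len(doc)) * math.log((1 + n) / (1 + df[t]))
            for t, count in tf.items()
        })
    return result
```

A term appearing in every document (like a boilerplate header token) gets weight zero, while terms distinctive to a few emails get high weight, which is exactly what makes TF-IDF useful for header-based phishing features.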
34. An end-to-end framework for private DGA detection as a service.
- Author
-
Maia, Ricardo J. M., Ray, Dustin, Pentyala, Sikha, Dowsley, Rafael, De Cock, Martine, Nascimento, Anderson C. A., and Jacobi, Ricardo
- Subjects
MACHINE learning ,DATA entry ,ALGORITHMS ,PRIVACY ,MALWARE - Abstract
Domain Generation Algorithms (DGAs) are used by malware to generate pseudorandom domain names to establish communication between infected bots and command and control servers. While DGAs can be detected by machine learning (ML) models with great accuracy, offering DGA detection as a service raises privacy concerns when it requires network administrators to disclose their DNS traffic to the service provider. The main scientific contribution of this paper is the first end-to-end framework for privacy-preserving classification-as-a-service of domain names into DGA (malicious) or non-DGA (benign) domains. Our framework achieves these goals through carefully designed protocols that combine two privacy-enhancing technologies (PETs), namely secure multi-party computation (MPC) and differential privacy (DP). Through MPC, our framework enables an enterprise network administrator to outsource the problem of classifying a DNS (Domain Name System) domain as DGA or non-DGA to an external organization without revealing any information about the domain name. Moreover, the service provider's ML model used for DGA detection is never revealed to the network administrator. Furthermore, by using DP, we ensure that the classification result cannot be used to learn information about individual entries of the training data. Finally, we leverage post-training float16 quantization of deep learning models in MPC to achieve efficient, secure DGA detection. We demonstrate that quantization achieves a significant speed-up, resulting in a 23% to 42% reduction in inference runtime without reducing accuracy, using a three-party secure computation protocol tolerating one corruption. Previous solutions are not end-to-end private, do not provide differential privacy guarantees for the model's outputs, and assume that model embeddings are publicly known. Our best protocol in terms of accuracy runs in about 0.22 s. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Single-Machine Scheduling with Simultaneous Learning Effects and Delivery Times.
- Author
-
Liu, Zheng and Wang, Ji-Bo
- Subjects
MACHINE learning ,HEURISTIC algorithms ,TARDINESS ,CONSUMERS ,ALGORITHMS ,COMPUTER scheduling - Abstract
This paper studies the single-machine scheduling problem with a truncated learning effect, time-dependent processing times, and past-sequence-dependent delivery times. The delivery time is the time taken for a job to reach the customer after its processing is complete. The goal is to determine an optimal job schedule minimizing the total weighted completion time and the maximum tardiness. To solve the general case of the problem, we propose a branch-and-bound algorithm and several heuristic algorithms. Computational experiments confirm the effectiveness of the given algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
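The abstract does not state the exact processing-time model, so the following toy evaluator only illustrates the flavor of a truncated position-based learning effect combined with past-sequence-dependent delivery times; the formulas and the parameter values (a, beta, delta) are assumptions, not the paper's model.

```python
def evaluate_schedule(jobs, a=-0.3, beta=0.5, delta=0.1):
    """Toy evaluation of a single-machine schedule. `jobs` is an ordered
    list of (base_time, weight, due_date). The actual time of the job in
    position r is base_time * max(r**a, beta), i.e. a learning effect
    truncated at beta; delivery time is delta times elapsed processing.
    Returns (total weighted completion time, maximum tardiness)."""
    t = 0.0
    twc = 0.0
    tmax = 0.0
    for r, (p, w, d) in enumerate(jobs, start=1):
        t += p * max(r ** a, beta)     # truncated learning effect
        completion = t + delta * t     # past-sequence-dependent delivery
        twc += w * completion
        tmax = max(tmax, completion - d)
    return twc, max(tmax, 0.0)
```

A branch-and-bound solver would call an evaluator like this at each leaf and prune partial sequences whose lower bound already exceeds the best objective found.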
36. Prioritized Experience Replay–Based Path Planning Algorithm for Multiple UAVs.
- Author
-
Ren, Chongde, Chen, Jinchao, Du, Chenglie, and Chen, Pengyun
- Subjects
MACHINE learning ,REINFORCEMENT learning ,DRONE aircraft ,COMBINATORIAL optimization ,ALGORITHMS - Abstract
Unmanned aerial vehicles (UAVs) have been extensively researched and deployed in both military and civilian applications due to their small size, low cost, and ease of use. Although UAVs working together on complicated jobs can significantly increase productivity and reduce costs, cooperation raises major path planning issues. In complex environments, the path planning problem, a hard-to-solve multiconstraint combinatorial optimization problem, requires considering numerous constraints and limitations while generating the best path for each UAV to accomplish group tasks. In this paper, we study the path planning problem for multiple UAVs and propose a reinforcement learning algorithm, PERDE-MADDPG, based on prioritized experience replay (PER) and delayed updates. First, we adopt a PER mechanism based on temporal difference (TD) error to enhance the efficiency of experience utilization and accelerate the convergence of the algorithm. Second, we use delayed updates of the network parameters to ensure stability when training multiple agents. Finally, the PERDE-MADDPG algorithm is evaluated against the MATD3, MADDPG, and SAC methods in simulation scenarios to confirm its efficacy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
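The TD-error-based prioritized experience replay mechanism can be sketched as a buffer that samples transitions with probability proportional to |TD error|^alpha. The alpha value and the small additive constant are conventional assumptions, and the importance-sampling correction used in full PER is omitted for brevity.

```python
import random

class PrioritizedReplayBuffer:
    """Sketch of prioritized experience replay: transitions with larger
    absolute TD error are sampled more often."""

    def __init__(self, alpha=0.6, seed=0):
        self.alpha = alpha
        self.rng = random.Random(seed)
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        # small constant keeps zero-error transitions sampleable
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        idx = self.rng.choices(range(len(self.data)),
                               weights=self.priorities, k=batch_size)
        return [self.data[i] for i in idx]
```

A production implementation would use a sum-tree for O(log n) sampling and update priorities after each learning step as TD errors change.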
37. A novel interpretable predictive model based on ensemble learning and differential evolution algorithm for surface roughness prediction in abrasive water jet polishing.
- Author
-
Xie, Shutong, He, Zongbao, Loh, Yee Man, Yang, Yu, Liu, Kunhong, Liu, Chao, Cheung, Chi Fai, Yu, Nan, and Wang, Chunjin
- Subjects
DIFFERENTIAL evolution ,WATER jets ,PREDICTION models ,MACHINE learning ,ABRASIVES ,ALGORITHMS ,GRINDING & polishing ,SURFACE roughness - Abstract
As an important indicator of the surface quality of workpieces, surface roughness has a great impact on production costs and the quality performance of the finished components. Effective surface roughness prediction can not only increase productivity but also reduce costs. However, current methods for surface roughness prediction have limitations. On the one hand, the prediction accuracy of classical experimental and statistical surface roughness prediction methods is low. On the other hand, the results of deep learning-based surface roughness prediction methods are uninterpretable due to their black-box learning mechanism. Therefore, this paper presents an ensemble learning method with a differential evolution algorithm, applies it to surface roughness prediction in abrasive water jet polishing (AWJP), and conducts an interpretability analysis to identify the key factors contributing to surface roughness prediction accuracy. First, features are automatically constructed by an Evolution Forest algorithm to train the base regression models. A differential evolution algorithm with a simplified encoding mechanism is then used to search for the best weighted ensemble of the base regression models, yielding highly accurate predictions. Extensive experiments on AWJP validate the effectiveness of the proposed methods. The results show that the prediction accuracy of the proposed method is higher than that of existing machine learning algorithms. In addition, this is the first work to analyze the contributions of machining parameters (i.e., features) to surface roughness prediction using interpretable analysis methods. The analysis results can provide a reference for subsequent experiments and studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
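The search for the best weighted ensemble via differential evolution can be sketched with a classic DE/rand/1/bin loop over ensemble weights. The loss (squared error of the weighted average of base-model predictions) and all hyperparameters are assumptions for illustration, not the paper's simplified encoding mechanism.

```python
import random

def de_ensemble_weights(preds, target, pop=20, iters=100, F=0.5, CR=0.9, seed=0):
    """Differential evolution (DE/rand/1/bin) searching for non-negative
    ensemble weights minimizing the squared error of the weighted average
    of base-model predictions. `preds` is a list of prediction vectors."""
    rng = random.Random(seed)
    k = len(preds)

    def loss(w):
        s = sum(w) or 1e-12
        return sum(
            (sum(w[m] * preds[m][i] for m in range(k)) / s - t) ** 2
            for i, t in enumerate(target)
        )

    P = [[rng.random() for _ in range(k)] for _ in range(pop)]
    for _ in range(iters):
        for i in range(pop):
            # mutate three distinct individuals, crossover, clamp at 0
            a, b, c = rng.sample([x for x in range(pop) if x != i], 3)
            trial = [
                max(0.0, P[a][d] + F * (P[b][d] - P[c][d]))
                if rng.random() < CR else P[i][d]
                for d in range(k)
            ]
            if loss(trial) <= loss(P[i]):  # greedy selection
                P[i] = trial
    best = min(P, key=loss)
    s = sum(best) or 1.0
    return [w / s for w in best]
```

In practice `target` would be held-out validation labels, so the evolved weights favor base models that generalize rather than those that merely fit the training set.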
38. Research on Model Selection-Based Weighted Averaged One-Dependence Estimators.
- Author
-
Zhang, Chengzhen, Chen, Shenglei, and Ke, Huihang
- Subjects
BAYESIAN analysis ,MACHINE learning ,CLASSIFICATION ,ALGORITHMS - Abstract
The Averaged One-Dependence Estimators (AODE) is a popular and effective method of Bayesian classification. In AODE, selecting the optimal sub-model based on a cross-validated risk minimization strategy can further enhance classification performance. However, existing cross-validation risk minimization strategies do not consider the differing contributions of attributes to classification decisions. Consequently, this paper introduces an algorithm for Model Selection-based Weighted AODE (SWAODE). To express these differences, the one-dependence estimators (ODEs) corresponding to individual attributes are weighted, with mutual information, a measure commonly used in machine learning, adopted as the weights. These weighted sub-models are then evaluated and selected using leave-one-out cross-validation (LOOCV) to determine the best model. The new method improves the accuracy and robustness of the model and adapts better to different data features, thereby enhancing the performance of the classification algorithm. Experimental results indicate that the algorithm merges the benefits of weighting with model selection, markedly enhancing the classification performance of the AODE algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
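The mutual-information weighting in entry 38 can be sketched with a short empirical estimator: one weight per attribute, equal to I(attribute; class) in bits. This is a generic illustration (the SWAODE paper's exact estimator and any smoothing are not specified here).

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)           # marginal counts of X
    py = Counter(ys)           # marginal counts of Y
    pxy = Counter(zip(xs, ys)) # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p_joint / (p_x * p_y) simplifies to c * n / (px[x] * py[y])
        mi += p_joint * log2(c * n / (px[x] * py[y]))
    return mi

def attribute_weights(rows, labels):
    """One weight per attribute column: I(attribute; class)."""
    cols = list(zip(*rows))
    return [mutual_information(col, labels) for col in cols]
```

An attribute that determines the class receives full weight (the class entropy), while an independent attribute receives weight zero.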
39. Enhanced Computational Biased Proportional Navigation with Neural Networks for Impact Time Control.
- Author
-
Zhang, Xue and Hong, Haichao
- Subjects
PROPORTIONAL navigation ,MACHINE learning ,ALGORITHMS - Abstract
Advanced computational methods are being applied to address traditional guidance problems, yet research is still ongoing on how to utilize them effectively and scientifically. A numerical root-finding method was previously proposed to determine the bias in biased proportional navigation so as to achieve impact time control without time-to-go estimation. However, the root-finding algorithm in the original method may experience efficiency and convergence issues. This paper introduces an enhanced method based on neural networks, where the bias is output directly by the neural networks, significantly improving computational efficiency and addressing the convergence issues. The novelty of this method lies in the development of a reasonable structure that appropriately integrates off-the-shelf machine learning techniques to effectively enhance the original iteration-based method. In addition to demonstrating the method's effectiveness and performance, two comparative scenarios are presented: (a) evaluating the time consumption when both the proposed and the original methods operate at the same update frequency, and (b) comparing the achievable update frequencies of both methods under the condition of equal real-world time usage. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Three-layer data center-based intelligent slice admission control algorithm for C-RAN using approximate reinforcement learning.
- Author
-
Khani, Mohsen, Jamali, Shahram, and Sohrabi, Mohammad Karim
- Subjects
MACHINE learning ,RADIO access networks ,5G networks ,ALGORITHMS ,REINFORCEMENT learning - Abstract
C-RAN (Cloud Radio Access Network) is a 5G architecture that consists of sites and three-layer Data Centers (DCs), which include the central office DC, local DC, and regional DC. Network slicing, which enables infrastructure providers (InP) to create independent logical networks, is essential in this architecture. By utilizing this technology, InPs can maximize the utility of the network by providing slices to service providers in response to their slice requests. However, almost all of the recent research on slice admission control (SAC) schemes has only considered one or two layers of DCs, which limits the efficiency of the slicing process and decreases network utility. To address these issues, this paper proposes an intelligent SAC scheme called ISAC that considers all three-layer DCs. Instead of relying on reinforcement learning algorithms like Q-learning, which are effective in discrete environments with limited state space but give poor performance in continuous environments, ISAC employs the Approximate Reinforcement Learning (ARL) algorithm. ARL is better suited for 5G network modeling because it can adapt to continuous environments, allowing for a more accurate representation of the underlying physical processes. Extensive simulation studies demonstrate that ISAC significantly improves performance in terms of slice request rejection rates, InP revenue, accepting more slices, and optimizing resource utilization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Deep Learning System for User Identification Using Sensors on Doorknobs.
- Author
-
Vegas, Jesús, Rao, A. Ravishankar, and Llamas, César
- Subjects
DEEP learning ,SYSTEM identification ,MACHINE learning ,DOOR knobs ,ALGORITHMS ,GYROSCOPES - Abstract
Door access control systems are important to protect the security and integrity of physical spaces. Accuracy and speed are important factors that govern their performance. In this paper, we investigate a novel approach to identify users by measuring patterns of their interactions with a doorknob via an embedded accelerometer and gyroscope and by applying deep-learning-based algorithms to these measurements. Our identification results obtained from 47 users show an accuracy of 90.2%. When the sex of the user is used as an input feature, the accuracy is 89.8% in the case of male individuals and 97.0% in the case of female individuals. We study how the accuracy is affected by the sample duration, finding that it is possible to identify users from a 0.5 s sample with an accuracy of 68.5%. Our results demonstrate the feasibility of using patterns of motor activity for access control, thus extending the set of alternatives to be considered for behavioral biometrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. A Fair Contribution Measurement Method for Federated Learning.
- Author
-
Guo, Peng, Yang, Yanqing, Guo, Wei, and Shen, Yanping
- Subjects
FEDERATED learning ,COOPERATIVE game theory ,DATA privacy ,MACHINE learning ,ALGORITHMS - Abstract
Federated learning is an effective approach for preserving data privacy and security, enabling machine learning to occur in a distributed environment and promoting its development. However, an urgent problem that needs to be addressed is how to encourage active client participation in federated learning. The Shapley value, a classical concept in cooperative game theory, has been utilized for data valuation in machine learning services. Nevertheless, existing numerical evaluation schemes based on the Shapley value are impractical, as they necessitate additional model training, leading to increased communication overhead. Moreover, participants' data may exhibit non-IID characteristics, posing a significant challenge to evaluating participant contributions: non-IID data degrade the accuracy of the global model, weaken the marginal effect of individual participants, and lead to underestimated contribution measurements. Current work often overlooks the impact of heterogeneity on model aggregation. This paper presents a fair federated learning contribution measurement scheme that avoids the need for additional model computations. By introducing a novel aggregation weight, it enhances the accuracy of the contribution measurement. Experiments on the MNIST and Fashion-MNIST datasets show that the proposed method can accurately compute the contributions of participants. Compared to existing baseline algorithms, the model accuracy is significantly improved, with a similar time cost. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
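The Shapley-value contribution measurement discussed in entry 42 has a simple exact form that existing schemes approximate: each participant's value is its average marginal contribution over all join orders. The sketch below is the textbook exponential-time computation, not the paper's scheme; the `utility` callable stands in for a per-coalition model quality (e.g. validation accuracy), which in practice is what makes the exact version impractical.

```python
from itertools import permutations

def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings.

    `utility` maps a frozenset of players to a coalition value
    (e.g. the accuracy of a model trained on that coalition's data).
    Exponential in the number of players; fine for small federations.
    """
    players = list(players)
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            phi[p] += utility(with_p) - utility(coalition)  # marginal gain of p
            coalition = with_p
    for p in phi:
        phi[p] /= len(perms)
    return phi
```

For an additive utility each participant's Shapley value is exactly its individual contribution, and the values always sum to the grand-coalition utility (the efficiency axiom).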
43. A Reinforcement and Imitation Learning-Based Communication Mechanism for Identifying Disruptors in Multi-Agent Path Finding.
- Author
-
李梦甜, 向颖岑, 谢志峰, and 马利庄
- Subjects
MACHINE learning ,REINFORCEMENT learning ,PROBLEM solving ,ALGORITHMS ,SCALABILITY - Abstract
Copyright of Application Research of Computers / Jisuanji Yingyong Yanjiu is the property of Application Research of Computers Edition and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
44. Research on Unsupervised Feature Point Prediction Algorithm for Multigrid Image Stitching.
- Author
-
Li, Jun, Chen, Yufeng, and Mu, Aiming
- Subjects
MACHINE learning ,MATRIX inversion ,FIX-point estimation ,ALGORITHMS ,PARAMETERIZATION - Abstract
The conventional feature point-based image stitching algorithm exhibits inconsistencies in the quality of feature points across diverse scenes. This may result in the deterioration of the alignment effect or even the inability to align two images. To address this issue, this paper presents an unsupervised multigrid image alignment method that integrates the conventional feature point-based image alignment algorithm with deep learning techniques. The method postulates that the feature points are uniformly distributed in the image and employs a deep learning network to predict their displacements, thereby enhancing the robustness of the feature points. Furthermore, the precision of image alignment is enhanced through the parameterization of APAP (As-projective-as-possible image stitching with moving DLT) multigrid deformation. Ultimately, based on the symmetry exhibited by the homography matrix and its inverse matrix throughout the projection process, image chunking inverse warping is introduced to obtain the stitched images for the multigrid deep learning network. Additionally, the mesh shape-preserving loss is introduced to constrain the shape of the multigrid. The experimental results demonstrate that in the real-world UDIS-D dataset, the method achieves notable improvements in feature point matching and homography estimation tasks, and exhibits superior alignment performance on the traditional image stitching dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Improved Behavioral Cloning and DDPG's Driverless Decision Model.
- Author
-
LI Weidong, HUANG Zhenzhu, HE Jingwu, MA Caoyuan, and GE Cheng
- Subjects
MACHINE learning ,ROAD maintenance ,REINFORCEMENT learning ,RACING automobiles ,ALGORITHMS - Abstract
The key to driverless technology is that the decision-making layer issues accurate instructions based on the input information from the perception stage. Reinforcement learning and imitation learning are better suited to complex scenarios than traditional rule-based methods. However, imitation learning as represented by behavioral cloning suffers from compounding errors, and this paper uses a prioritized experience replay algorithm to improve behavioral cloning and strengthen the model's ability to fit the demonstration dataset. The original DDPG (deep deterministic policy gradient) algorithm suffers from low exploration efficiency, so experience pool separation and random network distillation (RND) are used to improve the DDPG algorithm and raise its training efficiency. The improved algorithms are used for joint training to reduce useless exploration in the early stage of DDPG training. Verified on the TORCS (The Open Racing Car Simulator) platform, the experimental results show that the proposed method achieves more stable lane keeping, speed maintenance, and obstacle avoidance within the same number of training runs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
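The prioritized experience replay used in entry 45 can be sketched with a minimal proportional-priority buffer. This is a simplified, list-based illustration: the `alpha` exponent and epsilon floor are the usual conventions rather than the paper's settings, sampling is O(N) where a sum-tree would be O(log N), and the importance-sampling weight correction is omitted.

```python
import random

class PrioritizedReplay:
    """Proportional prioritized experience replay (simplified, list-based).

    Transitions with larger TD error are sampled more often, focusing
    updates on the experiences the model currently fits worst.
    """
    def __init__(self, capacity, alpha=0.6, seed=0):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer, self.priorities = [], []
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        p = (abs(td_error) + 1e-6) ** self.alpha  # epsilon keeps p > 0
        if len(self.buffer) >= self.capacity:     # drop oldest when full
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size):
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        idx = self.rng.choices(range(len(self.buffer)), weights=weights, k=batch_size)
        return [self.buffer[i] for i in idx]
```

A high-error transition dominates the sampled batches, which is exactly the behaviour used here to bias behavioral cloning toward poorly fitted demonstrations.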
46. Advancing Renewable Energy Forecasting: A Comprehensive Review of Renewable Energy Forecasting Methods.
- Author
-
Teixeira, Rita, Cerveira, Adelaide, Pires, Eduardo J. Solteiro, and Baptista, José
- Subjects
CLEAN energy ,RENEWABLE energy sources ,OPTIMIZATION algorithms ,RENEWABLE natural resources ,ENERGY consumption ,WIND power - Abstract
Socioeconomic growth and population increase are driving a constant global demand for energy. Renewable energy is emerging as a leading solution to minimise the use of fossil fuels. However, renewable resources are characterised by significant intermittency and unpredictability, which impact their energy production and integration into the power grid. Forecasting models are increasingly being developed to address these challenges and have become crucial as renewable energy sources are integrated in energy systems. In this paper, a comparative analysis of forecasting methods for renewable energy production is developed, focusing on photovoltaic and wind power. A review of state-of-the-art techniques is conducted to synthesise and categorise different forecasting models, taking into account climatic variables, optimisation algorithms, pre-processing techniques, and various forecasting horizons. By integrating diverse techniques such as optimisation algorithms and pre-processing methods and carefully selecting the forecast horizon, it is possible to highlight the accuracy and stability of forecasts. Overall, the ongoing development and refinement of forecasting methods are crucial to achieve a sustainable and reliable energy future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Clustering Network Traffic Using Semi-Supervised Learning.
- Author
-
Krajewska, Antonina and Niewiadomska-Szynkiewicz, Ewa
- Subjects
MACHINE learning ,MATRIX decomposition ,COMPUTER network traffic ,NONNEGATIVE matrices ,ALGORITHMS - Abstract
Clustering algorithms play a crucial role in early warning cybersecurity systems. They allow for the detection of new attack patterns and anomalies and enhance system performance. This paper discusses the problem of clustering data collected by a distributed system of network honeypots. In the proposed approach, when a network flow matches an attack signature, an appropriate label is assigned to it. This enables the use of semi-supervised learning algorithms and improves the quality of clustering results. The article compares the results of learning algorithms conducted with and without partial supervision, particularly non-negative matrix factorization and semi-supervised non-negative matrix factorization. Our results confirm the positive impact of labeling a portion of flows on the quality of clustering. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
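The non-negative matrix factorization compared in entry 47 is, in its plain form, a pair of Lee-Seung multiplicative updates. The sketch below shows only that unsupervised baseline (assuming NumPy); the semi-supervised variant the paper evaluates adds a label-consistency term to the objective, which is not reproduced here.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Basic NMF via Lee-Seung multiplicative updates, minimising ||V - WH||_F.

    V (n x m, non-negative) is factored into W (n x rank) and H (rank x m).
    The multiplicative form keeps both factors non-negative throughout.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update W with H fixed
    return W, H
```

On data that is exactly low-rank, the updates drive the reconstruction error close to zero while both factors stay non-negative, which is why cluster assignments can be read off the columns of W.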
48. Maximizing intrusion detection efficiency for IoT networks using extreme learning machine.
- Author
-
Altamimi, Shahad and Abu Al-Haija, Qasem
- Subjects
MACHINE learning ,SUPERVISED learning ,INTERNET of things ,TELECOMMUNICATION systems ,CYBERTERRORISM ,ALGORITHMS - Abstract
Intrusion Detection Systems (IDSs) are crucial for safeguarding modern IoT communication networks against cyberattacks. IDSs must exhibit exceptional performance, low false positive rates, and significant flexibility in constructing attack patterns to efficiently identify and neutralize these attacks. This research paper discusses the use of an Extreme Learning Machine (ELM) as a new technique to enhance the performance of IDSs. The study utilizes two standard IDS-based IoT network datasets: NSL-KDD 2009 and Distilled-Kitsune 2021. Both datasets are used to assess the effectiveness of ELM in a conventional supervised learning setting. The study investigates the capacity of the ELM algorithm to handle high-dimensional and unbalanced data, indicating its potential to enhance IDS accuracy and efficiency. The research also examines the setup of ELM for both NSL-KDD and Kitsune using Python and Google Colab to perform binary and multi-class classification. The experimental evaluation revealed the proficient performance of the proposed ELM-based IDS compared with other implemented supervised learning-based IDSs and other state-of-the-art models in the same study area. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
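The Extreme Learning Machine in entry 48 has a very compact canonical form: a random, untrained hidden layer followed by a closed-form least-squares solve for the output weights, which is what makes it fast on large IDS datasets. The sketch below assumes NumPy; the hidden size, tanh activation, and seed are illustrative choices, not the paper's configuration.

```python
import numpy as np

def elm_train(X, y, hidden=64, seed=0):
    """Extreme Learning Machine: random hidden layer, closed-form output weights.

    The input-to-hidden weights W and biases b are drawn once and never
    trained; only beta is fitted, via the Moore-Penrose pseudoinverse.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)        # random nonlinear feature map
    beta = np.linalg.pinv(H) @ y  # least-squares output layer
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Because training reduces to a single pseudoinverse, fitting is orders of magnitude cheaper than backpropagation, at the cost of needing enough random hidden units to cover the feature space.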
49. Predictive Modelling Of Stress Levels: A Comparative Analysis Of Machine Learning Algorithms.
- Author
-
Sasane, Sumit and Mulla, Zameer Ahmed S.
- Subjects
MACHINE learning ,PREDICTION models ,COMPARATIVE studies ,WELL-being ,ALGORITHMS - Abstract
This research paper investigates the efficacy of various machine learning algorithms in predicting stress levels. By employing a diverse set of algorithms, including [List of Algorithms], we aim to identify the most accurate and reliable model for stress prediction. The study utilizes [Dataset Information] to train and test the algorithms, evaluating their performance based on metrics such as accuracy, precision, recall, and F1 score. The findings will contribute to the development of more effective stress prediction models, with potential applications in healthcare, workplace wellness, and personal well-being. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. Extended Isolation Forest for Intrusion Detection in Zeek Data.
- Author
-
Moomtaheen, Fariha, Bagui, Sikha S., Bagui, Subhash C., and Mink, Dustin
- Subjects
COMPUTER network traffic ,ALGORITHMS ,RECONNAISSANCE operations - Abstract
The novelty of this paper is in determining and using hyperparameters to improve the Extended Isolation Forest (EIF) algorithm, a relatively new algorithm, to detect malicious activities in network traffic. The EIF algorithm is a variation of the Isolation Forest algorithm, known for its efficacy in detecting anomalies in high-dimensional data. Our research assesses the performance of the EIF model on a newly created dataset composed of Zeek Connection Logs, UWF-ZeekDataFall22. To handle the enormous volume of data involved in this research, the Hadoop Distributed File System (HDFS) is employed for efficient and fault-tolerant storage, and the Apache Spark framework, a powerful open-source Big Data analytics platform, is utilized for machine learning (ML) tasks. The best results for the EIF algorithm came from the 0-extension level. We obtained an accuracy of 82.3% for the Resource Development tactic, 82.21% for the Reconnaissance tactic, and 78.3% for the Discovery tactic. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
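The Extended Isolation Forest in entry 50 generalises Isolation Forest by cutting with randomly oriented hyperplanes instead of axis-parallel splits (the "extension level" controls how many dimensions the slope may use). The sketch below is a minimal fully-extended variant assuming NumPy; the subsample size, tree count, and depth limit are illustrative defaults, not the tuned hyperparameters the paper reports.

```python
import numpy as np

def c_factor(n):
    """Average unsuccessful-search path length in a BST of n points."""
    if n <= 1:
        return 0.0
    return 2.0 * (np.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(X, rng, depth, max_depth):
    if depth >= max_depth or len(X) <= 1:
        return {"size": len(X)}
    n = rng.normal(size=X.shape[1])          # random slope (fully extended cut)
    proj = X @ n
    p = rng.uniform(proj.min(), proj.max())  # random intercept inside the data
    left = proj < p
    return {"n": n, "p": p,
            "left": build_tree(X[left], rng, depth + 1, max_depth),
            "right": build_tree(X[~left], rng, depth + 1, max_depth)}

def path_length(x, node, depth=0):
    if "size" in node:
        return depth + c_factor(node["size"])
    branch = "left" if x @ node["n"] < node["p"] else "right"
    return path_length(x, node[branch], depth + 1)

def eif_scores(X_train, X_test, trees=100, sample=64, seed=0):
    """Anomaly score in (0, 1]; higher means the point isolates sooner."""
    rng = np.random.default_rng(seed)
    size = min(sample, len(X_train))
    max_depth = int(np.ceil(np.log2(size)))
    forest = []
    for _ in range(trees):
        idx = rng.choice(len(X_train), size=size, replace=False)
        forest.append(build_tree(X_train[idx], rng, 0, max_depth))
    scores = []
    for x in X_test:
        h = np.mean([path_length(x, t) for t in forest])
        scores.append(2.0 ** (-h / c_factor(size)))
    return np.array(scores)
```

Points far from the training mass are separated by the first few random cuts, so their average path length is short and their score is high.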