8,151 results on '"clustering algorithms"'
Search Results
2. Predictive Web Prefetching: A Combined Approach Using Clustering Algorithms and WEKA in High-Traffic Settings
- Author
-
Ajibesin, Adeyimi Abel, Vajjhala, Narasimha Rao, Joel, Ernest, Rakshit, Sandip, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Lin, Frank, editor, Pastor, David, editor, Kesswani, Nishtha, editor, Patel, Ashok, editor, Bordoloi, Sushanta, editor, and Koley, Chaitali, editor
- Published
- 2025
- Full Text
- View/download PDF
3. Unsupervised algorithms to identify potential under-coding of secondary diagnoses in hospitalisations databases in Portugal.
- Author
-
Portela, Diana, Amaral, Rita, Rodrigues, Pedro P, Freitas, Alberto, Costa, Elísio, Fonseca, João A, and Sousa-Pinto, Bernardo
- Subjects
- *
DIAGNOSIS of diabetes , *ASTHMA diagnosis , *CLINICAL medicine , *MEDICAL information storage & retrieval systems , *CLUSTER analysis (Statistics) , *HOSPITAL care , *SEX distribution , *HOSPITALS , *HOSPITAL mortality , *DESCRIPTIVE statistics , *DIAGNOSTIC errors , *MEDICAL records , *MEDICAL coding , *MANAGEMENT of medical records , *DATA quality , *DATA analysis software , *COMORBIDITY , *ALGORITHMS , *REGRESSION analysis - Abstract
Background: Quantifying and dealing with lack of consistency in administrative databases (namely, under-coding) requires tracking patients longitudinally without compromising anonymity, which is often a challenging task. Objective: This study aimed to (i) assess and compare different hierarchical clustering methods on the identification of individual patients in an administrative database that does not easily allow tracking of episodes from the same patient; (ii) quantify the frequency of potential under-coding; and (iii) identify factors associated with such phenomena. Method: We analysed the Portuguese National Hospital Morbidity Dataset, an administrative database registering all hospitalisations occurring in Mainland Portugal between 2011–2015. We applied different approaches of hierarchical clustering methods (either isolated or combined with partitional clustering methods), to identify potential individual patients based on demographic variables and comorbidities. Diagnosis codes were grouped into the Charlson and Elixhauser comorbidity defined groups. The algorithm displaying the best performance was used to quantify potential under-coding. A generalised mixed model (GML) of binomial regression was applied to assess factors associated with such potential under-coding. Results: We observed that the hierarchical cluster analysis (HCA) + k-means clustering method with comorbidities grouped according to the Charlson defined groups was the algorithm displaying the best performance (with a Rand Index of 0.99997). We identified potential under-coding in all Charlson comorbidity groups, ranging from 3.5% (overall diabetes) to 27.7% (asthma). Overall, being male, having medical admission, dying during hospitalisation or being admitted at more specific and complex hospitals were associated with increased odds of potential under-coding.
Discussion: We assessed several approaches to identify individual patients in an administrative database and, subsequently, by applying HCA + k-means algorithm, we tracked coding inconsistency and potentially improved data quality. We reported consistent potential under-coding in all defined groups of comorbidities and potential factors associated with such lack of completeness. Conclusion: Our proposed methodological framework could both enhance data quality and act as a reference for other studies relying on databases with similar problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
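The abstract above reports a Rand Index of 0.99997 for the best-performing method. As a minimal sketch of how that index is defined (this is not the authors' code, and the function name `rand_index` and the toy labels are assumptions made for illustration), it is the fraction of item pairs on which two labelings agree:

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs that two labelings treat the same way.

    A pair "agrees" when both labelings put the two items in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    # With fewer than two items there are no pairs to disagree on.
    return agree / total if total else 1.0


# Hypothetical toy labelings: cluster names differ, grouping is identical.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The index depends only on which items are grouped together, not on the cluster names, which is why relabelled but identical partitions score 1.0.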
4. Use of Patterns of Service Utilization and Hierarchical Survival Analysis in Planning and Providing Care for Overdose Patients and Predicting the Time-to-Second Overdose.
- Author
-
Bambi, Jonas, Olobatuyi, Kehinde, Santoso, Yudi, Sadri, Hanieh, Moselle, Ken, Rudnick, Abraham, Dong, Gracia Yunruo, Chang, Ernie, and Kuo, Alex
- Subjects
MACHINE learning ,NATURAL language processing ,OPIOID epidemic ,CONTINUUM of care ,SURVIVAL analysis (Biometry) - Abstract
Individuals from a variety of backgrounds are affected by the opioid crisis. To provide optimal care for individuals at risk of opioid overdose and prevent subsequent overdoses, a more targeted response that goes beyond the traditional taxonomical diagnosis approach to care management needs to be adopted. In previous works, Graph Machine Learning and Natural Language Processing methods were used to model the products for planning and evaluating the treatment of patients with complex issues. This study proposes a methodology of partitioning patients in the opioid overdose cohort into various communities based on their patterns of service utilization (PSUs) across the continuum of care using graph community detection and applying survival analysis to predict time-to-second overdose for each of the communities. The results demonstrated that the overdose cohort is not homogeneous with respect to the determinants of risk. Moreover, the risk for subsequent overdose was quantified: there is a 51% higher chance of experiencing a second overdose for a high-risk community compared to a low-risk community. The proposed method can inform a more efficient treatment heterogeneity approach for a cohort made of diverse individuals, such as the opioid overdose cohort. It can also guide targeted support for patients at risk of subsequent overdoses. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Feature selection with clustering probabilistic particle swarm optimization.
- Author
-
Gao, Jinrui, Wang, Ziqian, Lei, Zhenyu, Wang, Rong-Long, Wu, Zhengwei, and Gao, Shangce
- Abstract
Dealing with high-dimensional data poses a significant challenge in machine learning. To address this issue, researchers have proposed feature selection as a viable solution. Due to the intricate search space involved in feature selection, swarm intelligence algorithms have gained popularity for their exceptional search capabilities. This study introduces a method called Clustering Probabilistic Particle Swarm Optimization (CPPSO) to revolutionize the traditional particle swarm optimization approach by incorporating probabilities to represent velocity and incorporating an elitism mechanism. Furthermore, CPPSO employs a clustering strategy based on the K-means algorithm, utilizing the Hamming distance to divide the population into two sub-populations to improve the performance. To assess CPPSO's performance, a comparative analysis is conducted against seven existing algorithms using twenty diverse datasets. These datasets are all based on real-world problems. Fifteen of these are frequently used in feature selection research, while the remaining five comprise imbalanced datasets as well as multi-label datasets. The experimental results demonstrate the superiority of CPPSO across a range of evaluation criteria, surpassing the performance of the comparative algorithms on the majority of the datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
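The abstract above mentions dividing a population into two sub-populations with a K-means-style strategy under the Hamming distance. The sketch below illustrates that general idea only; it is not CPPSO. The deterministic seeding, the majority-vote centre update, and all names are assumptions chosen for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def majority_center(members):
    """Bitwise majority vote over a group (ties resolve to 1)."""
    n = len(members)
    return tuple(1 if 2 * sum(col) >= n else 0 for col in zip(*members))


def split_population(population, iterations=10):
    """Split binary vectors into two groups around two Hamming centres."""
    # Assumed seeding: the first vector and the vector farthest from it.
    first = population[0]
    second = max(population, key=lambda p: hamming(p, first))
    centers = [tuple(first), tuple(second)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in population:
            nearer = 0 if hamming(p, centers[0]) <= hamming(p, centers[1]) else 1
            groups[nearer].append(p)
        # Keep the old centre if a group happens to be empty.
        new_centers = [majority_center(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return groups


# Hypothetical feature-selection masks (1 = feature selected).
masks = [(1, 1, 1, 0), (1, 1, 0, 0), (1, 1, 1, 1),
         (0, 0, 0, 1), (0, 0, 1, 1), (0, 0, 0, 0)]
group_a, group_b = split_population(masks)
print(group_a, group_b)
```

A majority vote is used for the centre update because the arithmetic mean of bit vectors is not itself a bit vector.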
6. A deep learning object detection method to improve cluster analysis of two-dimensional data.
- Author
-
Couturier, Raphaël, Gregori, Pablo, Noura, Hassan, Salman, Ola, and Sider, Abderrahmane
- Subjects
OBJECT recognition (Computer vision) ,DEEP learning ,INFORMATION retrieval ,CLUSTER analysis (Statistics) ,BIG data - Abstract
Clustering is an unsupervised machine learning method grouping data samples into clusters of similar objects, used as a system support tool in numerous applications such as banking customers profiling, document retrieval, image segmentation, and e-commerce recommendation engines. The effectiveness of several clustering techniques is sensitive to the initialization parameters, and different solutions have been proposed in the literature to overcome this limitation. They require high computational memory consumption when dealing with big data. In this paper, we propose the application of a recent object detection Deep Learning model (YOLO-v5) for assisting the initialization of classical techniques and improving their effectiveness on two-variate datasets, leveraging the accuracy and reducing dramatically the memory and time consumption of classical clustering methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Clustering Molecules at a Large Scale: Integrating Spectral Geometry with Deep Learning.
- Author
-
Akgüller, Ömer, Balcı, Mehmet Ali, and Cioca, Gabriela
- Subjects
- *
SPECTRAL geometry , *MACHINE learning , *DEEP learning , *GEOMETRIC approach , *K-means clustering - Abstract
This study conducts an in-depth analysis of clustering small molecules using spectral geometry and deep learning techniques. We applied a spectral geometric approach to convert molecular structures into triangulated meshes and used the Laplace–Beltrami operator to derive significant geometric features. By examining the eigenvectors of these operators, we captured the intrinsic geometric properties of the molecules, aiding their classification and clustering. The research utilized four deep learning methods: Deep Belief Network, Convolutional Autoencoder, Variational Autoencoder, and Adversarial Autoencoder, each paired with k-means clustering at different cluster sizes. Clustering quality was evaluated using the Calinski–Harabasz and Davies–Bouldin indices, Silhouette Score, and standard deviation. Nonparametric tests were used to assess the impact of topological descriptors on clustering outcomes. Our results show that the DBN + k-means combination is the most effective, particularly at lower cluster counts, demonstrating significant sensitivity to structural variations. This study highlights the potential of integrating spectral geometry with deep learning for precise and efficient molecular clustering. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
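Several abstracts in these results, including the one above, evaluate clusterings with the Silhouette Score. The following is a small standard-library sketch of the usual definition, assuming Euclidean distance; the function name and the toy points are illustrative and not taken from any of the listed papers.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance.

    For each point: a = mean distance to the rest of its own cluster,
    b = smallest mean distance to any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters score 0.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 1-D data with two well-separated groups.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(silhouette_score(pts, [0, 0, 1, 1]))
```

Scores close to 1 indicate compact, well-separated clusters; negative scores suggest points sit closer to another cluster than to their own.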
8. A Nature-Inspired Partial Distance-Based Clustering Algorithm.
- Author
-
El Habib Kahla, Mohammed, Beggas, Mounir, Laouid, Abdelkader, and Hammoudeh, Mohammad
- Subjects
FISH locomotion ,CLASSIFICATION algorithms ,ARTIFICIAL intelligence ,COLLECTIVE behavior ,HIERARCHICAL clustering (Cluster analysis) - Abstract
In the rapidly advancing landscape of digital technologies, clustering plays a critical role in the domains of artificial intelligence and big data. Clustering is essential for extracting meaningful insights and patterns from large, intricate datasets. Despite the efficacy of traditional clustering techniques in handling diverse data types and sizes, they encounter challenges posed by the increasing volume and dimensionality of data, as well as the complex structures inherent in high-dimensional spaces. This research recognizes the constraints of conventional clustering methods, including sensitivity to initial centroids, dependence on prior knowledge of cluster counts, and scalability issues, particularly in large datasets and Internet of Things implementations. In response to these challenges, we propose a K-level clustering algorithm inspired by the collective behavior of fish locomotion. K-level introduces a novel clustering approach based on greedy merging driven by distances in stages. This iterative process efficiently establishes hierarchical structures without the need for exhaustive computations. K-level gives users enhanced control over computational complexity, enabling them to specify the number of clusters merged simultaneously. This flexibility ensures accurate and efficient hierarchical clustering across diverse data types, offering a scalable solution for processing extensive datasets within a reasonable timeframe. The internal validation metrics, including the Silhouette Score, Davies–Bouldin Index, and Calinski–Harabasz Index, are utilized to evaluate the K-level algorithm across various types of datasets. Additionally, comparisons are made with rivals in the literature, including UPGMA, CLINK, UPGMC, SLINK, and K-means. The experiments and analyses show that the proposed algorithm overcomes many of the limitations of existing clustering methods, presenting scalable and adaptable clustering in the dynamic landscape of evolving data challenges. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions.
- Author
-
Wani, Aasim Ayaz
- Subjects
HIERARCHICAL clustering (Cluster analysis) ,SOCIAL network analysis ,CLUSTER analysis (Statistics) ,DATA structures ,MACHINE learning - Abstract
This survey rigorously explores contemporary clustering algorithms within the machine learning paradigm, focusing on five primary methodologies: centroid-based, hierarchical, density-based, distribution-based, and graph-based clustering. Through the lens of recent innovations such as deep embedded clustering and spectral clustering, we analyze the strengths, limitations, and the breadth of application domains—ranging from bioinformatics to social network analysis. Notably, the survey introduces novel contributions by integrating clustering techniques with dimensionality reduction and proposing advanced ensemble methods to enhance stability and accuracy across varied data structures. This work uniquely synthesizes the latest advancements and offers new perspectives on overcoming traditional challenges like scalability and noise sensitivity, thus providing a comprehensive roadmap for future research and practical applications in data-intensive environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Water Supply Pipeline Operation Anomaly Mining and Spatiotemporal Correlation Study.
- Author
-
Yang, Yanmei, Liu, Ao, Wang, Zegen, Yong, Zhiwei, Sun, Tao, Li, Jie, and Ma, Guoli
- Subjects
- *
MUNICIPAL water supply , *POLYWATER , *WATER pressure , *APRIORI algorithm , *WATER supply - Abstract
The recurrent manifestation of anomalies in water supply network systems exerts a profound influence on individuals' daily lives. Despite this impact, contemporary research on urban water supply networks reveals a conspicuous lack in the thorough examination of spatiotemporal patterns and the relevance of these anomalies. This investigation meticulously scrutinizes anomalies within a specified segment of the water supply pipe network located in a county in southwest China. Clustering algorithms [K-means and density-based spatial clustering of applications with noise (DBSCAN)] and statistical methods (standard deviation) identify anomalous water pressure. Subsequently, the Apriori algorithm is utilized to extract association rules for different types of anomalies, and these rules are compared with user similarity, quantified through standard Euclidean distance. The key findings are as follows. First, anomalies in water pressure are predominantly concentrated in May, September, and November. On a 24-h scale, the highest incidence of anomalies occurs between 6:00 a.m. and 9:00 a.m. Areas with the highest anomaly occurrence are primarily situated near the city center and the railway station. Second, correlation rules exist among occurrences of anomalous values at various monitoring sites within the study area. In concrete terms, identical water pressure abnormal types frequently co-occur (confidence level >50%, support level >3%) at diverse monitoring sites, with this correlation linked to the types of users around the monitoring sites. Finally, the categorization of anomalies results in significantly enhanced accuracy in correlation rule outcomes, surpassing the comprehensive analysis of anomalies overall. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
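The abstract above filters association rules by support (>3%) and confidence (>50%). As a rough illustration of what those two measures mean, the sketch below scores a single rule; it does not implement Apriori's candidate generation, and the site names and anomaly records are invented for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) = supp(A and C) / supp(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical records: anomaly types seen together in one time window.
windows = [
    {"siteA_low", "siteB_low"},
    {"siteA_low", "siteB_low"},
    {"siteA_low"},
    {"siteB_high"},
]
rule_support = support(windows, {"siteA_low", "siteB_low"})
rule_conf = confidence(windows, {"siteA_low"}, {"siteB_low"})
# Thresholds as quoted in the abstract.
print(rule_support > 0.03 and rule_conf > 0.50)
```

In this toy data the rule "siteA_low implies siteB_low" has support 0.5 and confidence 2/3, so it would pass both thresholds.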
11. Optimizing Patient Stratification in Healthcare: A Comparative Analysis of Clustering Algorithms for EHR Data
- Author
-
Abeer Aljohani
- Subjects
Clustering algorithms ,Healthcare data analysis ,Decision-making ,Machine learning ,Healthcare analytics ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Advanced data analytics are increasingly being employed in healthcare research to improve patient classification and personalize medicinal therapies. In this paper, we focus on the critical problem of clustering electronic health record (EHR) data to enable appropriate patient categorization. In the era of personalized medicine, optimizing patient classification is critical to healthcare analytics. This research presents a comparative assessment of different clustering algorithms for Electronic Health Record (EHR) data, with the goal of improving the efficacy and productivity of patient clustering methods. Our study focuses on Fuzzy Technique for Order of Preference by Similarity to Ideal Solution (Fuzzy TOPSIS) as a Multi-Criteria Decision-Making (MCDM) strategy, includes an in-depth assessment of eight clustering algorithms: K-Means, DBSCAN, Hierarchical Clustering, Mean Shift, Affinity Propagation, Spectral Clustering, Gaussian Mixture Models (GMM), as well as Self-Organizing Maps. The evaluation factors used for evaluation in this research are Cluster Quality Metrics, Scalability, Robustness to Noise, Cluster Shape and Density, Interpretability, Cluster Number, Dimensionality, and Consistency and Stability. These criteria and alternatives were chosen after conducting a thorough assessment of the literature and consulting with domain experts. All participated specialists actively engaged in the decision-making process, bringing unique insights into the best clustering algorithms for healthcare data. The results of this study illustrate each algorithm’s strengths and weaknesses in the setting of patient stratification, providing insight into their performance across multiple dimensions. The fuzzy TOPSIS MCDM strategy is a reliable instrument for synthesizing expert opinions and methodically evaluating the found clustering alternatives.
This study advances healthcare analytics by giving practitioners and researchers with informative perspectives on the selection of clustering algorithms designed to address the unique problems of patient stratification utilizing EHR data.
- Published
- 2024
- Full Text
- View/download PDF
12. An Improved Parallel Biobjective Hybrid Real-Coded Genetic Algorithm with Clustering-Based Selection
- Author
-
Akopov Andranik S.
- Subjects
clustering algorithms ,multi-objective optimization ,real-coded genetic algorithms ,particle swarm optimization ,multiagent socioeconomic systems ,agent-based modeling ,Cybernetics ,Q300-390 - Abstract
This work presents an improved parallel biobjective hybrid real-coded genetic algorithm (MORCGA-MOPSO-II). The approach is based on the combined use of the parallel Multi-Objective Real-Coded Genetic Algorithm (MORCGA) and the Multi-Objective Particle Swarm Optimization (MOPSO). At the same time, clustering-based selection techniques are used to form subpopulations of parent individuals. Using well-known clustering algorithms (e.g., k-Means, hierarchical clustering, c-means, and DBSCAN) in combination with the proposed clustering-based mutation (the CL-mutation) directed toward the obtained cluster centers allows for improving the quality of the Pareto fronts’ approximations. The results of the MORCGA-MOPSO-II were compared with other well-known multi-objective evolutionary algorithms (e.g., SPEA2, NSGA-II, FCGA, MOSPO, etc.). Moreover, the MORCGA-MOPSO-II was integrated with the previously developed agent-based model of a goods exchange through the objective functions. As a result, the Pareto fronts have been obtained for the agent-based model of a goods exchange in different configurations of the initial distribution of agents.
- Published
- 2024
- Full Text
- View/download PDF
13. Machine Learning-Based Transactions Anomaly Prediction for Enhanced IoT Blockchain Network Security and Performance.
- Author
-
Abdullah, Nor Fadzilah, Kairaldeen, Ammar Riadh, Abu-Samah, Asma, and Nordin, Rosdiadee
- Abstract
The integration of blockchain technology with the rapid growth of Internet of Things (IoT) devices has enabled secure and decentralised data exchange. However, security vulnerabilities and performance limitations remain significant challenges in IoT blockchain networks. This work proposes a novel approach that combines transaction representation and machine learning techniques to address these challenges. Various clustering techniques, including k-means, DBSCAN, Gaussian Mixture Models (GMM), and Hierarchical clustering, were employed to effectively group unlabelled transaction data based on their intrinsic characteristics. Anomaly transaction prediction models based on classifiers were then developed using the labelled data. Performance metrics such as accuracy, precision, recall, and F1-measure were used to identify the minority class representing specious transactions or security threats. The classifiers were also evaluated on their performance using balanced and unbalanced data. Compared to unbalanced data, balanced data resulted in an overall average improvement of approximately 15.85% in accuracy, 88.76% in precision, 60% in recall, and 74.36% in F1-score. This demonstrates the effectiveness of each classifier as a robust classifier with consistently better predictive performance across various evaluation metrics. Moreover, the k-means and GMM clustering techniques outperformed other techniques in identifying security threats, underscoring the importance of appropriate feature selection and clustering methods. The findings have practical implications for reinforcing security and efficiency in real-world IoT blockchain networks, paving the way for future investigations and advancements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Anomaly Detection Based on GCNs and DBSCAN in a Large-Scale Graph.
- Author
-
Retiti Diop Emane, Christopher, Song, Sangho, Lee, Hyeonbyeong, Choi, Dojin, Lim, Jongtae, Bok, Kyoungsoo, and Yoo, Jaesoo
- Subjects
ANOMALY detection (Computer security) ,INTRUSION detection systems (Computer security) ,REPRESENTATIONS of graphs ,DATA structures ,DEEP learning ,DATA integrity ,FUZZY algorithms - Abstract
Anomaly detection is critical across domains, from cybersecurity to fraud prevention. Graphs, adept at modeling intricate relationships, offer a flexible framework for capturing complex data structures. This paper proposes a novel anomaly detection approach, combining Graph Convolutional Networks (GCNs) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). GCNs, a specialized deep learning model for graph data, extracts meaningful node and edge representations by incorporating graph topology and attribute information. This facilitates learning expressive node embeddings capturing local and global structural patterns. For anomaly detection, DBSCAN, a density-based clustering algorithm effective in identifying clusters of varying densities amidst noise, is employed. By defining a minimum distance threshold and a minimum number of points within that distance, DBSCAN proficiently distinguishes normal graph elements from anomalies. Our approach involves training a GCN model on a labeled graph dataset, generating appropriately labeled node embeddings. These embeddings serve as input to DBSCAN, identifying clusters and isolating anomalies as noise points. The evaluation on benchmark datasets highlights the superior performance of our approach in anomaly detection compared to traditional methods. The fusion of GCNs and DBSCAN demonstrates a significant potential for accurate and efficient anomaly detection in graphs. This research contributes to advancing graph-based anomaly detection, with promising applications in domains where safeguarding data integrity and security is paramount. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
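The abstract above describes DBSCAN in terms of a distance threshold and a minimum number of points, with anomalies isolated as noise. The sketch below is a plain textbook DBSCAN over raw coordinates, written for illustration only: it omits the GCN embedding step the paper describes, and the parameter names `eps` and `min_pts` and the sample points are assumptions.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE for outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # Includes the point itself, as in the usual definition.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(j_nbrs)
    return labels


# Hypothetical 2-D embeddings: two dense groups and one isolated point.
emb = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(emb, eps=1.5, min_pts=3)
anomalies = [p for p, lab in zip(emb, labels) if lab == NOISE]
print(labels, anomalies)
```

The neighbour search here is quadratic in the number of points, which is acceptable for a demonstration but not for the large-scale graphs the paper targets.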
15. Energy efficient power cap configurations through Pareto front analysis and machine learning categorization.
- Author
-
Cabrera, Alberto, Almeida, Francisco, Castellanos-Nieves, Dagoberto, Oleksiak, Ariel, and Blanco, Vicente
- Subjects
- *
PARETO analysis , *MACHINE learning , *MODERN architecture , *ENERGY consumption , *COMPUTER systems , *VIDEO coding , *CENTRAL processing units - Abstract
The growing demand for more computing resources has increased the overall energy consumption of computer systems. To support this increasing demand, power and energy consumption must be considered as a constraint on software execution. Modern architectures provide tools for managing the power constraints of a system directly. The Intel Power Cap is a relatively new tool developed to give users fine-grained control over power usage at the central processing unit (CPU) level. The complexity of these tools, in addition to the high variety of modern heterogeneous architectures, hinders predictions of the energy consumption and the performance of any target software. The application of power capping technologies usually leads to the bi-objective optimization problem for energy efficiency and execution time but optimal power constraints could also produce exceeding performance losses. Thus, methods and tools are needed to calculate the proper parameters for power capping technologies, and to optimize energy efficiency. We propose a methodology to analyze the performance and the energy efficiency trade-offs using this power cap technology for a given application. A Pareto front is extracted for the multi-objective performance and energy problem, which represents multiple feasible configurations for both objectives. An extensive experimentation is carried out to categorize the different applications to determine the overall optimal power cap configurations. We propose the use of machine learning (ML) clustering techniques to categorize each application in the target architecture. The use of ML allows us to automate the process and simplifies the effort required to solve the optimization problem. A practical case is presented where we categorize the applications using ML techniques, with the possibility of adding a new application into an existing categorization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
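The abstract above extracts a Pareto front for the joint energy and execution-time problem. A minimal sketch of non-dominated filtering follows, assuming both objectives are minimised; the configuration tuples are made-up numbers, not measurements from the paper.

```python
def dominates(q, p):
    """True if q is no worse than p everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(q, p))
            and any(a < b for a, b in zip(q, p)))


def pareto_front(configs):
    """Keep the configurations that no other configuration dominates."""
    return [p for p in configs
            if not any(dominates(q, p) for q in configs)]


# Hypothetical (energy in joules, time in seconds) per power-cap setting.
measurements = [(10, 5), (8, 7), (12, 4), (9, 9), (11, 6)]
print(pareto_front(measurements))
```

Every point left on the front represents a different trade-off: lowering one objective further would raise the other.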
16. Unsupervised Machine Learning for Bot Detection on Twitter: Generating and Selecting Features for Accurate Clustering.
- Author
-
Al-Azawi, Raad Ghazi and AL-mamory, Safaa O.
- Subjects
- *
MACHINE learning , *SUPERVISED learning , *SOCIAL media , *BOTNETS , *FEATURE selection - Abstract
Twitter is a popular social media platform that is widely used by individuals and businesses. However, it is vulnerable to bot attacks, which can have negative effects on society. Supervised machine learning techniques can detect bots but they require labeled data to differentiate between human and bot users. Twitter generates a significant amount of unlabeled data, which can be expensive to be labeled. This issue can be addressed by exploiting the advantages of unsupervised machine learning techniques, specifically clustering algorithms as such techniques are crucial for managing such kind of data and reducing computational complexity. However, feature selection is necessary for clustering, as some features are more important than others. This study aims to enhance feature reliability, introduce new features, and reduce the proposed model's complexity. This, in turn, can improve bot identification accuracy based on clustering algorithms. The study achieved a Fowlkes-Mallows score of 0.99 in DBSCAN clustering algorithms, including agglomerative hierarchy, k-medoids, DBSCAN, and K-means. This was accomplished by minimizing dataset dimensions and selecting essential features. By employing unsupervised machine learning techniques, Twitter can detect and mitigate bot attacks more efficiently, which can positively impact society. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
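The abstract above reports a Fowlkes-Mallows score of 0.99. For readers unfamiliar with the measure, here is a small sketch of its pair-counting definition, FM = TP / sqrt((TP + FP)(TP + FN)); the function name and the toy labels are illustrative assumptions, not the study's data.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(true_labels, pred_labels):
    """Geometric mean of pairwise precision and pairwise recall."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_true:
            fn += 1  # grouped together only in the reference
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Hypothetical reference labels (human vs. bot) and a clustering result.
print(fowlkes_mallows([0, 0, 1, 1], [0, 0, 0, 1]))
```

Because the score needs reference labels, it measures agreement with a labelled benchmark rather than the internal quality of the clusters.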
17. Assessment of the real‐time pattern recognition capability of machine learning algorithms.
- Author
-
Polytarchos, Elias, Bardaki, Cleopatra, and Pramatari, Katerina
- Subjects
- *
PATTERN recognition systems , *MACHINE learning , *BLOCKCHAINS , *INTERNET of things - Abstract
Nowadays data streams from different sources, like blockchain‐based and traditional financial transactions, social networks, and interconnected Internet of Things (IoT) devices, are becoming increasingly large in volume and the need to recognize patterns in real time from these streams, while adapting to their velocity and veracity, is emerging. Established machine learning algorithms used for pattern recognition methods have not been designed taking under account the volume, velocity, diversity, and accuracy of the data streams. This research contributes with an approach for assessing the pattern recognition capabilities of established machine learning algorithms when handling volatile data in real time and proposes a system that adapts the algorithms to the requirements of data streams, as well as assesses their pattern recognition capabilities based on established criteria. The system was applied for assessing five machine learning algorithms with input from a data stream from Bluetooth beacons tracking consumers in a retail store. This research can support future data scientists and analysts who need to reveal data patterns in big, volatile data streams in real time in order to support effective decision‐making in the respective application domain. Copyright © 2024 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. An efficient healthcare system by cloud computing and clustering-based hybrid machine learning algorithm.
- Author
-
Ramapraba, Palayanoor Seethapathy, Radhika, Moorthy, Sumathi, Sokkanarayanan, Karthik, Jayavarapu, and Senthamilarasi, Nachiappan
- Subjects
MACHINE learning ,DEEP learning ,CLINICAL decision support systems ,ARTIFICIAL intelligence ,COMPUTER systems ,GENETIC algorithms - Abstract
Cloud computing, deep learning, clustering, genetic, and ensemble algorithms in healthcare are gaining popularity. This research highlights the relevance and complex repercussions of this integration. Cloud computing is transforming healthcare by providing scalable data storage and application access. It streamlines data exchange between hospitals, researchers, and institutions. Deep learning allows healthcare systems to use artificial intelligence for diagnostics, predictive analytics, and customized medication. Clustering algorithms segment patients, improving therapy and intervention customization. Genetic algorithms can optimize healthcare processes like treatment planning and resource allocation. Ensemble algorithms combine multiple models to improve predicted accuracy, enabling strong healthcare decision-making. This connection has several benefits. Healthcare systems become more efficient and scalable, resulting in cost-effective resource allocation. Access to patient data and apps promotes collaborative research and real-time healthcare. Deep learning algorithms can recognize complex medical data patterns, improving illness diagnosis and treatment results. Clustering algorithms streamline customized healthcare by stratifying individuals by clinical variables. Genetic algorithms optimize resource allocation, assuring healthcare resource efficiency. Ensemble algorithms improve predicted accuracy and clinical decision support system dependability. Its efficiency, accessibility, and prediction accuracy are positives, but security, resource constraints, interpretability, and ethical issues are obstacles. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Improving Multi-Class Classification Performance with Hierarchical Divisive Clustering: A Comparative Analysis of Clustering Algorithms
- Author
-
Alagöz, Celal
- Abstract
Copyright of Dicle University Journal of Engineering / Dicle Üniversitesi Mühendislik Dergisi is the property of Dicle Universitesi.
- Published
- 2024
- Full Text
- View/download PDF
20. Evolution of Hybrid Cellular Automata for Density Classification Problem.
- Author
-
Anghelescu, Petre
- Subjects
- *
CELLULAR automata , *BIOLOGICALLY inspired computing , *CELLULAR evolution , *IMAGE recognition (Computer vision) , *INDUSTRIAL robots , *PLURALITY voting - Abstract
This paper describes a solution for the image density classification problem (DCP) using cellular automata (CA), an entirely distributed system with only local processing of information. The proposed solution uses two features of cellular automata, density conservation and translation of the information stored in the cells through the lattice, to solve the density classification problem. The motivation for choosing a bio-inspired technique based on CA for solving the DCP is to investigate the principles of self-organizing decentralized computation and to assess the capabilities of CA to achieve such computation, which is applicable to many real-world decentralized problems that require a decision to be taken by majority voting, such as multi-agent holonic systems, collaborative robots, drone fleets, image analysis, traffic optimization, and forming and then separating clusters with different values. The entire application is coded in the C# programming language, and the obtained results and comparisons between different cellular automata configurations are also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
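As a rough illustration of the two features named in entry 20 (density conservation and translation of information through the lattice), the sketch below uses the classic one-dimensional two-rule scheme: elementary rule 184, which conserves density while shifting ones to the right, followed by rule 232, a local majority vote. This is not the paper's C# implementation or its image-based setting. The function names, the ring topology, the odd lattice length, and the choice to run each rule for `n` steps (a generous bound assumed here) are all assumptions made for illustration.

```python
def step(cells, rule):
    """Apply one synchronous update of an elementary CA rule on a ring."""
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (mid << 1) | right  # neighbourhood as a 3-bit number
        out.append((rule >> index) & 1)           # look up the bit in the rule number
    return out


def classify_density(cells):
    """Return 1 if ones are the majority, else 0 (odd-length ring assumed)."""
    n = len(cells)
    state = list(cells)
    for _ in range(n):   # rule 184: density-conserving shift that breaks up blocks
        state = step(state, 184)
    for _ in range(n):   # rule 232: each cell adopts the local majority
        state = step(state, 232)
    return state[0]      # the lattice is uniform at this point


if __name__ == "__main__":
    print(classify_density([1, 1, 1, 0, 0]))  # three ones out of five cells
    print(classify_density([1, 0, 1, 0, 0]))  # two ones out of five cells
```

Running the two rules in sequence, rather than searching for a single rule, is the design choice that makes the majority decision reachable with purely local updates.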
21. A Survey and Study of Signal and Data-Driven Approaches for Pipeline Leak Detection and Localization.
- Author
-
Rajasekaran, Uma and Kothandaraman, Mohanaprasad
- Subjects
- *
LEAK detection , *CONVOLUTIONAL neural networks , *ARTIFICIAL intelligence , *HAZARDOUS wastes , *CROSS correlation - Abstract
A pipeline is critical in conveying water, oil, gas, petrochemicals, and slurry. As the pipeline ages and corrodes, it becomes susceptible to deterioration, resulting in wastage and hazardous damages depending on the material it transports. To mitigate these risks, implementing a suitable monitoring system becomes essential, enabling the early identification of damage and minimizing waste and the potential for hazardous incidents. The pipeline monitoring system can be exterior, visual/biological, and computational. This paper surveys state-of-the-art approaches and also performs experimental analyses with a few methods in signal/data-driven approaches within computational methods. More precisely, signal processing-based leak localization methods, artificial intelligence-based leak detection methods, and combined approaches are given. This paper implements five signal processing-based methods and 17 artificial intelligence-based methods. This implementation helps to compare and understand the significance of appropriate noise removal and feature extraction. The data for this analysis is collected using acousto-optic sensors from an experimental setup. After implementation, the highest observed leak localization accuracy is 99.14% with the wavelet packet adaptive independent component analysis-based generalized cross correlation, and the highest leak detection accuracy is 98.32% with the one-dimensional convolutional neural network. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions
- Author
-
Aasim Ayaz Wani
- Subjects
Clustering algorithms ,Unsupervised learning ,Scalability and efficiency ,Centroid-based clustering ,Hierarchical clustering ,Density-based clustering ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
This survey rigorously explores contemporary clustering algorithms within the machine learning paradigm, focusing on five primary methodologies: centroid-based, hierarchical, density-based, distribution-based, and graph-based clustering. Through the lens of recent innovations such as deep embedded clustering and spectral clustering, we analyze the strengths, limitations, and the breadth of application domains—ranging from bioinformatics to social network analysis. Notably, the survey introduces novel contributions by integrating clustering techniques with dimensionality reduction and proposing advanced ensemble methods to enhance stability and accuracy across varied data structures. This work uniquely synthesizes the latest advancements and offers new perspectives on overcoming traditional challenges like scalability and noise sensitivity, thus providing a comprehensive roadmap for future research and practical applications in data-intensive environments.
- Published
- 2024
- Full Text
- View/download PDF
23. Advancing Customer Segmentation in Banking: Harnessing Machine Learning and H2O for Personalized Insights
- Author
-
Omran, Marawan, Hamza, Khaled, Elghamrawy, Sally, Xhafa, Fatos, Series Editor, Hassanien, Aboul Ella, editor, Darwish, Ashraf, editor, F. Tolba, Mohammed, editor, and Snasel, Vaclav, editor
- Published
- 2024
- Full Text
- View/download PDF
24. Comprehensive Analysis of Clustering Techniques on Microblog Tweets
- Author
-
Christy, K. T., Bibal Benifa, J. V., Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Tripathi, Ashish Kumar, editor, and Anand, Darpan, editor
- Published
- 2024
- Full Text
- View/download PDF
25. An Energy Optimization Clustering Methods for Homogeneous Networks of Wireless Sensors
- Author
-
Tekulapally, Venumadhav, Chali, Diriba, Abose, Tadele A., Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, R., Annie Uthra, editor, Kottursamy, Kottilingam, editor, Raja, Gunasekaran, editor, Bashir, Ali Kashif, editor, Kose, Utku, editor, Appavoo, Revathi, editor, and Madhivanan, Vimaladevi, editor
- Published
- 2024
- Full Text
- View/download PDF
26. Comparative Analysis of Machine Learning Clustering Methods for Electroretinogram
- Author
-
Zhdanov, Aleksei, Bulev, Daniil, Dolganov, Anton, Kulyabin, Mikhail, Magjarević, Ratko, Series Editor, Ładyżyński, Piotr, Associate Editor, Ibrahim, Fatimah, Associate Editor, Lackovic, Igor, Associate Editor, Rock, Emilio Sacristan, Associate Editor, Costin, Hariton-Nicolae, editor, and Petroiu, Gladiola Gabriela, editor
- Published
- 2024
- Full Text
- View/download PDF
27. Reverse Clustering: A New Perspective in Data Analysis
- Author
-
Owsiński, Jan W., Kacprzyk, Janusz, Series Editor, Novikov, Dmitry A., Editorial Board Member, Shi, Peng, Editorial Board Member, Cao, Jinde, Editorial Board Member, Polycarpou, Marios, Editorial Board Member, Pedrycz, Witold, Editorial Board Member, Balas, Valentina Emilia, editor, Dzemyda, Gintautas, editor, and Belciug, Smaranda, editor
- Published
- 2024
- Full Text
- View/download PDF
28. Fast Multi-scale Batch-Learning Growing Neural Gas
- Author
-
Obo, Takenori, Kubota, Naoyuki, Toda, Yuichiro, Masuyama, Naoki, Rudas, Imre J., Series Editor, Szakál, Anikó, Series Editor, Batyrshin, Ildar, Editorial Board Member, Bokor, József, Editorial Board Member, De Baets, Bernard, Editorial Board Member, Fujita, Hamido, Editorial Board Member, Fukuda, Toshio, Editorial Board Member, Harashima, Fumio, Editorial Board Member, Hirota, Kaoru, Editorial Board Member, Pap, Endre, Editorial Board Member, Wilamowski, Bogdan M., Editorial Board Member, Baranyi, P., Advisory Editor, Bodenhofer, U., Advisory Editor, Fichtinger, G., Advisory Editor, Fullér, R., Advisory Editor, Galántai, A., Advisory Editor, Hluchý, L., Advisory Editor, Jamshidi, M. O., Advisory Editor, Kelemen, J., Advisory Editor, Kocur, D., Advisory Editor, Korondi, P., Advisory Editor, Kovács, G., Advisory Editor, Kóczy, L. T., Advisory Editor, Madarász, L., Advisory Editor, Nguyen, CH. C., Advisory Editor, Petriu, E., Advisory Editor, Precup, R.-E., Advisory Editor, Preitl, S., Advisory Editor, Prostean, O., Advisory Editor, Puri, V., Advisory Editor, Sallai, G. Y., Advisory Editor, Somló, J., Advisory Editor, Takács, M., Advisory Editor, Tar, J., Advisory Editor, Ungvari, L., Advisory Editor, Várkonyi-Kóczy, A. R., Advisory Editor, Várlaki, P., Advisory Editor, Vokorokos, L., Advisory Editor, Kovács, Levente, editor, and Haidegger, Tamás, editor
- Published
- 2024
- Full Text
- View/download PDF
29. Inference Algorithm for Knowledge Bases with Rule Cluster Structure
- Author
-
Nowak-Brzezińska, Agnieszka, Gaibei, Igor, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Franco, Leonardo, editor, de Mulatier, Clélia, editor, Paszynski, Maciej, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M. A., editor
- Published
- 2024
- Full Text
- View/download PDF
30. Building Blocks
- Author
-
Alfaqeeh, Mosab, Skillicorn, David B., Alhajj, Reda, Series Editor, Glässer, Uwe, Series Editor, Aggarwal, Charu C., Advisory Editor, Brantingham, Patricia L., Advisory Editor, Gross, Thilo, Advisory Editor, Han, Jiawei, Advisory Editor, Manásevich, Raúl, Advisory Editor, Masys, Anthony J., Advisory Editor, Alfaqeeh, Mosab, and Skillicorn, David B.
- Published
- 2024
- Full Text
- View/download PDF
31. A Machine Learning Approach for Points of Interest Extraction and Event Classification
- Author
-
Dias, Pedro, Ferreira, Flora, Guimarães, Pedro M. F., Wojtak, Weronika, Erlhagen, Wolfram, Monteiro, Sérgio, Sousa, Emanuel, Bicho, Estela, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Maglogiannis, Ilias, editor, Iliadis, Lazaros, editor, Macintyre, John, editor, Avlonitis, Markos, editor, and Papaleonidas, Antonios, editor
- Published
- 2024
- Full Text
- View/download PDF
32. Risk Perception Visualization of Public Health Emergencies Based on Clustering Algorithms
- Author
-
Zhao, Zhe, Cai, Yiguo, Gong, Zhen, Ni, Hao, Xu, Zhiyong, Sun, Li, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Jin, Hai, editor, Pan, Yi, editor, and Lu, Jianfeng, editor
- Published
- 2024
- Full Text
- View/download PDF
33. 2ARTs: A Platform for Exercise Prescriptions in Cardiac Recovery Patients
- Author
-
Pereira, Andreia, Martinho, Ricardo, Pinto, Rui, Rijo, Rui, Grilo, Carlos, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Daimi, Kevin, editor, and Al Sadoon, Abeer, editor
- Published
- 2024
- Full Text
- View/download PDF
34. Creation of a Unique Clustering Method Employing Novel Similarity Metrics for Legal Texts to Improve Information Management and Retrieval in the Legal Field
- Author
-
Jain, Rajanish Kumar, Jain, Anubha, Goel, Vikas, Dey, Nilanjan, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Piuri, Vincenzo, Series Editor, Mishra, Durgesh, editor, Yang, Xin She, editor, Unal, Aynur, editor, and Jat, Dharm Singh, editor
- Published
- 2024
- Full Text
- View/download PDF
35. Towards Improving Multivariate Time-Series Forecasting Using Weighted Linear Stacking
- Author
-
Aiwansedo, Konstandinos, Bosche, Jérôme, Badreddine, Wafa, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Rocha, Ana Paula, editor, Steels, Luc, editor, and van den Herik, Jaap, editor
- Published
- 2024
- Full Text
- View/download PDF
36. A Clustering-Based Algorithm for Product Platform Design in the Mass Customization Era
- Author
-
Bortolini, Marco, Cafarella, Cristian, Galizia, Francesco Gabriele, Gamberi, Mauro, Naldi, Ludovica Diletta, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Scholz, Steffen G., editor, and Setchi, Rossi, editor
- Published
- 2024
- Full Text
- View/download PDF
37. Recent Advancements in Data Mining and Machine Learning Applications in Evaluating Goalkeepers’ Performances in Elite Football
- Author
-
Musa, Rabiu Muazu, Majeed, Anwar P. P. Abdul, Ab Rasid, Aina Munirah, Abdullah, Mohamad Razali, Musa, Rabiu Muazu, Majeed, Anwar P. P. Abdul, Ab Rasid, Aina Munirah, and Abdullah, Mohamad Razali
- Published
- 2024
- Full Text
- View/download PDF
38. Analyzing and Comparing Clustering Algorithms for Student Academic Data
- Author
-
Bhurre, Shraddha, Raikwar, Sunny, Prajapat, Shaligram, Pathak, Deepika, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Naik, Nitin, editor, Jenkins, Paul, editor, Grace, Paul, editor, Yang, Longzhi, editor, and Prajapat, Shaligram, editor
- Published
- 2024
- Full Text
- View/download PDF
39. Experimental Comparison of Three Topic Modeling Methods with LDA, Top2Vec and BERTopic
- Author
-
Gan, Lin, Yang, Tao, Huang, Yifan, Yang, Boxiong, Luo, Yami Yanwen, Richard, Lui Wing Cheung, Guo, Dabo, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Lu, Huimin, editor, and Cai, Jintong, editor
- Published
- 2024
- Full Text
- View/download PDF
40. Artificial Intelligence Analysis of Macroscopic X-Ray Fluorescence Data: A Case Study of Nineteenth Century Icon
- Author
-
Gerodimos, T., Chatzipanteliadis, D., Chantas, G., Asvestas, A., Mastrotheodoros, G., Likas, A., Anagnostopoulos, D. F., Ghosh, Arindam, Series Editor, Chua, Daniel, Series Editor, de Souza, Flavio Leandro, Series Editor, Aktas, Oral Cenk, Series Editor, Han, Yafang, Series Editor, Gong, Jianghong, Series Editor, Jawaid, Mohammad, Series Editor, Osman, Ahmad, editor, Moropoulou, Antonia, editor, and Lampropoulos, Kyriakos, editor
- Published
- 2024
- Full Text
- View/download PDF
41. An Energy-Saving Clustering Based on the Grid-Based Whale Optimization Algorithm (GBWOA) for WSNs
- Author
-
Bairwa, Neetika, Agrawal, Navneet Kumar, Gupta, Prateek, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Hassanien, Aboul Ella, editor, Castillo, Oscar, editor, Anand, Sameer, editor, and Jaiswal, Ajay, editor
- Published
- 2024
- Full Text
- View/download PDF
42. Realistic constraints, model selection, and detectability of modular network structures
- Author
-
Zhang, Lizhi, De Paula Peixoto, Tiago, and Nunes, Matthew
- Subjects
network analysis ,Bayesian inference ,clustering algorithms ,machine learning - Abstract
Many real-world systems are complex, consisting of many entities with interactions among them. Our understanding of real-world complex systems has been significantly advanced by modelling these systems as networks. A network is a mathematical abstraction of complex systems, representing entities and interactions by nodes and edges. Recent years have witnessed a rapid growth in the demand for analysing network data, driven by the increased availability of large-scale, quality datasets. A common task in network analysis is to identify the "building blocks" of a network by finding divisions of nodes, such that nodes in the same division connect with the rest of the network in a similar way. This task is often referred to as community detection in networks. Community detection methods allow researchers to characterise network data from the perspective of connection patterns, which could convey important information about the functional and evolutionary mechanisms of the underlying systems. Recently, Bayesian inference based on generative network models has attracted great attention as a community detection method, mainly due to its principled inference nature and formal implementation of Occam's razor. However, this method often relies on general models that simultaneously account for different kinds of community structure. If the dominant structure in the data is in fact restricted and simple, using general models could lead to a sub-optimal fit to the data. This thesis is concerned with developing Bayesian inference community detection methods that are tailored for a particular kind of structure: the assortative structure. A network is said to be assortative if it can be divided into subgroups of nodes, such that connections inside each division are dense while those between distinct divisions are sparse. To this end, we develop the Bayesian formulation of the degree-corrected planted partition model. Such a model assumes that the probability of an edge between a pair of nodes depends on whether they are from the same subgroup as well as on their node-wise propensity of receiving an edge. This formulation leads to a novel method for extracting assortative structures, and this method is one of the main contributions of this thesis. Compared with other existing methods, our proposed method has the advantage of being robust against overfitting, which means our method will not report spurious community structures in random networks, while other non-statistical, heuristic methods usually do. In deriving our proposed method, we clarify an established equivalence between the popular modularity maximisation approach and maximum likelihood inference. Our analysis shows that the equivalence result is tenuous, since it relies on subjective choices of model parameters which lack principled justification. We demonstrate the performance of our proposed method in both synthetic and empirical networks. In particular, we construct a large network corpus consisting of datasets which are diverse in terms of size and density. Using this network corpus, we find evidence that the degree-corrected planted partition model can achieve a better quality of fit in some empirical networks compared to existing models. Moreover, the degree-corrected planted partition model has the potential of providing additional insight into data regarding high-resolution community structure. Furthermore, by conducting model selection in our network corpus, we find that assortativity is often too simplistic to be the dominant pattern in empirical networks. Finally, we study the detectability of assortative community structures. In networks where all nodes receive an identical number of edges on average, there exists a detectability threshold of the strength of community structure, below which no polynomial-time algorithm can detect the planted community structure better than random guessing. We conduct a numerical study to examine the effect of heterogeneity in the number of edges attaching to nodes on the detectability of assortative structures. Such an effect has been analytically studied in a special case where networks have two equal-size communities. Our results provide further numerical evidence for the existing theoretical analysis and open the door to investigation of the detectability of community structures in more general settings, e.g. in networks consisting of more than two communities, which could have different extents of heterogeneity in degree distribution.
- Published
- 2023
43. Using a k-means clustering to identify novel phenotypes of acute ischemic stroke and development of its Clinlabomics models.
- Author
-
Yao Jiang, Yingqiang Dang, Qian Wu, Boyao Yuan, Lina Gao, and Chongge You
- Subjects
ISCHEMIC stroke ,K-means clustering ,PHENOTYPES ,SUPPORT vector machines - Abstract
Objective: Acute ischemic stroke (AIS) is a heterogeneous condition. This study aimed to stratify this heterogeneity, identify novel phenotypes, and develop Clinlabomics models of phenotypes that can guide more personalized treatments for AIS. Methods: In a retrospective analysis, consecutive AIS and non-AIS inpatients were enrolled. An unsupervised k-means clustering algorithm was used to classify AIS patients into distinct novel phenotypes. Intergroup comparisons across the phenotypes were then performed on clinical and laboratory data. Next, the least absolute shrinkage and selection operator (LASSO) algorithm was used to select essential variables. In addition, Clinlabomics predictive models of phenotypes were established by a support vector machine (SVM) classifier. We used the area under curve (AUC), accuracy, sensitivity, and specificity to evaluate the performance of the models. Results: Of the three derived phenotypes in 909 AIS patients [median age 64 (IQR: 17) years, 69% male], in phenotype 1 (N = 401), patients were relatively young and obese and had significantly elevated levels of lipids. Phenotype 2 (N = 463) was associated with abnormal ion levels. Phenotype 3 (N = 45) was characterized by the highest level of inflammation, accompanied by mild multiple-organ dysfunction. The external validation cohort prospectively collected 507 AIS patients [median age 60 (IQR: 18) years, 70% male]. Phenotype characteristics were similar in the validation cohort. After LASSO analysis, Clinlabomics models of phenotype 1 and 2 were constructed by the SVM algorithm, yielding high AUC (0.977, 95% CI: 0.961-0.993 and 0.984, 95% CI: 0.971-0.997), accuracy (0.936, 95% CI: 0.922-0.956 and 0.952, 95% CI: 0.938-0.972), sensitivity (0.984, 95% CI: 0.968-0.998 and 0.958, 95% CI: 0.939-0.984), and specificity (0.892, 95% CI: 0.874-0.926 and 0.945, 95% CI: 0.923-0.969). 
Conclusion: In this study, three novel phenotypes that reflected the abnormal variables of AIS patients were identified, and the Clinlabomics models of phenotypes were established, which are conducive to individualized treatments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Sketching of interactive VoIP traffic with multivariate statistical learning-based classification.
- Author
-
Sangeetha, R., Kuriakose, Bessy M., Naveen, V. Edward, Jenefa, A., and Lincy, A.
- Subjects
- *
COMPUTER network traffic , *INTERNET telephony , *NETWORK performance , *DATA packeting , *CLASSIFICATION - Abstract
Classifying VoIP (Voice over Internet Protocol) traffic is vital for optimizing network performance and Quality of Service (QoS). This study introduces the Multivariate Statistical-Based Classification (MVSC) system, designed to classify network traffic with high accuracy and efficiency. As traditional methods struggle in the diverse and complex landscape of today's network traffic, which includes voice, video, gaming, and data, the MVSC algorithm rises to the challenge. It employs Statistical Dissemination and leverages various statistical features such as Packet Size, Inter-Arrival Statistics, Packet and Data rates, Flow Length, and Five-tuple information to create nuanced profiles of network traffic packets. These packets are then grouped into distinct clusters based on their statistical attributes through Application Flow Cluster Grouping. A unique aspect of the MVSC system is its approach to representing each application flow as points in a two-dimensional space, where distances to predefined application profiles are calculated. The nearest profile then determines the type of VoIP traffic. Experimental results using university traffic data (KU-IDS) underscore the system's high accuracy, consistently around 98-99%. These findings affirm the system's suitability for real-time deployment. In summary, the MVSC system offers a robust and efficient solution for VoIP traffic classification, significantly boosting network performance and QoS, and proving to be an invaluable asset in contemporary network management. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
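The final step described in entry 44, where each application flow is a point in a two-dimensional space and the nearest predefined profile decides the traffic type, can be sketched as a nearest-profile lookup. The profile names and coordinates below are hypothetical placeholders, and the abstract does not say how flows are mapped to two dimensions or which distance is used; Euclidean distance is assumed here.

```python
import math

# Hypothetical 2-D application profiles; the coordinates are illustrative only.
PROFILES = {
    "voip": (0.2, 0.8),
    "video": (0.7, 0.6),
    "bulk_data": (0.9, 0.1),
}


def nearest_profile(point, profiles=PROFILES):
    """Return the label of the profile closest to `point` (Euclidean distance)."""
    return min(profiles, key=lambda name: math.dist(point, profiles[name]))


if __name__ == "__main__":
    print(nearest_profile((0.25, 0.75)))  # closest to the "voip" profile
```

A lookup of this kind costs one distance computation per profile, which is one plausible reason such a scheme could suit real-time use.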
45. Use of Autoencoder and One-Hot Encoding for Customer Segmentation.
- Author
-
Smutek, Tomasz, Sikora, Jan, Bogacki, Sylwester, Rutkowski, Marek, and Woźniak, Dariusz
- Published
- 2024
- Full Text
- View/download PDF
46. Optimizing Procurement Strategies for Diverse Product Segments: A Case Study in Pharmaceutical Supply Chain Management.
- Author
-
Douaioui, Kaoutar, Oucheikh, Rachid, and Benmoussa, Othmane
- Subjects
METAHEURISTIC algorithms ,GREY Wolf Optimizer algorithm ,SUPPLY chain management ,MACHINE learning ,PARTICLE swarm optimization - Abstract
Selecting the most suitable procurement strategy is crucial to the efficient management of supply chain operations and the prevention of stock shortages. Nevertheless, when dealing with a wide variety of products, this task becomes an intricate challenge. While traditional and advanced procurement tools are available, applying them across such diverse product ranges is often impractical. This research is dedicated to determining distinct procurement strategies tailored to each product cluster. These strategies will be designed to accommodate the technical and financial constraints specific to each cluster. To address the optimization challenges associated with clustering algorithms, especially within complex search spaces, metaheuristic algorithms are considered as promising solutions. In this paper, Accelerated Particle Swarm Optimization (APSO) is harnessed for its exploratory capabilities, and Teaching Learning Based Algorithms (TLBO) are leveraged for their high exploitation competence. This innovative approach effectively combines the strengths of both algorithms, ensuring optimal clustering solutions in an efficient manner. The suggested approach outperforms the accuracy of the well-known metaheuristics including Grey Wolf Optimizer and the Whale Optimization Algorithm. This methodology successfully identifies five major clusters and assigns the appropriate procurement strategy to each cluster. The selection of a suitable procurement strategy for each product cluster significantly enhances overall procurement performance. This study introduces a powerful approach to assist managers in adapting procurement strategies for different product clusters. This approach has been implemented within organizations specializing in pharmaceutical freight and holds potential applicability across various product types. This innovation has the capacity to significantly impact and enhance global procurement performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
47. Reconstructing mobility from smartphone data: Empirical evidence of the effects of COVID-19 pandemic crisis on working and leisure.
- Author
-
Mourtakos, Vasileios, Mantouka, Eleni G., Fafoutellis, Panagiotis, Vlahogianni, Eleni I., and Kepaptsoglou, Konstantinos
- Subjects
- *
COVID-19 pandemic , *CHARGE carrier mobility , *SMARTPHONES , *TELECOMMUTING , *INTELLIGENT transportation systems , *COVID-19 , *RESEARCH personnel - Abstract
Identifying daily mobility patterns of commuters can be of great importance to researchers, authorities, operators and transportation service providers. Further, in the light of the global COVID-19 pandemic, understanding the changes in mobility patterns induced by the restrictions imposed on the general public may have a significant impact on how we conceptualize, design and operate future transportation systems. However, such analyses may require extensive mobility-related datasets that are very difficult and expensive to acquire using traditional data collection approaches. In this work, we propose a methodology to identify and characterize users' mobility patterns and augment the information captured by classical travel surveys. We apply this framework to anonymized raw smartphone sensor data gathered in Athens (Greece) to identify changes in mobility chains as an effect of COVID-19 restrictions. The methodological framework is based on a mixture of clustering and rule-based approaches. The proposed methodology is able to consistently detect the most frequently visited locations of each user, identify the related primary and secondary activities and, finally, construct their daily trip chains. Findings reveal that people travel less frequently, but for longer, especially on weekends. During lockdown periods, home-related activities significantly increased, both on weekdays and weekends. In addition, COVID-19 regulations resulted in a significant reduction of the spatial randomness of the conducted trips. Finally, work-study trip chains have seen evident growth, as an aftermath of tele-working and tele-studying regulations. • We propose methods for constructing and analyzing users' daily mobility trip chains. • We identify primary/secondary activities by a hybrid DBSCAN and K-Means approach. • We implement the framework to identify COVID-19 related trip chain changes. 
• Teleworking is detected by studying mobility before and after the pandemic outburst. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
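Entry 47 mentions a hybrid DBSCAN and K-Means approach for finding frequently visited locations. The abstract does not give parameters or say how the two algorithms are combined, so the sketch below covers only a plain DBSCAN pass over 2-D points, with made-up coordinates and hypothetical `eps` and `min_pts` values. Dense groups of location fixes become clusters (candidate visited places) and isolated fixes are marked as noise.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:     # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


if __name__ == "__main__":
    # Illustrative fixes: two dense places and one stray point.
    fixes = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (10, 0)]
    print(dbscan(fixes, eps=0.5, min_pts=3))  # → [0, 0, 0, 1, 1, 1, -1]
```

A density-based pass needs no preset number of places per user, which is a common reason to run it before a centroid-based step such as K-Means.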
48. Improving wireless sensor network lifespan with optimized clustering probabilities, improved residual energy LEACH and energy efficient LEACH for corner-positioned base stations
- Author
-
Tadele A. Abose, Venumadhav Tekulapally, Ketema T. Megersa, Diriba C. Kejela, Samuel T. Daka, and Kehali A. Jember
- Subjects
Clustering algorithms ,Cluster heads ,Corner-located base station ,Homogeneous network ,Wireless sensor networks ,Science (General) ,Q1-390 ,Social sciences (General) ,H1-99 - Abstract
The goal of this paper's novel energy-conscious routing method is to optimize energy usage and extend network lifespans using a new clustering probability. Versatile arrangements and a longer network lifespan (until the last node dies) are achieved through cluster-based routing strategies. Existing algorithms, such as low energy adaptive clustering hierarchy (LEACH), residual energy LEACH (RES-EL), and distributed residual energy LEACH (DIS-RES-EL), have been compared to the newly proposed algorithms: improved residual energy LEACH (IMP-RES-EL) and energy efficient LEACH (EEL). IMP-RES-EL and EEL outperform all other stated algorithms by extending the network lifespan, enhancing stability, increasing the number of aggregated data packets transmitted from cluster heads to the base station (BS), and selecting cluster heads with energy efficiency and optimal routing within the network. The proposed approaches outperform existing algorithms, particularly when every corner-located BS is considered in the wireless sensor network (WSN). The network lifespan in rounds increased by 36 %, the number of aggregated data packets from cluster heads to the BS increased by 44 %, and the efficiency of corner-located BSs improved by 20 %. Extensive simulations on five distinct topologies were reviewed and compared to the three techniques listed above, demonstrating the superiority of the proposed algorithms.
- Published
- 2024
- Full Text
- View/download PDF
49. An In-Depth Analysis of COVID-19 Symptoms Considering the Co-Occurrence of Symptoms Using Clustering Algorithms
- Author
-
Diego Javier Benito, Jesus Rufino Robles, Juan Ramirez, Antonio Fernandez Anta, and Jose Aguilar
- Subjects
COVID-19 symptoms, symptomatic patterns, machine learning, clustering algorithms, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
A comprehensive analysis of the COVID-19 pandemic is necessary to prepare for future healthcare challenges. In this regard, the large number of datasets collected during the pandemic has allowed various studies on disease behavior and characteristics. For example, collected datasets can be used to extract knowledge about the symptomatic behavior of the disease. In this work, we are interested in analyzing the relationships between the different symptoms of the disease, considering various dimensions, such as countries, variants of COVID-19, and age groups. To this end, we consider the co-occurrence of symptoms as a fundamental element. More precisely, we implemented clustering techniques to discover symptomatic patterns across the various dimensions. For instance, in analyzing the dominant patterns, we observe that symptom congestion or runny nose almost always appears with the symptom muscle pain across many dimensions. Hence, the information on symptom patterns can be helpful in decision-making processes to detect and combat COVID-19 and similar diseases.
- Published
- 2024
- Full Text
- View/download PDF
50. Optimized Cluster Routing Protocol With Energy-Sustainable Mechanisms for Wireless Sensor Networks
- Author
-
Tadele A. Abose, Venumadhav Tekulapally, Diriba C. Kejela, Ketema T. Megersa, Samuel T. Daka, and Kehali A. Jember
- Subjects
Base station, clustering algorithms, cluster heads, heterogeneous network, homogeneous network, routing protocol, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Clustering algorithms play a key role in decreasing energy consumption and increasing network longevity in wireless sensor networks. This work builds on previous homogeneous and heterogeneous algorithms, including low-energy adaptive clustering hierarchy (LEACH), distributed residual energy LEACH (DIS-RES-EL), residual energy LEACH (RES-EL), energy efficient LEACH (EEL), and stable election protocol (SEP), by introducing novel clustering methodologies. It introduces improved residual energy LEACH (IMP-RES-EL) and energy efficient stable election protocol (EE-SEP) to improve the energy savings of clustering algorithms in homogeneous and heterogeneous wireless sensor networks. The simulation results show that, in addition to prolonging network lifetime and achieving optimal routing, these methods transported more data packets from the cluster to sensor nodes and then to base stations than other techniques. When compared to the stable election protocol (SEP), the proposed energy-efficient stable election protocol (EE-SEP) influences the number of cluster heads formed over their lifetime, the network's stability, the number of nodes shipped off the base station from each cluster head, and the network's overall lifetime. When comparing the two current algorithms, EE-SEP and LEACH, for various topologies, the findings demonstrate that EE-SEP is the most energy-efficient routing protocol for extending the previously described qualities. This attribute has not been discussed thus far. The results also show that the IMP-RES-EL algorithm successfully increases network lifespan while minimizing energy dissipation and transmissions between sensor nodes and base stations or cluster heads (CHs). For all of the suggested homogeneous and heterogeneous algorithms, network lifetime in rounds rose by 36%, aggregated data packets from CHs to BS increased by 44%, and total data packets to BSs improved by 20%.
- Published
- 2024
- Full Text
- View/download PDF