46 results for "automatic clustering"
Search Results
2. Adaptive multi-model predictive control with optimal model bank formation: Consideration of local models uncertainty and stability
- Author
-
Fathi, Mohammad, Bolandi, Hossein, Vaghei, Bahman Ghorbani, and Ebadolahi, Saeid
- Published
- 2024
- Full Text
- View/download PDF
3. Dynamic Social Particle Swarm Optimization For Automatic Clustering
- Author
-
Amdouni, Hamida, Manita, Ghaith, Oliva, Diego, Houssein, Essam H., Korbaa, Ouajdi, and Zapotecas-Martínez, Saúl
- Published
- 2024
- Full Text
- View/download PDF
4. Ellipsoidal K-Means: An Automatic Clustering Approach for Non-Uniform Data Distributions.
- Author
-
Abdel-Hakim, Alaa E., Ibrahim, Abdel-Monem M., Bouazza, Kheir Eddine, Deabes, Wael, and Hedar, Abdel-Rahman
- Subjects
-
CLUSTERING algorithms, DATA distribution, K-means clustering, EUCLIDEAN distance, CLUSTER analysis (Statistics), SIMULATED annealing, CENTROID
- Abstract
Traditional K-means clustering assumes, to some extent, a uniform distribution of data around predefined centroids, which limits its effectiveness for many realistic datasets. In this paper, a new clustering technique, simulated-annealing-based ellipsoidal clustering (SAELLC), is proposed to automatically partition data into an optimal number of ellipsoidal clusters, a capability absent in traditional methods. SAELLC transforms each identified cluster into a hyperspherical cluster, where the diameter of the hypersphere equals the minor axis of the original ellipsoid, and the center is encoded to represent the entire cluster. During the assignment of points to clusters, local ellipsoidal properties are independently considered. For objective function evaluation, the method adaptively transforms these ellipsoidal clusters into a variable number of global clusters. Two objective functions are simultaneously optimized: one reflecting partition compactness using the silhouette function (SF) and Euclidean distance, and another addressing cluster connectedness through a nearest-neighbor algorithm. This optimization is achieved using a newly-developed multiobjective simulated annealing approach. SAELLC is designed to automatically determine the optimal number of clusters, achieve precise partitioning, and accommodate a wide range of cluster shapes, including spherical, ellipsoidal, and non-symmetric forms. Extensive experiments conducted on UCI datasets demonstrated SAELLC's superior performance compared to six well-known clustering algorithms. The results highlight its remarkable ability to handle diverse data distributions and automatically identify the optimal number of clusters, making it a robust choice for advanced clustering analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
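Editor's note: entry 4 above uses the silhouette function (SF) as one of its compactness objectives. As a point of reference only, the short Python sketch below shows the standard silhouette criterion scoring candidate partitions produced by plain k-means as k increases; it is not the SAELLC algorithm, and the dataset and range of k are illustrative assumptions.

```python
# Hedged sketch: silhouette-based selection of k with plain k-means,
# shown only to illustrate the compactness criterion named in the abstract.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.2, random_state=0)

scores = {}
for k in range(2, 9):                        # candidate numbers of clusters (assumed range)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # mean silhouette over all points

best_k = max(scores, key=scores.get)
print("silhouette per k:", scores)
print("k with best compactness/separation trade-off:", best_k)
```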
5. Exploring meta-heuristics for partitional clustering: methods, metrics, datasets, and challenges.
- Author
-
Kaur, Arvinder, Kumar, Yugal, and Sidhu, Jagpreet
- Abstract
Partitional clustering is a type of clustering that organizes the data into non-overlapping groups or clusters. This technique has diverse applications across various domains such as image processing, pattern recognition, data mining, rule-based systems, customer segmentation, image segmentation, and anomaly detection. Hence, this survey aims to identify the key concepts and approaches in partitional clustering. Further, it also highlights its widespread applicability, including major advantages and challenges. Partitional clustering faces challenges such as selecting the optimal number of clusters, local optima, and sensitivity to initial centroids. Therefore, this survey describes the clustering problems as partitional clustering, dynamic clustering, automatic clustering, and fuzzy clustering. The objective of this survey is to identify the meta-heuristic algorithms for the aforementioned clustering problems. Further, the meta-heuristic algorithms are categorised into simple meta-heuristic algorithms, improved meta-heuristic algorithms, and hybrid meta-heuristic algorithms. Hence, this work also focuses on the adoption of new meta-heuristic algorithms, the improvement of existing methods, and novel techniques that enhance clustering performance and robustness, making partitional clustering a critical tool for data analysis and machine learning. This survey also highlights the different objective functions and benchmark datasets adopted for measuring the effectiveness of clustering algorithms. Before the literature survey, several research questions are formulated to ensure the effectiveness and efficiency of the survey, such as: What are the various meta-heuristic techniques available for clustering problems? How can automatic data clustering be handled? What are the main reasons for hybridizing clustering algorithms? The survey identifies shortcomings associated with existing algorithms and clustering problems and highlights the active areas of research in the clustering field to overcome these limitations and improve performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. AENCIC: a method to estimate the number of clusters based on image complexity to be used in fuzzy clustering algorithms for image segmentation.
- Author
-
Madrid-Herrera, Luis, Chacon-Murguia, Mario I., and Ramirez-Quintana, Juan A.
- Subjects
-
IMAGE segmentation, DATABASES, FUZZY algorithms, ALGORITHMS
- Abstract
Image segmentation through fuzzy clustering has been widely used in diverse areas. However, most of those clustering algorithms require that some of their parameter values be determined manually. The number of clusters, C, is one of the most important parameters because it determines the number of regions to segment and directly affects the performance of the clustering algorithms. Some state-of-the-art general clustering methods determine C automatically. However, not all of them can be employed for image segmentation. Therefore, this paper describes the method automatic estimation of number of clusters by image complexity (AENCIC). AENCIC automatically estimates the best C needed by state-of-the-art clustering algorithms to segment an image, considering the image complexity perceived by humans. AENCIC was designed to work with fuzzy clustering algorithms employed to segment real-world images, because this kind of segmentation is an ill-defined problem in which the C required to attain a good segmentation varies widely from image to image. Results using the BSDS500 database demonstrate that using AENCIC to estimate C raises the performance of state-of-the-art fuzzy clustering image segmentation algorithms to up to 94% of their ideal maximum performance, allowing those algorithms to work without human intervention. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Comparison Between Single and Multi-objective Clustering Algorithms: MathE Case Study
- Author
-
Azevedo, Beatriz Flamia, Rocha, Ana Maria A. C., Fernandes, Florbela P., Pacheco, Maria F., Pereira, Ana I., Pereira, Ana I., editor, Fernandes, Florbela P., editor, Coelho, João P., editor, Teixeira, João P., editor, Lima, José, editor, Pacheco, Maria F., editor, Lopes, Rui P., editor, and Álvarez, Santiago T., editor
- Published
- 2024
- Full Text
- View/download PDF
8. Automatic Hyperspectral Image Clustering Using Qutrit Differential Evolution
- Author
-
Dutta, Tulika, Bhattacharyya, Siddhartha, Panigrahi, Bijaya Ketan, Platos, Jan, Snasel, Vaclav, Tan, Ying, editor, and Shi, Yuhui, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Combination of Cooperative Grouper Fish -- Octopus Algorithm and DBSCAN to Automatic Clustering
- Author
-
Balavand, Alireza, Kulkarni, Anand J, Section editor, Kulkarni, Anand J., editor, and Gandomi, Amir H., editor
- Published
- 2024
- Full Text
- View/download PDF
10. Improvement of DBSCAN Algorithm Involving Automatic Parameters Estimation and Curvature Analysis in 3D Point Cloud of Piled Pipe.
- Author
-
Pratama, Alfan Rizaldy, Bayu Dewantara, Bima Sena, Sari, Dewi Mutiara, and Pramadihanto, Dadet
- Subjects
PARAMETER estimation, CURVATURE, CLUSTER analysis (Statistics), ALGORITHMS, SIMPLICITY, POINT cloud
- Abstract
Bin-picking in industrial settings is a challenging task because the objects are piled in a box. The rapid development of 3D point cloud data for bin-picking has not fully addressed the robustness of handling objects in every piled configuration. Density-Based Spatial Clustering of Applications with Noise (DBSCAN), which clusters by density, still has disadvantages such as parameter tuning and ignoring the unique shape of an object. This paper proposes a solution that adds a curvature analysis at each data point to represent the shape of an object, hence called Curvature-Density-Based Spatial Clustering of Applications with Noise (CVR-DBSCAN). Our improvement uses curvature to analyze object shapes in different placements and automatically estimates parameters such as Eps and MinPts. The approach is divided into three algorithms, which we call Auto-DBSCAN, CVR-DBSCAN-Avg, and CVR-DBSCAN-Disc. Real-scanned Time-of-Flight camera datasets, separated into three piled conditions (well separated, well piled, and arbitrarily piled), are used to analyze all possibilities in placing objects. As a result, in the well-separated condition, Auto-DBSCAN leads in stability and accuracy at 99.67%, which ties with DBSCAN using manually specified parameters. For the well-piled condition, CVR-DBSCAN-Avg gives the highest stability, although its accuracy is matched by DBSCAN with specified parameters at 98.83%. Last, in the arbitrarily piled condition, although CVR-DBSCAN-Avg's accuracy is lower than DBSCAN's (73.17% compared to 80.43%), its stability is slightly higher with fewer outliers. Despite a computational time higher than the original DBSCAN, our improvement offers simplicity and a deeper analysis in scene understanding. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
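Editor's note: entry 10 above estimates DBSCAN's Eps and MinPts automatically, but the abstract does not give the exact estimator. The sketch below only illustrates the widely used k-distance ("knee") heuristic for choosing Eps on a synthetic point set, with MinPts fixed by a rule of thumb, as a rough analogue of what an auto-parameterized DBSCAN does; it is not the CVR-DBSCAN estimator.

```python
# Hedged sketch: the common k-distance heuristic for picking DBSCAN's eps.
# This is NOT the paper's CVR-DBSCAN estimator, just a generic auto-parameter idea.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import DBSCAN

X, _ = make_blobs(n_samples=400, centers=3, cluster_std=0.6, random_state=1)

min_pts = 2 * X.shape[1]                   # rule of thumb: MinPts ~ 2 * dimensionality
nn = NearestNeighbors(n_neighbors=min_pts).fit(X)
dist, _ = nn.kneighbors(X)                 # distances to the min_pts nearest neighbours
kdist = np.sort(dist[:, -1])               # sorted k-distance curve

# crude "knee": the point of largest jump on the sorted curve
eps = kdist[np.argmax(np.diff(kdist))]

labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
print("estimated eps:", round(float(eps), 3),
      "| clusters found:", len(set(labels)) - (1 if -1 in labels else 0))
```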
11. An efficient automatic clustering algorithm for probability density functions and its applications in surface material classification.
- Author
-
Nguyen‐Trang, Thao, Vo‐Van, Tai, and Che‐Ngoc, Ha
- Subjects
-
PROBABILITY density function, SURFACES (Technology), IMAGE recognition (Computer vision), ALGORITHMS, CUMULATIVE distribution function
- Abstract
Clustering is a technique used to partition a dataset into groups of similar elements. In addition to traditional clustering methods, clustering for probability density functions (CDF) has been studied to capture data uncertainty. In CDF, automatic clustering is a clever technique that can determine the number of clusters automatically. However, current automatic clustering algorithms update the new probability density function (pdf) $f_i(t)$ based on the weighted mean of all previous pdfs $f_j(t-1)$, $j=1,2,\dots,N$, resulting in slow convergence. This paper proposes an efficient automatic clustering algorithm for pdfs. In the proposed approach, the update of $f_i(t)$ is based on the weighted mean of $\{f_1(t), f_2(t), \dots, f_{i-1}(t), f_i(t-1), f_{i+1}(t-1), \dots, f_N(t-1)\}$, where $N$ is the number of pdfs and $i=1,2,\dots,N$. This technique allows for the incorporation of recently updated pdfs, leading to faster convergence. This paper also pioneers the applications of certain CDF algorithms in the field of surface image recognition. The numerical examples demonstrate that the proposed method can result in a rapid convergence at some early iterations. It also outperforms other state-of-the-art automatic clustering methods in terms of the Adjusted Rand Index and the Normalized Mutual Information. Additionally, the proposed algorithm proves to be competitive when clustering material images contaminated by noise. These results highlight the applicability of the proposed method in the problem of surface image recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
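Editor's note: as described in entry 11 above, the speed-up comes from letting pdf i's representative reuse pdfs already updated earlier in the same sweep. A compact way to write the two update rules is given below; the weights w_{ij} are not specified in the abstract and are shown here only as generic non-negative weights.

```latex
% Existing (Jacobi-style) update: every pdf uses only the previous sweep
f_i(t) \;=\; \frac{\sum_{j=1}^{N} w_{ij}\, f_j(t-1)}{\sum_{j=1}^{N} w_{ij}},
\qquad i = 1,\dots,N .

% Proposed (Gauss--Seidel-style) update: pdfs already refreshed in this sweep are reused
f_i(t) \;=\; \frac{\sum_{j<i} w_{ij}\, f_j(t) \;+\; \sum_{j \ge i} w_{ij}\, f_j(t-1)}
                  {\sum_{j=1}^{N} w_{ij}},
\qquad i = 1,\dots,N .
```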
12. Neighbor-Relationship-Based Adaptive Density Peak Clustering
- Author
-
Zhigang Su, Qian Gao, Jingtang Hao, Yue Wang, and Bing Han
- Subjects
Spatial clustering, density peak, uneven density, neighbor relationship, automatic clustering, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The Density Peak Clustering (DPC) algorithm encounters challenges such as difficulty in choosing cluster centers and the chain reaction caused by incorrect assignment of data points when clustering spatial datasets containing clusters with significant density differences or multi-peak clusters. To address these problems, in this paper, starting from enhancing the local density definition, optimizing the selection of cluster centers, and improving the assignment strategy of non-cluster center data points, an Adaptive DPC (NRA-DPC) algorithm is proposed based on the neighbor relationship. The NRA-DPC algorithm utilizes the reverse K-nearest neighbors of data points as the basis for defining the local density of data points and divides the spatial dataset into a core point set and a boundary point set based on the number of elements in the reverse K-nearest neighbor set of data points. The idea of iteration is adopted to select cluster centers from the core point set and assign non-cluster center data points, forming the initial clusters. For each initial cluster formed by the core point set, the corresponding minimum spanning tree (MST) is generated, and based on the average edge length of the MST, the assignment threshold of this cluster is set. The boundary point set completes the corresponding data point assignment task based on this assignment threshold and the mutual K-nearest neighbor relationship. Experimental results indicate that, compared with other typical clustering algorithms, the NRA-DPC algorithm can automatically select cluster centers, reduce the probability of incorrect assignment of non-cluster center data points, and effectively suppress the chain reaction triggered by incorrect assignment of non-cluster center data points, demonstrating more stable clustering performance when dealing with different datasets.
- Published
- 2024
- Full Text
- View/download PDF
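Editor's note: entry 12 above defines local density through reverse K-nearest neighbours (how many points count a given point among their own K nearest neighbours). A minimal sketch of that quantity follows, with K and the data purely illustrative; it does not reproduce the rest of the NRA-DPC pipeline (core/boundary split, MST thresholds).

```python
# Hedged sketch: reverse-kNN counts as a local-density proxy.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

K = 10                                            # neighbourhood size (assumed)
X, _ = make_blobs(n_samples=300, centers=3, random_state=2)

# kNN lists: row i holds the K nearest neighbours of point i (excluding itself)
nbrs = NearestNeighbors(n_neighbors=K + 1).fit(X)
_, idx = nbrs.kneighbors(X)
knn = idx[:, 1:]

# reverse-kNN density: rho[i] = number of points that have i in their kNN list
rho = np.zeros(len(X), dtype=int)
for neighbours in knn:
    rho[neighbours] += 1

print("min/median/max reverse-kNN density:",
      rho.min(), int(np.median(rho)), rho.max())
```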
13. Adaptive clustering algorithm based on improved marine predation algorithm and its application in bearing fault diagnosis
- Author
-
Zhuanzhe Zhao, Mengxian Wang, Yongming Liu, Zhibo Liu, Yuelin Lu, Yu Chen, and Zhijian Tu
- Subjects
fault diagnosis, automatic clustering, cluster validity index, marine predator algorithm, k-means clustering, Mathematics, QA1-939, Applied mathematics. Quantitative methods, T57-57.97
- Abstract
In cluster analysis, determining the number of clusters is an important issue because there is less information about the most appropriate number of clusters in the real problem. Automatic clustering is a clustering method that automatically finds the most appropriate number of clusters and divides instances into the corresponding clusters. In this paper, a novel automatic clustering algorithm based on the improved marine predator algorithm (IMPA) and K-means algorithm is proposed. The new IMPA utilizes refracted opposition-based learning in population initialization, generates opposite solutions to improve the diversity of the population and produces more accurate solutions. In addition, the sine-cosine algorithm is incorporated to balance global exploration and local development of the algorithm for dynamic updating of the predator and prey population positions. At the same time, the Gaussian-Cauchy mutation is combined to improve the probability of obtaining the globally optimal solution. The proposed IMPA is validated with some benchmark data sets. The calculation results show that IMPA is superior to the original MPA in automatic clustering. In addition, IMPA is also used to solve the problem of fault classification of Xi'an Jiaotong University bearing data. The results show that the IMPA has better and more stable results than other algorithms such as the original MPA, whale optimization algorithm, fuzzy C-means and K-means in automatic clustering.
- Published
- 2023
- Full Text
- View/download PDF
14. An automatic density peaks clustering based on a density-distance clustering index
- Author
-
Xiao Xu, Hong Liao, and Xu Yang
- Subjects
dpc algorithm, automatic clustering, decision graph, optimal number of clusters, parameter selection, Mathematics, QA1-939
- Abstract
The density peaks clustering (DPC) algorithm plays an important role in data mining by quickly identifying cluster centers using decision graphs to identify arbitrary clusters. However, the decision graph introduces uncertainty in determining the cluster centers, which can result in an incorrect number of clusters. In addition, the cut-off distance parameter relies on prior knowledge, which poses a limitation. To address these issues, we propose an improved automatic density peaks clustering (ADPC) algorithm. First, a novel clustering validity index called density-distance clustering (DDC) is introduced. The DDC index draws inspiration from the density and distance characteristics of cluster centers, which is applicable to DPC and aligns with the general definition of clustering. Based on the DDC index, the ADPC algorithm automatically selects the suitable cut-off distance and acquires the optimal number of clusters without additional parameters. Numerical experimental results validate that the introduced ADPC algorithm successfully automatically determines the optimal number of clusters and cut-off distance, significantly outperforming DPC, AP and DBSCAN algorithms.
- Published
- 2023
- Full Text
- View/download PDF
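Editor's note: the ADPC method in entry 14 above builds on the standard density peaks decision graph. For readers unfamiliar with it, the sketch below computes the two classic decision-graph quantities on toy data: local density rho (cut-off kernel) and the distance delta to the nearest denser point. The DDC index and the automatic cut-off selection of the paper are not reproduced, and the 2nd-percentile choice of the cut-off distance is only a common default.

```python
# Hedged sketch: the rho/delta quantities behind a density-peaks decision graph.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=3)
D = cdist(X, X)

d_c = np.percentile(D[D > 0], 2)           # cut-off distance (2nd percentile, a common default)
rho = (D < d_c).sum(axis=1) - 1            # cut-off-kernel local density (exclude self)

delta = np.empty(len(X))
for i in range(len(X)):
    denser = np.where(rho > rho[i])[0]     # points with strictly higher density
    delta[i] = D[i].max() if denser.size == 0 else D[i, denser].min()

gamma = rho * delta                        # large gamma -> likely cluster centre
print("indices of the 3 strongest centre candidates:", np.argsort(gamma)[-3:])
```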
15. Unsupervised optimal model bank for multiple model control systems: Genetic-based automatic clustering approach
- Author
-
Mohammad Fathi and Hossein Bolandi
- Subjects
Multiple model control, Automatic clustering, Genetic algorithm, Optimal model bank, Science (General), Q1-390, Social sciences (General), H1-99
- Abstract
In the Multiple Model Control (MMC) strategies, a bank of simple local models is used to describe the behavior of complex systems with vast operation space. In this approach, the system operation space is divided into several subspaces, and in each subspace, a simple local model is assigned to describe the system behavior. This study addresses the two main challenges in this field which involve determining the optimal number of required local models to form the model bank and identifying the optimal distribution of the local models across the system operation space. Providing appropriate answers to these questions directly affects the performance of the MMC system. In this paper, GA-based automatic clustering method is suggested to form an optimal model bank. In this regard, an appropriate mapping is established between the concepts of MMC and automatic clustering, and a novel unsupervised algorithm is designed to determine the optimal model bank. Unlike the existing methods in the literature, the proposed method can form the global optimal model bank without entrapment into local optima regardless of the initial conditions of the used search algorithm. In this paper, the formation of the optimal model bank using the proposed method is investigated by considering the spacecraft attitude dynamics as a complex, MIMO, non-linear case study and its satisfactory and promising performance is demonstrated.
- Published
- 2024
- Full Text
- View/download PDF
16. A quantum inspired differential evolution algorithm for automatic clustering of real life datasets.
- Author
-
Dey, Alokananda, Bhattacharyya, Siddhartha, Dey, Sandip, Platos, Jan, and Snasel, Vaclav
- Abstract
In recent years, Quantum Inspired Metaheuristic algorithms have emerged as promising due to their efficiency, robustness and faster computational capability. In this paper, a novel Quantum Inspired Differential Evolution (QIDE) algorithm has been presented for automatic clustering of unlabeled datasets. In automatic clustering, the datasets are clustered into an optimal number of groups on the run without any a priori knowledge of the datasets. In this work, the proposed algorithm has been compared with two other quantum inspired algorithms, viz., the Fast Quantum Inspired Evolutionary Clustering Algorithm (FQEA) and the Quantum Evolutionary Algorithm for Data Clustering (QEAC), a Classical Differential Evolution (CDE) algorithm with different mutation probabilities and an Improved Differential Evolution (IDE) algorithm. The experiments have been conducted on six real life publicly available datasets to identify the optimal number of clusters. By introducing some concepts of quantum gates, the proposed algorithm not only achieves good convergence speed but also provides better results than other competitive algorithms. In addition, Sobol's sensitivity analysis has been conducted for tuning the parameters of the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Swarm based automatic clustering using nature inspired Emperor Penguins Colony algorithm.
- Author
-
Harifi, Sasan, Khalilian, Madjid, and Mohammadzadeh, Javad
- Abstract
Nature acts as a source of concepts, mechanisms, and principles for designing artificial computing systems to deal with complex computational problems. Most heuristic and metaheuristic algorithms are taken from the behavior of biological systems or physical systems in nature. Clustering is the process of grouping a set of data and putting it in a class of similar examples. Since the clustering problem is an NP-hard problem, using metaheuristics can be an appropriate tool to deal with these issues. Indeed, clustering is a special case of an optimization problem. In classic clustering, knowing the number of clusters is required before clustering. This paper presents an algorithm that requires no prior knowledge to classify the data. In this paper, we proposed a swarm-based Emperor Penguins Colony (EPC) algorithm to solve both classic and automatic clustering problems. The proposed approach is compared with six state-of-the-art, popular, and improved nature-inspired algorithms, a partitioning-based heuristic algorithm, and a hierarchical clustering method on ten real-world datasets. The results show that classic and automatic clustering using the EPC algorithm has better performance in comparison with other competing algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. Automatic Clustering of Hyperspectral Images Using Quantum Reptile Search Algorithm
- Author
-
Dutta, Tulika, Bhattacharyya, Siddhartha, Panigrahi, Bijaya Ketan, Hassanien, Aboul Ella, Hassanien, Aboul Ella, editor, Zheng, Dequan, editor, Zhao, Zhijie, editor, and Fan, Zhipeng, editor
- Published
- 2023
- Full Text
- View/download PDF
19. Calibration of hydrological models for ungauged catchments by automatic clustering using a differential evolution algorithm: The Gorganrood river basin case study
- Author
-
Zahra Alizadeh and Jafar Yazdi
- Subjects
automatic clustering, differential evolution (de), gorganrood river basin, hydrologic model calibration, swmm, ungauged catchments, Information technology, T58.5-58.64, Environmental technology. Sanitary engineering, TD1-1066
- Abstract
The hydrological model calibration is a challenging task, especially in ungauged catchments. The regionalization calibration methods can be used to estimate the parameters of the model in ungauged sub-catchments. In this article, the model of ungauged sub-catchments is calibrated by a regionalization approach based on automatic clustering. Under the clustering procedure, gauged and ungauged sub-catchments are grouped based on their physical characteristics and similarity. The optimal number of clusters is determined using an automatic differential evolution algorithm-based clustering. Considering the five clusters obtained, the value of the silhouette measure is equal to 0.56, which is an acceptable value for goodness of clustering. The calibration process is conducted according to minimizing errors in simulated peak flow and total flow volume. The Storm Water Management Model is applied to calibrate a set of 53 sub-catchments in the Gorganrood river basin. Comparing graphically and statistically simulated and observed runoff values and also calculating the value of the silhouette coefficient demonstrate that the proposed methodology is a promising approach for hydrological model calibration in ungauged catchments. HIGHLIGHTS: The model of ungauged sub-catchments is calibrated by a regionalization approach based on automatic clustering. The optimal number of clusters is determined using an automatic differential evolution algorithm-based clustering. Comparing graphically and statistically simulated and observed runoff values and also calculating the value of the silhouette coefficient proved the superiority of automatic clustering differential evolution in clustering.
- Published
- 2023
- Full Text
- View/download PDF
20. An efficient robust automatic clustering algorithm for interval data.
- Author
-
Vo-Van, Tai, Ngoc, Lethikim, and Nguyen-Trang, Thao
- Subjects
-
INTERVAL analysis, ALGORITHMS, CLUSTER analysis (Statistics), RESEARCH personnel, OUTLIER detection, DATA analysis, FUZZY algorithms
- Abstract
In recent years, clustering analysis for interval data has attracted the attention of many researchers. Nevertheless, an algorithm that can automatically determine the number of clusters and, at the same time, effectively detect the outlier intervals has not been studied so far. Therefore, in this paper, we propose a robust automatic clustering algorithm that can not only automatically determine the number of clusters but also assign the outlier intervals to separate clusters. The proposed algorithm is then applied in detecting abnormal images, consisting of new image categories and images contaminated with noise. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. Globally automatic fuzzy clustering for probability density functions and its application for image data.
- Author
-
Nguyen-Trang, Thao, Nguyen-Thoi, Trung, and Vo-Van, Tai
- Subjects
PROBABILITY density function, FUZZY algorithms, DIFFERENTIAL evolution, IMAGE recognition (Computer vision)
- Abstract
Clustering for probability density functions (CDF) can be categorized as non-fuzzy and fuzzy approaches. Regarding the second approach, the iterative refinement technique has been used for searching the optimal partition. This method could be easily trapped at a local optimum. In order to find the global optimum, a meta-heuristic optimization (MO) algorithm must be incorporated into the fuzzy CDF problem. However, no research utilizing MO to solve the fuzzy CDF problem has been proposed so far due to the lack of a reasonable encoding for converting a fuzzy clustering solution to a chromosome. To address this shortcoming, a new definition called Gaussian prototype is defined first. This type of prototype is capable of accurately representing the cluster without being overly complex. As a result, prototypes' information can be easily integrated into the chromosome via a novel prototype-based encoding method. Second, a new objective function is introduced to evaluate a fuzzy CDF solution. Finally, Differential Evolution (DE) is used to determine the optimal solution for fuzzy clustering. The proposed method, namely DE-AFCF, is the first to propose a globally automatic fuzzy CDF algorithm, which not only can automatically determine the number of clusters k but also can search for the optimal fuzzy partition matrix by taking into account both clustering compactness and separation. The DE-AFCF is also applied in some image clustering problems, such as processed image detection, and traffic image recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
22. Dynamic Kernel Clustering by Spider Monkey Optimization Algorithm.
- Author
-
Patel, Vaishali P. and Vishwamitra, L. K.
- Subjects
-
OPTIMIZATION algorithms, MEMETICS, SWARM intelligence, METAHEURISTIC algorithms, DOCUMENT clustering, GAUSSIAN distribution, CLUSTER analysis (Statistics)
- Abstract
In data analysis, clustering plays a major role. In the past decade, a variety of clustering algorithms have been proposed and have produced good results. However, many of them require prior information on the number of clusters and fail to produce optimal results when such information is not available. In real-life problems, it is difficult to predict the number of clusters due to the complexity of data in shape and dimensionality. Therefore, predicting the number of clusters is a difficult task, and this draws the attention of many researchers. In this work, we propose DKCSMO, dynamic kernel clustering with a spider monkey optimization algorithm. For better clustering results, the local leader phase of the spider monkey optimization algorithm is improved with a neighborhood search strategy. Further, to improve the quality of results, we modify the CS-index with a Gaussian kernel distribution. The proposed algorithm is compared with five well-known meta-heuristic algorithms and seven previously published automatic clustering algorithms. Experimental results show that the proposed algorithm produces better results in terms of the predicted clusters and the DB, SIL, and ARI measures. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
23. Automatic Clustering for Improved Radio Environment Maps in Distributed Applications.
- Author
-
Ben Chikha, Haithem and Alaerjan, Alaa
- Subjects
STANDARD deviations, SMART cities, K-means clustering, WIRELESS communications, TELECOMMUNICATION
- Abstract
Wireless communication greatly contributes to the evolution of new technologies, such as the Internet of Things (IoT) and edge computing. The new generation networks, including 5G and 6G, provide several connectivity advantages for multiple applications, such as smart health systems and smart cities. Adopting wireless communication technologies in these applications is still challenging due to factors such as mobility and heterogeneity. Predicting accurate radio environment maps (REMs) is essential to facilitate connectivity and improve resource utilization. The construction of accurate REMs through the prediction of reference signal received power (RSRP) can be useful in densely distributed applications, such as smart cities. However, predicting an accurate RSRP in the applications can be complex due to intervention and mobility aspects. Given the fact that the propagation environments can be different in a specific area of interest, the estimation of a common path loss exponent for the entire area produces errors in the constructed REM. Hence, it is necessary to use automatic clustering to distinguish between different environments by grouping locations that exhibit similar propagation characteristics. This leads to better prediction of the propagation characteristics of other locations within the same cluster. Therefore, in this work, we propose using the Kriging technique, in conjunction with the automatic clustering approach, in order to improve the accuracy of RSRP prediction. In fact, we adopt K-means clustering (KMC) to enhance the path loss exponent estimation. We use a dataset to test the proposed model using a set of comparative studies. The results showed that the proposed approach provides significant RSRP prediction capabilities for constructing REM, with a gain of about 3.3 dB in terms of root mean square error compared to the case without clustering. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
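Editor's note: entry 23 above groups measurement locations with K-means so that a path loss exponent can be fitted per cluster instead of once for the whole area. A rough sketch of that idea, assuming the simple log-distance model RSRP = P0 - 10*n*log10(d) + noise and synthetic measurements (the paper's actual features and the Kriging step are not reproduced), is shown below.

```python
# Hedged sketch: per-cluster path loss exponent estimation after K-means clustering,
# using the generic log-distance model (an assumption, not the paper's exact model).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# synthetic measurements: distance to the base station [m] and RSRP [dBm]
d = rng.uniform(50, 2000, size=600)
true_n = np.where(d < 800, 2.0, 3.5)               # two different propagation environments
rsrp = -40 - 10 * true_n * np.log10(d) + rng.normal(0, 2, size=600)

# cluster the locations (here: 1-D distance as a stand-in for position features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(d.reshape(-1, 1))

for c in np.unique(labels):
    m = labels == c
    # least-squares fit of RSRP = P0 - 10*n*log10(d): the slope equals -10*n
    slope, p0 = np.polyfit(np.log10(d[m]), rsrp[m], 1)
    print(f"cluster {c}: fitted path loss exponent n ~ {-slope / 10:.2f}")
```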
24. A Review of Quantum-Inspired Metaheuristic Algorithms for Automatic Clustering.
- Author
-
Dey, Alokananda, Bhattacharyya, Siddhartha, Dey, Sandip, Konar, Debanjan, Platos, Jan, Snasel, Vaclav, Mrsic, Leo, and Pal, Pankaj
- Subjects
-
QUANTUM computing, ALGORITHMS, QUANTUM computers
- Abstract
In real-world scenarios, identifying the optimal number of clusters in a dataset is a difficult task due to insufficient knowledge. Therefore, the indispensability of sophisticated automatic clustering algorithms for this purpose has been contemplated by some researchers. Several automatic clustering algorithms assisted by quantum-inspired metaheuristics have been developed in recent years. However, the literature lacks definitive documentation of the state-of-the-art quantum-inspired metaheuristic algorithms for automatically clustering datasets. This article presents a brief overview of the automatic clustering process to establish the importance of making the clustering process automatic. The fundamental concepts of the quantum computing paradigm are also presented to highlight the utility of quantum-inspired algorithms. This article thoroughly analyses some algorithms employed to address the automatic clustering of various datasets. The reviewed algorithms were classified according to their main sources of inspiration. In addition, some representative works of each classification were chosen from the existing works. Thirty-six such prominent algorithms were further critically analysed based on their aims, used mechanisms, data specifications, merits and demerits. Comparative results based on the performance and optimal computational time are also presented to critically analyse the reviewed algorithms. As such, this article promises to provide a detailed analysis of the state-of-the-art quantum-inspired metaheuristic algorithms, while highlighting their merits and demerits. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
25. Automatic clustering of colour images using quantum inspired meta-heuristic algorithms.
- Author
-
Dey, Alokananda, Bhattacharyya, Siddhartha, Dey, Sandip, Platos, Jan, and Snasel, Vaclav
- Subjects
PARTICLE swarm optimization, COLOR image processing, METAHEURISTIC algorithms, QUANTUM computers, EVOLUTIONARY algorithms, QUANTUM computing, DIFFERENTIAL evolution, COLOR
- Abstract
This work explores the effectiveness and robustness of quantum computing by conjoining the principles of quantum computing with the conventional computational paradigm for the automatic clustering of colour images. In order to develop such a computationally efficient algorithm, two population-based meta-heuristic algorithms, viz., Particle Swarm Optimization (PSO) algorithm and Enhanced Particle Swarm Optimization (EPSO) algorithm have been consolidated with the quantum computing framework to yield the Quantum Inspired Particle Swarm Optimization (QIPSO) algorithm and the Quantum Inspired Enhanced Particle Swarm Optimization (QIEPSO) algorithm, respectively. This paper also presents a comparison between the proposed quantum inspired algorithms with their corresponding classical counterparts and also with three other evolutionary algorithms, viz., Artificial Bee Colony (ABC), Differential Evolution (DE) and Covariance Matrix Adaption Evolution Strategies (CMA-ES). In this paper, twenty different sized colour images have been used for conducting the experiments. Among these twenty images, ten are Berkeley images and ten are real life colour images. Three cluster validity indices, viz., PBM, CS-Measure (CSM) and Dunn index (DI) have been used as objective functions for measuring the effectiveness of clustering. In addition, in order to improve the performance of the proposed algorithms, some participating parameters have been adjusted using the Sobol's sensitivity analysis test. Four segmentation evaluation metrics have been used for quantitative evaluation of the proposed algorithms. The effectiveness and efficiency of the proposed quantum inspired algorithms have been established over their conventional counterparts and the three other competitive algorithms with regards to optimal computational time, convergence rate and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. CVIK: A Matlab-based cluster validity index toolbox for automatic data clustering
- Author
-
Adán José-García and Wilfrido Gómez-Flores
- Subjects
Clustering, Cluster validity index, Automatic clustering, Computer software, QA76.75-76.765
- Abstract
We present CVIK, a Matlab-based toolbox for assisting the process of cluster analysis applications. This toolbox implements 28 cluster validity indices (CVIs) for measuring clustering quality, making them available to data scientists, researchers, and practitioners. CVIK facilitates implementing the entire pipeline of automatic clustering in two approaches: (i) evaluating candidate clustering solutions from classical algorithms, in which the number of clusters increases gradually, and (ii) assessing potential solutions in evolutionary clustering algorithms using single- and multi-objective optimization methods. This toolbox also implements distinct proximity measures to estimate data similarity, and the CVIs are capable of processing both feature data and relational data. The source code and examples can be found in this GitHub repository: https://github.com/adanjoga/cvik-toolbox.
- Published
- 2023
- Full Text
- View/download PDF
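Editor's note: CVIK itself is a Matlab toolbox, so its API is not reproduced here. The Python sketch below only mirrors the first of the two pipelines the abstract mentions: run a classical algorithm while the number of clusters increases gradually and score each candidate partition with cluster validity indices. The data, the range of k, and the choice of indices are assumptions.

```python
# Hedged sketch of the "increase k and score with CVIs" pipeline (not the CVIK API).
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, calinski_harabasz_score

X, _ = make_blobs(n_samples=600, centers=5, random_state=4)

for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    db = davies_bouldin_score(X, labels)        # lower is better
    ch = calinski_harabasz_score(X, labels)     # higher is better
    print(f"k={k:2d}  Davies-Bouldin={db:6.3f}  Calinski-Harabasz={ch:9.1f}")
```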
27. Balance-driven automatic clustering for probability density functions using metaheuristic optimization.
- Author
-
Nguyen-Trang, Thao, Nguyen-Thoi, Trung, Nguyen-Thi, Kim-Ngan, and Vo-Van, Tai
- Abstract
For solving the clustering for probability density functions (CDF) problem with a given number of clusters, the metaheuristic optimization (MO) algorithms have been widely studied because of their advantages in searching for the global optimum. However, the existing approaches cannot be directly extended to the automatic CDF problem for determining the number of clusters k. Besides, balance-driven clustering, an essential research direction recently developed in the problem of discrete-element clustering, has not been considered in the field of CDF. This paper pioneers a technique to apply an MO algorithm for resolving the balance-driven automatic CDF. The proposed method not only can automatically determine the number of clusters but also can approximate the global optimal solution in which both the clustering compactness and the clusters' size similarity are considered. The experiments on one-dimensional and multidimensional probability density functions demonstrate that the new method possesses higher quality clustering solutions than the other conventional techniques. The proposed method is also applied in analyzing the difficulty levels of entrance exam questions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. A novel density deviation multi-peaks automatic clustering algorithm.
- Author
-
Zhou, Wei, Wang, Limin, Han, Xuming, Parmar, Milan, and Li, Mingyang
- Subjects
POINT processes, ALGORITHMS
- Abstract
The density peaks clustering (DPC) algorithm is a classical and widely used clustering method. However, the DPC algorithm requires manual selection of cluster centers, uses a single way of calculating density, and cannot effectively handle low-density points. To address the above issues, we propose a novel density deviation multi-peaks automatic clustering method (AmDPC) in this paper. Firstly, we propose a new local density and use the deviation to measure the relationship between data points and the cut-off distance ($d_c$). Secondly, we divide the density deviation into multiple density levels equally and extract the points with higher distances in each density level. Finally, for the multi-peak points with higher distances at low-density levels, we merge them according to the size difference of the density deviation. We finally achieve the overall automatic clustering by processing the low-density points. To verify the performance of the method, we test the synthetic dataset, the real-world dataset, and the Olivetti Face dataset, respectively. The simulation experimental results indicate that the AmDPC method can handle low-density points more effectively and has certain effectiveness and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. A fault diagnosis framework using unlabeled data based on automatic clustering with meta-learning.
- Author
-
Zhao, Zhiqian, Jiao, Yinghou, Xu, Yeyin, Chen, Zhaobo, and Zio, Enrico
- Subjects
-
DATA augmentation, MACHINE learning, INTERNET of things, DIAGNOSIS methods, PROBLEM solving, CASCADE control
- Abstract
With the growth of the industrial internet of things, the poor performance of conventional deep learning models hinders the application of intelligent diagnosis methods in industrial situations such as a lack of fault samples and difficulties in data labeling. To solve the above problems, we propose a fault diagnosis framework based on unsupervised meta-learning and contrastive learning, called automatic clustering with meta-learning (ACML). First, the amount of data is expanded through data augmentation approaches, and a feature generator is constructed to extract highly discriminative features from the unlabeled dataset using contrastive learning. Then, a cluster generator is used to automatically divide cluster partitions and add pseudo-labels to them. Finally, the classification tasks are derived by taking original samples in the partitions, which are embedded in the meta-learner for fault diagnosis. In the meta-learning stage, we split two subsets out of each task and feed them into the inner and outer loops to maintain the class consistency of the real labels. After training, ACML transfers its prior expertise to the unseen task to efficiently complete the categorization of new faults. ACML is applied to two cases concerning a public dataset and a self-constructed dataset; the results demonstrate that ACML achieves good diagnostic performance, outperforming popular unsupervised methods. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
30. A novel density deviation multi-peaks automatic clustering algorithm
- Author
-
Wei Zhou, Limin Wang, Xuming Han, Milan Parmar, and Mingyang Li
- Subjects
Automatic clustering, Density peaks clustering, Density deviation, Low-density points, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
- Abstract
The density peaks clustering (DPC) algorithm is a classical and widely used clustering method. However, the DPC algorithm requires manual selection of cluster centers, uses a single way of calculating density, and cannot effectively handle low-density points. To address the above issues, we propose a novel density deviation multi-peaks automatic clustering method (AmDPC) in this paper. Firstly, we propose a new local density and use the deviation to measure the relationship between data points and the cut-off distance ($d_c$). Secondly, we divide the density deviation into multiple density levels equally and extract the points with higher distances in each density level. Finally, for the multi-peak points with higher distances at low-density levels, we merge them according to the size difference of the density deviation. We finally achieve the overall automatic clustering by processing the low-density points. To verify the performance of the method, we test the synthetic dataset, the real-world dataset, and the Olivetti Face dataset, respectively. The simulation experimental results indicate that the AmDPC method can handle low-density points more effectively and has certain effectiveness and robustness.
- Published
- 2022
- Full Text
- View/download PDF
31. A hybrid genetic-fuzzy ant colony optimization algorithm for automatic K-means clustering in urban global positioning system.
- Author
-
Ran, Xiaojuan, Suyaroj, Naret, Tepsan, Worawit, Ma, Jianghong, Zhou, Xiangbing, and Deng, Wu
- Subjects
-
GLOBAL Positioning System, K-means clustering, ANT algorithms
- Published
- 2024
- Full Text
- View/download PDF
32. Improved SOSK-Means Automatic Clustering Algorithm with a Three-Part Mutualism Phase and Random Weighted Reflection Coefficient for High-Dimensional Datasets.
- Author
-
Ikotun, Abiodun M. and Ezugwu, Absalom E.
- Subjects
FUZZY algorithms, CENTROID, METAHEURISTIC algorithms, REFLECTANCE, K-means clustering, CENTRAL limit theorem, OUTLIER detection, SEARCH algorithms
- Abstract
Automatic clustering problems require clustering algorithms to automatically estimate the number of clusters in a dataset. However, the classical K-means requires the specification of the required number of clusters a priori. To address this problem, metaheuristic algorithms are hybridized with K-means to extend the capacity of K-means in handling automatic clustering problems. In this study, we proposed an improved version of an existing hybridization of the classical symbiotic organisms search algorithm with the classical K-means algorithm to provide robust and optimum data clustering performance in automatic clustering problems. Moreover, the classical K-means algorithm is sensitive to noisy data and outliers; therefore, we proposed the exclusion of outliers from the centroid update's procedure, using a global threshold of point-to-centroid distance distribution for automatic outlier detection, and subsequent exclusion, in the calculation of new centroids in the K-means phase. Furthermore, a self-adaptive benefit factor with a three-part mutualism phase is incorporated into the symbiotic organism search phase to enhance the performance of the hybrid algorithm. A population size of 40 + 2 g was used for the symbiotic organism search (SOS) algorithm for a well distributed initial solution sample, based on the central limit theorem that the selection of the right sample size produces a sample mean that approximates the true centroid on Gaussian distribution. The effectiveness and robustness of the improved hybrid algorithm were evaluated on 42 datasets. The results were compared with the existing hybrid algorithm, the standard SOS and K-means algorithms, and other hybrid and non-hybrid metaheuristic algorithms. Finally, statistical and convergence analysis tests were conducted to measure the effectiveness of the improved algorithm. The results of the extensive computational experiments showed that the proposed improved hybrid algorithm outperformed the existing SOSK-means algorithm and demonstrated superior performance compared to some of the competing hybrid and non-hybrid metaheuristic algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
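Editor's note: one concrete piece of entry 32 above is easy to illustrate: excluding outliers from the centroid update using a global threshold on the point-to-centroid distance distribution. The sketch below applies a mean-plus-two-standard-deviations cut-off inside a plain Lloyd iteration; the paper's actual threshold rule and its SOS phase are not reproduced, so treat the cut-off as an assumption.

```python
# Hedged sketch: k-means centroid update that ignores points flagged as outliers
# by a global threshold on the point-to-centroid distance distribution.
import numpy as np
from sklearn.datasets import make_blobs

def kmeans_outlier_excluded(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        dist_to_own = d[np.arange(len(X)), labels]
        cutoff = dist_to_own.mean() + 2 * dist_to_own.std()   # assumed global threshold
        keep = dist_to_own <= cutoff                          # outliers excluded from the update
        for c in range(k):
            members = keep & (labels == c)
            if members.any():
                centroids[c] = X[members].mean(axis=0)
    return labels, centroids

X, _ = make_blobs(n_samples=300, centers=3, random_state=5)
X = np.vstack([X, [[25.0, 25.0], [-25.0, 25.0]]])             # a couple of gross outliers
labels, C = kmeans_outlier_excluded(X, k=3)
print("final centroids:\n", np.round(C, 2))
```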
33. Automatic clustering based on dynamic parameters harmony search optimization algorithm.
- Author
-
Zhu, Qidan, Tang, Xiangmeng, and Elahi, Ahsan
- Subjects
-
SEARCH algorithms, MATHEMATICAL optimization, K-means clustering, IMAGE segmentation, PROBLEM solving
- Abstract
As a typical unsupervised learning technique, clustering has been widely applied. However, in many cases, prior information about the number of clusters is unknown, so how to determine it automatically during clustering is getting more attention. In this article, a method named automatic clustering based on dynamic parameters harmony search optimization algorithm, i.e., AC-DPHS, is proposed to solve this problem. By improving the basic harmony search (HS), the dynamic parameters harmony search (DPHS) is devised, which makes the parameters change dynamically without pre-definition. The AC-DPHS takes advantage of the merits of both DPHS and K-means clustering and can determine the optimal number of clusters automatically. A comprehensive experiment is carried out to evaluate the performance of AC-DPHS. The results illustrate that the AC-DPHS variant generated by using the PBM validity index as its fitness function is relatively superior, and it outperforms other recently developed approaches in real-life data clustering as well as grayscale image segmentation. Consequently, the method explained in this article is effective and practical, and can be considered a new automatic clustering scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
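Editor's note: the fitness function entry 33 above settles on is the PBM validity index. For reference, its usual definition for a crisp partition into K clusters with centroids $c_k$ and global centroid $c$ is, to the best of our knowledge:

```latex
\mathrm{PBM}(K) \;=\; \left( \frac{1}{K} \cdot \frac{E_1}{E_K} \cdot D_K \right)^{2},
\quad\text{where}\quad
E_K = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - c_k \rVert,
\quad
E_1 = \sum_{x} \lVert x - c \rVert,
\quad
D_K = \max_{1 \le i < j \le K} \lVert c_i - c_j \rVert .
```

Larger PBM values indicate a better partition, which is why it can serve directly as a maximization objective in the harmony search.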
34. Consensus Nature Inspired Clustering of Single-Cell RNA-Sequencing Data
- Author
-
Amany H. Abou El-Naga, Sabah Sayed, Akram Salah, and Heba Mohsen
- Subjects
Single-cell RNA-seq, automatic clustering, unsupervised learning, swarm intelligence, metaheuristic algorithms, consensus clustering, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Single-cell RNA sequencing (scRNA-seq) enables quantification of mRNA expression at the level of individual cells. scRNA-seq uncovers the disparity of cellular heterogeneity, giving insights about the expression profiles of distinct cells and revealing cellular differentiation. The rapid advancements in scRNA-seq technologies enable researchers to explore questions regarding cancer heterogeneity and the tumor microenvironment. The process of analyzing, and in particular clustering, scRNA-seq data is computationally challenging due to its noisy, high-dimensional nature. In this paper, a computational clustering approach is proposed to cluster scRNA-seq data based on consensus clustering using swarm intelligent optimization algorithms to accurately recognize cell subtypes. The proposed approach uses variational auto-encoders to handle the curse of dimensionality, as it operates to create a latent biologically relevant feature space representing the original data. The new latent space is then clustered using the Particle Swarm Optimization Algorithm, the Multi-Verse Optimization Algorithm and the Grey Wolf Optimization Algorithm. A consensus solution is found using the solutions returned by the swarm intelligent algorithms. The proposed approach automatically derives the number of clusters without any prior knowledge. To evaluate the performance of the proposed approach, a total of four datasets have been used and a comparison against the existing methods in the literature has been performed. Experimental results show that the proposed approach performs better than the most widely used tools, achieving an adjusted Rand index of 0.95, 0.75, 0.88, and 0.9 for the Biase, Goolam, Melanoma cancer and Lung cancer datasets, respectively.
- Published
- 2022
- Full Text
- View/download PDF
35. An automatic affinity propagation clustering based on improved equilibrium optimizer and t-SNE for high-dimensional data.
- Author
-
Duan, Yuxian, Liu, Changyun, Li, Song, Guo, Xiangke, and Yang, Chunlin
- Subjects
-
SWARM intelligence, MACHINE learning, EQUILIBRIUM, DATA distribution, PROBLEM solving, ALGORITHMS, CONSUMER preferences
- Abstract
Automatic clustering and dimension reduction are two of the most intriguing topics in the field of clustering. Affinity propagation (AP) is a representative graph-based clustering algorithm in unsupervised learning. However, extracting features from high-dimensional data and providing satisfactory clustering results is a serious challenge for the AP algorithm. Besides, the clustering performance of the AP algorithm is sensitive to preference. In this paper, an improved affinity propagation based on optimization of preference (APBOP) is proposed for automatic clustering on high-dimensional data. This method is optimized to solve the difficult problem of determining the preference of affinity propagation and the poor clustering effect for non-convex data distribution. First, t-distributed stochastic neighbor embedding is introduced to reduce the dimensionality of the original data to solve the redundancy problem caused by excessively high dimensionality. Second, an improved hybrid equilibrium optimizer based on the crisscross strategy (HEOC) is proposed to optimize preference selection. HEOC introduces the crisscross strategy to enhance local search and convergence efficiency. The benchmark function experiments indicate that the HEOC algorithm has better accuracy and convergence rate than other swarm intelligence algorithms. Simulation experiments on high-dimensional and real-world datasets show that APBOP has better effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
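Editor's note: the pipeline in entry 35 above (dimension reduction followed by affinity propagation whose preference value controls how many exemplars emerge) can be mimicked very roughly with off-the-shelf tools. The sketch below uses scikit-learn's t-SNE and AffinityPropagation and simply scans a few preference values; the improved equilibrium-optimizer search of the paper is not reproduced, and all parameter values and the example dataset are assumptions.

```python
# Hedged sketch: t-SNE for dimension reduction, then affinity propagation where the
# "preference" value drives how many clusters appear (the paper optimizes it; here we scan).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.cluster import AffinityPropagation

X, _ = load_digits(return_X_y=True)
X = X[:500]                                        # small subset to keep the demo quick
Z = TSNE(n_components=2, random_state=0).fit_transform(X)

# negative squared distances, the similarity scale AP uses with euclidean affinity
sims = -np.square(np.linalg.norm(Z[:, None] - Z[None, :], axis=2))
for pref in [np.median(sims), 2 * np.median(sims), 5 * np.median(sims)]:
    ap = AffinityPropagation(preference=pref, damping=0.9, random_state=0).fit(Z)
    print(f"preference={pref:10.1f} -> clusters found: {len(ap.cluster_centers_indices_)}")
```

Lower (more negative) preference values generally yield fewer exemplars, which is the knob the paper's optimizer is tuning.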
36. Automatic Clustering for Unsupervised Risk Diagnosis of Vehicle Driving for Smart Road.
- Author
-
Shi, Xiupeng, Wong, Yiik Diew, Chai, Chen, Li, Michael Zhi-Feng, Chen, Tianyi, and Zeng, Zeng
- Abstract
Early risk diagnosis and driving anomaly detection from vehicle streams are of great benefit in a range of advanced solutions towards Smart Road and crash prevention, although there are intrinsic challenges, especially the lack of ground truth and the definition of multiple risk exposures. This study proposes a domain-specific automatic clustering (termed AutoCluster) to self-learn the optimal models for unsupervised risk assessment, which integrates key steps of clustering into an auto-optimisable pipeline, including feature and algorithm selection and hyperparameter auto-tuning. Firstly, based on surrogate conflict measures, a series of risk indicator features are constructed to represent temporal-spatial and kinematical risk exposures. Then, we develop an unsupervised feature selection method to identify the useful features by elimination-based model reliance importance (EMRI). Secondly, we propose the balanced Silhouette Index (bSI) to evaluate the internal quality of imbalanced clustering. A loss function is designed that considers the clustering performance in terms of internal quality, inter-cluster variation, and model stability. Thirdly, based on Bayesian optimisation, the algorithm auto-selection and hyperparameter auto-tuning are self-learned to generate the best clustering results. Herein, NGSIM vehicle trajectory data is used for test-bedding. Findings show that AutoCluster is reliable and promising for diagnosing multiple distinct risk levels inherent to generalised driving behaviour. We also delve into aspects of risk clustering, such as algorithm heterogeneity, Silhouette analysis, and hierarchical clustering flows. Meanwhile, AutoCluster is also a method for unsupervised data labelling and indicator threshold calibration. Furthermore, AutoCluster is useful for tackling the challenges of imbalanced clustering without ground truth or a priori knowledge. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. Automatic Clustering for Improved Radio Environment Maps in Distributed Applications
- Author
-
Haithem Ben Chikha and Alaa Alaerjan
- Subjects
automatic clustering, edge computing, K-means clustering, Kriging technique, radio environment map, reference signal received power, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Wireless communication greatly contributes to the evolution of new technologies, such as the Internet of Things (IoT) and edge computing. The new generation networks, including 5G and 6G, provide several connectivity advantages for multiple applications, such as smart health systems and smart cities. Adopting wireless communication technologies in these applications is still challenging due to factors such as mobility and heterogeneity. Predicting accurate radio environment maps (REMs) is essential to facilitate connectivity and improve resource utilization. The construction of accurate REMs through the prediction of reference signal received power (RSRP) can be useful in densely distributed applications, such as smart cities. However, predicting an accurate RSRP in the applications can be complex due to intervention and mobility aspects. Given the fact that the propagation environments can be different in a specific area of interest, the estimation of a common path loss exponent for the entire area produces errors in the constructed REM. Hence, it is necessary to use automatic clustering to distinguish between different environments by grouping locations that exhibit similar propagation characteristics. This leads to better prediction of the propagation characteristics of other locations within the same cluster. Therefore, in this work, we propose using the Kriging technique, in conjunction with the automatic clustering approach, in order to improve the accuracy of RSRP prediction. In fact, we adopt K-means clustering (KMC) to enhance the path loss exponent estimation. We use a dataset to test the proposed model using a set of comparative studies. The results showed that the proposed approach provides significant RSRP prediction capabilities for constructing REM, with a gain of about 3.3 dB in terms of root mean square error compared to the case without clustering.
- Published
- 2023
- Full Text
- View/download PDF
38. Automatic Data Clustering Using Hybrid Chaos Game Optimization with Particle Swarm Optimization Algorithm.
- Author
-
Ouertani, Mohamed Wajdi, Manita, Ghaith, and Korbaa, Ouajdi
- Subjects
PARTICLE swarm optimization ,MATHEMATICAL optimization ,IMAGE processing - Abstract
In cluster analysis, classical approaches suffer from the problem of identifying the number of clusters, known as the automatic clustering problem. Automatic clustering has therefore become a popular research area and offers opportunities in various data analysis applications such as bioinformatics, medicine, image processing, and consumer segmentation. It is considered an NP-complete problem, for which approximate approaches are preferable. In this study, we propose a hybrid approach combining chaos game optimization and particle swarm optimization (CGOPSO). The Davies-Bouldin index (DBI) is used as the main objective of the proposed approach, with the purpose of finding the most accurate number of cluster centroids and their positions. To assess its performance, we compared CGOPSO with other existing algorithms in the literature over 12 classical datasets using two validity indexes: the Davies-Bouldin index (DBI) and the Compact-Separated index (CSI). The experimental results demonstrate that CGOPSO performs better than the other algorithms. [ABSTRACT FROM AUTHOR]
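The CGO-PSO hybrid itself is not reproduced here; the short sketch below only shows how the Davies-Bouldin index can act as the clustering objective, using a plain sweep over the number of clusters with K-means as a stand-in for the metaheuristic search of centroid positions.

```python
# Minimal illustration of DBI as a clustering objective: sweep k with
# K-means and keep the partition that minimises DBI (lower is better).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

X, _ = make_blobs(n_samples=500, centers=5, cluster_std=0.8, random_state=3)

best_k, best_dbi = None, float("inf")
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    dbi = davies_bouldin_score(X, labels)   # the objective being minimised
    if dbi < best_dbi:
        best_k, best_dbi = k, dbi

print(f"best k by DBI: {best_k} (DBI={best_dbi:.3f})")
```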
- Published
- 2022
- Full Text
- View/download PDF
39. Leveraging clustering validation index for detecting 'stops' in spatial trajectory data: a semi-automatic approach.
- Author
-
Bandyopadhyay, Mainak
- Subjects
- *
PARTICLE swarm optimization , *SMARTPHONES - Abstract
Spatial trajectory data is increasingly attracting organizations that want to obtain mobility-based activity patterns of smartphone users. One of the basic objectives in this regard is the determination of 'stops', or technically high-density points, in the trajectory data. Most works in this area use variants of density-based clustering algorithms to determine stop points. One of the notable challenges is the determination of the parameters for the clustering algorithm, which strongly affects the accuracy of detecting the 'stops'. In this paper, a semi-automatic approach is proposed based on particle swarm optimization, DBSCAN, and the S_Dbw internal validity index for determining appropriate parameter values for the clustering algorithm with fast convergence. [ABSTRACT FROM AUTHOR]
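A minimal sketch of the same idea follows, assuming synthetic point data: a small hand-written particle swarm tunes DBSCAN's eps and min_samples, scored with an internal validity index. The silhouette score is used in place of S_Dbw (which is not available in scikit-learn), and the swarm settings are arbitrary.

```python
# Tune DBSCAN parameters with a tiny particle swarm; silhouette stands in
# for S_Dbw, and the data is synthetic rather than real trajectories.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=400, centers=3, cluster_std=0.5, random_state=7)

def fitness(params):
    eps, min_samples = max(params[0], 1e-3), int(round(max(params[1], 2)))
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    if n_clusters < 2:
        return -1.0                       # invalid partition
    mask = labels != -1                   # score only non-noise points
    return silhouette_score(X[mask], labels[mask])

rng = np.random.default_rng(0)
pos = rng.uniform([0.1, 2.0], [2.0, 20.0], size=(15, 2))   # particles: (eps, min_samples)
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(20):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("tuned (eps, min_samples):", gbest)
```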
- Published
- 2022
- Full Text
- View/download PDF
40. Unsupervised modelling of rice aroma change during ageing based on electronic nose coupled with bio-inspired algorithms.
- Author
-
Rahimzadeh, Hassan, Sadeghi, Morteza, Mireei, Seyed Ahmad, and Ghasemi-Varnamkhasti, Mahdi
- Subjects
- *
ELECTRONIC noses , *NOSE , *RICE , *BEES algorithm , *SELF-organizing maps , *DIFFERENTIAL evolution , *ALGORITHMS - Abstract
Rice aroma profiles, consisting of different sorts of volatile organic compounds, go through complex alterations during the storage period that affect the end-use quality. The aroma-active compounds of rice are the key elements that determine its quality characteristics and freshness. This study evaluates the efficiency of an electronic nose combined with modern biologically inspired unsupervised algorithms as a fast, real-time, and non-invasive technology for modelling rice aroma change during storage. The developed unsupervised algorithms produced successful results in clustering the aroma change and provided supplementary means for assessing ageing in aromatic and non-aromatic rice samples. The self-organizing map neural network resulted in reliable grouping of the samples, and the resulting unified distance matrices were applied to interpret the pace of the aroma change. Fuzzy c-means was hybridized with differential evolution, and the fuzzifier parameter was used to control the samples' memberships to the storage clusters. The developed fuzzy clustering offered more reasonable solutions when clusters overlapped. The automatic clustering developed with the artificial bee colony algorithm produced an original grouping of the aromatic samples. For the non-aromatic rice, the structures obtained by the automatic approach agreed satisfactorily with the aroma change during ageing. The automatic clustering eliminated the need to specify the number of clusters as well as descriptive labelled samples, and delivered smarter performance. The electronic nose system combined with the developed unsupervised algorithms could be utilized as a reliable and rapid tool for the analysis of rice aroma change. • MOS-based e-nose was applied for unsupervised modelling of rice aroma change. • Self-organizing map and the resulting U-matrix were applied for crisp clustering. • Differential evolution was hybridized with fuzzy c-means for fuzzy clustering. • Artificial bee colony was developed in an automatic clustering framework. • E-nose combined with bio-inspired methods is fast and reliable in analyzing rice aroma. [ABSTRACT FROM AUTHOR]
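Because the fuzzy clustering described above hinges on the fuzzifier m that controls how soft the storage-cluster memberships are, a self-contained fuzzy c-means sketch is given below. It is a generic NumPy implementation on random stand-in data, not the paper's DE-hybridised variant or its e-nose features.

```python
# Generic fuzzy c-means: m > 1 is the fuzzifier; larger m gives softer
# memberships. Data is random and stands in for e-nose sensor features.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # initial membership matrix
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]          # weighted centroids
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (dist ** (2.0 / (m - 1.0)))                    # membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

X = np.random.default_rng(1).normal(size=(200, 6))   # stand-in sensor responses
centers, U = fuzzy_c_means(X, c=3, m=2.0)
print("hard labels of first 5 samples:", U.argmax(axis=1)[:5])
```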
- Published
- 2022
- Full Text
- View/download PDF
41. Automatic Domain Decomposition in Finite Element Method -- A Comparative Study.
- Author
-
Kaveh, Ali, Seddighian, Mohammad Reza, and Hassani, Pouya
- Subjects
- *
FINITE element method , *GRAPH theory , *COMPARATIVE studies - Abstract
In this paper, an automatic data clustering approach is presented using concepts from graph theory. Several cluster validity indices (CVIs) are discussed, and the DB index is defined as the objective function of the meta-heuristic algorithms. Six finite element meshes, comprising simple and complex two- and three-dimensional types, are decomposed. Six meta-heuristic algorithms are utilized to determine the optimal number of clusters and to minimize the objective of the decomposition problem. Finally, the corresponding statistical results are compared. [ABSTRACT FROM AUTHOR]
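Purely as an illustration of the graph-flavoured idea, and not the paper's formulation or its meta-heuristics, the sketch below decomposes a toy 2-D structured mesh by spectral clustering of element centroids on a nearest-neighbour graph and picks the number of subdomains with the DB index.

```python
# Toy mesh decomposition: spectral clustering on a k-NN graph of element
# centroids, with the DB index selecting the number of subdomains.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import davies_bouldin_score

# element centroids of a 20x20 structured mesh
gx, gy = np.meshgrid(np.arange(20) + 0.5, np.arange(20) + 0.5)
centroids = np.column_stack([gx.ravel(), gy.ravel()])

best_k, best_dbi = None, float("inf")
for k in (2, 3, 4, 5, 6):
    labels = SpectralClustering(n_clusters=k, affinity="nearest_neighbors",
                                n_neighbors=8, random_state=0).fit_predict(centroids)
    dbi = davies_bouldin_score(centroids, labels)
    if dbi < best_dbi:
        best_k, best_dbi = k, dbi

print("chosen number of subdomains:", best_k)
```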
- Published
- 2022
- Full Text
- View/download PDF
42. Improved SOSK-Means Automatic Clustering Algorithm with a Three-Part Mutualism Phase and Random Weighted Reflection Coefficient for High-Dimensional Datasets
- Author
-
Abiodun M. Ikotun and Absalom E. Ezugwu
- Subjects
symbiotic organism search ,K-means ,clustering algorithms ,hybrid metaheuristics ,automatic clustering ,outliers ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Automatic clustering problems require clustering algorithms to automatically estimate the number of clusters in a dataset, whereas the classical K-means requires the number of clusters to be specified a priori. To address this problem, metaheuristic algorithms are hybridized with K-means to extend its capacity for handling automatic clustering problems. In this study, we propose an improved version of an existing hybridization of the classical symbiotic organisms search algorithm with the classical K-means algorithm to provide robust and optimal data clustering performance in automatic clustering problems. Moreover, because the classical K-means algorithm is sensitive to noisy data and outliers, we propose excluding outliers from the centroid update procedure, using a global threshold on the point-to-centroid distance distribution for automatic outlier detection and subsequent exclusion when calculating new centroids in the K-means phase. Furthermore, a self-adaptive benefit factor with a three-part mutualism phase is incorporated into the symbiotic organism search phase to enhance the performance of the hybrid algorithm. A population size of 40+2g was used for the symbiotic organism search (SOS) algorithm to obtain a well-distributed initial solution sample, based on the central limit theorem, whereby an appropriate sample size produces a sample mean that approximates the true centroid under a Gaussian distribution. The effectiveness and robustness of the improved hybrid algorithm were evaluated on 42 datasets. The results were compared with the existing hybrid algorithm, the standard SOS and K-means algorithms, and other hybrid and non-hybrid metaheuristic algorithms. Finally, statistical and convergence analysis tests were conducted to measure the effectiveness of the improved algorithm. The results of the extensive computational experiments showed that the proposed improved hybrid algorithm outperformed the existing SOSK-means algorithm and demonstrated superior performance compared to some of the competing hybrid and non-hybrid metaheuristic algorithms.
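The outlier-exclusion step can be sketched compactly. The snippet below runs a plain K-means loop in which points whose distance to their assigned centroid exceeds a global threshold of the point-to-centroid distance distribution are skipped during the centroid update; the mean-plus-two-standard-deviations threshold and the toy data are assumptions for illustration, not the paper's exact rule.

```python
# K-means with outlier-excluded centroid updates, on toy data. The global
# threshold (mean + 2*std of assigned distances) is an illustrative choice.
import numpy as np
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=5)
k = 3
rng = np.random.default_rng(0)
centroids = X[rng.choice(len(X), k, replace=False)]

for _ in range(20):
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    d_assigned = dist[np.arange(len(X)), labels]
    threshold = d_assigned.mean() + 2 * d_assigned.std()   # global outlier threshold
    keep = d_assigned <= threshold                         # exclude likely outliers
    new_centroids = []
    for j in range(k):
        pts = X[keep & (labels == j)]
        new_centroids.append(pts.mean(axis=0) if len(pts) else centroids[j])
    centroids = np.array(new_centroids)

print("final centroids:\n", centroids)
```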
- Published
- 2022
- Full Text
- View/download PDF
43. Optimized data driven fault detection and diagnosis in chemical processes.
- Author
-
Ardali, Nahid Raeisi, Zarghami, Reza, and Gharebagh, Rahmat Sotudeh
- Subjects
- *
CHEMICAL processes , *FAULT diagnosis , *FEATURE selection , *METAHEURISTIC algorithms , *FEATURE extraction - Abstract
• A novel fault diagnosis scheme was proposed based on optimization methods. • Nonstationary and nonlinear multivariate chemical processes were analyzed. • NSGAII was utilized for feature selection, and the t-SNE method was used for feature extraction and visualization. • DBSCAN, k-means, and CURE methods were utilized for the non-automatic unsupervised learning investigation. • GA, ABC, DE, HS, and PSO, in combination with the DB and CS clustering measures, were utilized for the automatic unsupervised learning investigation. • The proposed method performed well for fault detection and diagnosis of chemical processes. Fault detection and diagnosis (FDD) is crucial for ensuring process safety and product quality in the chemical industry. Despite the large amounts of process data recorded and stored in chemical plants, most of the data are not well labelled, and their conditions are not adequately specified. In this study, an optimized data-driven FDD model was developed for a chemical process based on automatic clustering algorithms. Owing to the importance of data preprocessing, feature selection was performed by a non-dominated sorting genetic algorithm (NSGAII) based on k-means clustering, and the optimal subset of features is selected by comparing clustering results for each subset. The performance of the proposed feature selection method was compared with the Fisher discriminant ratio (FDR) and XGBoost methods. The t-distributed stochastic neighbor embedding (t-SNE), Isomap, and KPCA dimension reduction methods were also employed for feature extraction. Finally, automatic clustering was performed based on metaheuristic algorithms for fault detection and diagnosis, and the results were compared with non-automatic clustering methods. The performance of the proposed method was evaluated on the Tennessee Eastman and four water tank processes as case studies. The results showed that the proposed method is reliable and capable of online and offline fault detection and diagnosis in chemical processes. The findings of this study can therefore be used to stabilize the operation of chemical processes. [ABSTRACT FROM AUTHOR]
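The unsupervised core of such an FDD pipeline can be illustrated briefly: project process measurements to a low-dimensional embedding with t-SNE, then cluster the embedding to separate normal operation from a fault pattern. The sketch below uses synthetic data and fixes the number of clusters; the NSGAII feature selection and the metaheuristic automatic clustering of the paper are not reproduced.

```python
# t-SNE embedding of synthetic process data followed by clustering of the
# embedding; a simplified, illustrative stand-in for the paper's pipeline.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(300, 20))    # normal operating condition
fault = rng.normal(3.0, 1.0, size=(100, 20))     # shifted fault condition
X = np.vstack([normal, fault])

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
print("samples per cluster:", np.bincount(labels))
```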
- Published
- 2024
- Full Text
- View/download PDF
44. Multi-level quantum inspired metaheuristics for automatic clustering of hyperspectral images
- Author
-
Dutta, Tulika, Bhattacharyya, Siddhartha, Panigrahi, Bijaya Ketan, Zelinka, Ivan, and Mrsic, Leo
- Published
- 2023
- Full Text
- View/download PDF
45. Clustering approximation via a fusion of multiple random samples.
- Author
-
Mahmud, Mohammad Sultan, Huang, Joshua Zhexue, and García, Salvador
- Subjects
- *
STATISTICAL sampling , *DISTRIBUTED computing , *PARALLEL programming , *BIG data , *GRAPH algorithms , *DATA quality , *FUZZY algorithms - Abstract
In big data clustering exploration, the situation is paradoxical because prior domain knowledge is absent or insufficient. Moreover, clustering a big dataset is a challenging task in a distributed computing framework. To address this, we propose a new distributed clustering approximation framework for big data with quality guarantees. This framework uses multiple disjoint random samples, instead of a single random sample, to compute an ensemble result as an estimate of the true result on the entire big dataset. First, we model a large dataset as a collection of random sample data blocks stored in a distributed file system. Thereafter, a subset of data blocks is randomly selected, and the serial clustering algorithm is executed in parallel on the distributed computing framework to generate the component clustering results. In each selected random sample, the number of clusters and the initial centroids are identified using a density-peak-based I-niceDP clustering algorithm and then refined by a k-means sweep. Since the random samples are disjoint and traditional consensus functions cannot be used, we propose two new methods, a graph-similarity-based method and a naturally inspired firefly-based algorithm, to integrate the component clustering results into the final ensemble result. The entire clustering process is presented with systematic support, extensive clusterability measures, and quality evaluation. The methods are verified in a series of experiments using synthetic and real-world datasets. Our comprehensive experimental results demonstrate that the proposed methods (1) recognize the correct number of clusters by analyzing a subset of samples and (2) exhibit better scalability, efficiency, and clustering stability. • A distributed computing framework to facilitate clustering approximation. • A new clustering approximation method using a fusion of multiple random samples. • A data-adaptive and randomized method for big data clustering. • Graph similarity and an evolutionary firefly-algorithm-based clustering ensemble. [ABSTRACT FROM AUTHOR]
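The sampling-and-fusion idea can be sketched on a single machine as follows: split the data into disjoint random blocks, cluster a subset of blocks independently, and fuse the component results by meta-clustering the pooled centroids. The fixed number of clusters and the simple centroid-level fusion are assumptions; the paper instead estimates the cluster number with I-niceDP and fuses results with graph-similarity and firefly-based consensus functions.

```python
# Disjoint random sample blocks -> per-block K-means -> fuse by meta-
# clustering the pooled centroids. A local, simplified illustration only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=10_000, centers=4, random_state=2)
rng = np.random.default_rng(0)
idx = rng.permutation(len(X))
blocks = np.array_split(idx, 20)                 # 20 disjoint random data blocks

pooled_centroids = []
for b in blocks[:8]:                             # cluster a subset of blocks
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X[b])
    pooled_centroids.append(km.cluster_centers_)
pooled_centroids = np.vstack(pooled_centroids)

# fuse the component results: meta-cluster the pooled centroids
ensemble = KMeans(n_clusters=4, n_init=10, random_state=0).fit(pooled_centroids)
final_labels = ensemble.predict(X)               # assign all points to fused centroids
print("points per ensemble cluster:", np.bincount(final_labels))
```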
- Published
- 2024
- Full Text
- View/download PDF
46. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects.
- Author
-
Ezugwu, Absalom E., Ikotun, Abiodun M., Oyelade, Olaide O., Abualigah, Laith, Agushaka, Jeffery O., Eke, Christopher I., and Akinyelu, Andronicus A.
- Subjects
- *
MACHINE learning , *ALGORITHMS , *ARTIFICIAL intelligence , *DATA mining , *COMPUTER science - Abstract
Clustering is an essential tool in data mining research and applications. It is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artificial intelligence, and machine learning. Several clustering techniques have been proposed and implemented, and most of them successfully find excellent quality or optimal clustering results in the domains mentioned earlier. However, there has been a gradual shift in the choice of clustering methods among domain experts and practitioners alike, which is precipitated by the fact that most traditional clustering algorithms still depend on the number of clusters provided a priori. These conventional clustering algorithms cannot effectively handle real-world data clustering analysis problems where the number of clusters in data objects cannot be easily identified. Also, they cannot effectively manage problems where the optimal number of clusters for a high-dimensional dataset cannot be easily determined. Therefore, there is a need for improved, flexible, and efficient clustering techniques. Recently, a variety of efficient clustering algorithms have been proposed in the literature, and these algorithms produced good results when evaluated on real-world clustering problems. This study presents an up-to-date systematic and comprehensive review of traditional and state-of-the-art clustering techniques for different domains. This survey considers clustering from a more practical perspective. It shows the outstanding role of clustering in various disciplines, such as education, marketing, medicine, biology, and bioinformatics. It also discusses the application of clustering to different fields attracting intensive efforts among the scientific community, such as big data, artificial intelligence, and robotics. This survey paper will be beneficial for both practitioners and researchers. It will serve as a good reference point for researchers and practitioners to design improved and efficient state-of-the-art clustering algorithms. • Provide an up-to-date comprehensive review of the different clustering techniques. • Highlight novel and most recent practical applications areas of clustering. • Provide a convenient research path for new researchers. • Help experts develop new algorithms for emerging challenges in the research area. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF