34 results on "Medoid"
Search Results
2. Dissimilarity to Class Medoids as Features for 3D Point Cloud Classification
- Author
-
Hind Bril El-Haouzi, Sylvain Chabanet, Valentin Chazelle, Philippe Thomas, Centre de Recherche en Automatique de Nancy (CRAN), and Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)
- Subjects
Learning classifier system ,Computer science ,business.industry ,05 social sciences ,Point cloud ,Pattern recognition ,02 engineering and technology ,15. Life on land ,Medoids ,Sawmill simulation ,Class (biology) ,Medoid ,Metamodeling ,Set (abstract data type) ,[SPI]Engineering Sciences [physics] ,Artificial Intelligence ,Iterative Closest Point dissimilarity ,Similarity Discriminant Analysis ,0502 economics and business ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Classifier (UML) ,050203 business & management - Abstract
Several sawmill simulators exist in the forest-product industry. They simulate the sawing of a log to generate the set of lumber pieces that would be obtained by transforming that log at a sawmill. In particular, such simulators can use a 3D scan of the exterior shape of a log as input for the simulation. However, they can be computationally intensive. Several authors have therefore proposed using Artificial Intelligence metamodels, which, in general, can make predictions extremely fast once trained. Such models can approximate the results of a simulator using a vector of descriptive features representing a log or, alternatively, the full 3D log scans. This paper proposes using dissimilarity to representative log scans as features to train a Machine Learning classifier. The concept of class medoids as representative elements of a class is presented, and Similarity Discriminant Analysis is chosen as a good candidate ML classifier. This classifier is compared with two other models studied by the authors.
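The idea of using dissimilarity to class medoids as features can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, toy 2-D points stand in for log scans, and Euclidean distance stands in for the Iterative Closest Point dissimilarity listed in the subjects.

```python
from math import dist  # Euclidean distance, Python 3.8+


def medoid(items, dissimilarity):
    """Return the item with the smallest total dissimilarity to all items."""
    return min(items, key=lambda c: sum(dissimilarity(c, x) for x in items))


def class_medoids(samples, labels, dissimilarity):
    """Map each class label to the medoid of the samples carrying that label."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return {label: medoid(group, dissimilarity) for label, group in by_class.items()}


def medoid_features(sample, medoids, dissimilarity):
    """Feature vector: dissimilarity of `sample` to each class medoid (sorted label order)."""
    return [dissimilarity(sample, medoids[label]) for label in sorted(medoids)]


# Toy data (assumption): two well-separated classes of 2-D points.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
labels = ["a", "a", "a", "b", "b", "b"]
medoids = class_medoids(samples, labels, dist)
features = medoid_features((1.0, 1.0), medoids, dist)
```

The resulting feature vector has one entry per class, so any standard classifier can be trained on it regardless of how complex the original objects are.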
- Published
- 2021
3. Data-Driven Dynamics Description of a Transitional Boundary Layer
- Author
-
Firoozeh Foroozan, Andrea Ianiro, Stefano Discetti, Vanesa Guerrero, Comunidad de Madrid, Universidad Carlos III de Madrid, and European Commission
- Subjects
Clustering algorithms ,Turbulence ,Plane (geometry) ,Markov processes ,Feature vector ,Markov process ,Laminar flow ,Atmospheric thermodynamics ,Medoid ,Aeronáutica ,symbols.namesake ,Boundary layer ,symbols ,Boundary layers ,Statistical physics ,Cluster analysis ,Mathematics - Abstract
Cluster analysis is applied to a DNS dataset of a transitional boundary layer developing over a flat plate. The streamwise-spanwise plane at a wall-normal distance close to the wall is sampled at several time instants and discretized into small sub-regions, which are the observations analysed in this work. Using the K-medoids clustering algorithm, a partition of the observations is sought such that the medoid of each cluster represents one of the main local states. The clustering has been carried out on a two-dimensional reduced-order feature space, constructed with the multi-dimensional scaling technique. The clustered feature space provides a partitioning which consists of five different regions. The observations are automatically classified as laminar, turbulent spots, amplification of disturbances, or fully-developed turbulence. The Lagrangian evolution of the regions and the state transitions are described as a Markov process, in terms of a transition probability matrix and a transition trajectory graph, to determine the transition dynamics between different states. PITUFLOW-CM-UC3M, funded by the call "Programa de apoyo a la realización de proyectos interdisciplinares de I+D para jóvenes investigadores de la Universidad Carlos III de Madrid 2019-2020" under the frame of the Convenio Plurianual Comunidad de Madrid-Universidad Carlos III de Madrid. COTURB, funded by the European Research Council, under grant ERC-2014-AdG-669505.
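A transition probability matrix of the kind mentioned in this abstract can be estimated from a sequence of cluster labels by counting consecutive pairs and normalising each row. The sketch below is illustrative only: the function name and the toy label sequence are assumptions, and the abstract does not describe how the authors build their matrix.

```python
def transition_matrix(states, n_states):
    """Row-normalised counts of transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical sequence of cluster labels for one sub-region followed in time.
sequence = [0, 0, 1, 1, 1, 2, 2, 0]
P = transition_matrix(sequence, 3)
```

Entry `P[i][j]` estimates the probability of moving from state `i` to state `j` in one time step.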
- Published
- 2021
4. Outlier Detection in Multivariate Time Series Data Using a Fusion of K-Medoid, Standardized Euclidean Distance and Z-Score
- Author
-
Sulaimon A. Bashir, Nwodo Benita Chikodili, Mohammed D. Abdulmalik, and Opeyemi Aderiike Abisoye
- Subjects
Multivariate statistics ,City block ,business.industry ,Computer science ,Big data ,020206 networking & telecommunications ,Pattern recognition ,02 engineering and technology ,Medoid ,Euclidean distance ,Outlier ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,Artificial intelligence ,Time series ,business - Abstract
Data mining techniques have been used to extract potentially useful knowledge from big data. However, data mining sometimes produces incorrect results, which can be due to the presence of outliers in the analyzed data. The literature has identified that detecting these outliers can enhance the quality of a dataset. An important type of data that requires outlier detection for accurate prediction and enhanced decision making is time series data. Time series data are valuable because they help in understanding past behavior, which is useful for future predictions; hence, it is important to detect the presence of outliers in time series datasets. This paper proposes an algorithm for outlier detection in Multivariate Time Series (MTS) data based on a fusion of K-medoid, Standardized Euclidean Distance (SED), and Z-score. Apart from SED, experiments were also performed with two other distance metrics, City Block and Euclidean Distance. Z-score performance was compared with that of the inter-quartile method. The results showed that the Z-score technique produced a better outlier detection result, with an F-measure of 0.9978 compared with 0.8571 for inter-quartile. Furthermore, SED performed better than City Block and Euclidean Distance when combined with either Z-score or inter-quartile.
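One way to combine a medoid, a standardized Euclidean distance, and a Z-score is sketched below. It is a simplified reading of the abstract, not the published algorithm: it uses a single cluster (k = 1), a threshold of 2.0 chosen for illustration, and hypothetical function names.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance


def standardized_euclidean(x, y, variances):
    """Euclidean distance with each squared difference divided by that feature's variance."""
    return sqrt(sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances)))


def zscore_outliers(points, threshold=2.0):
    """Indices of points whose distance to the medoid has a z-score above `threshold`."""
    variances = [pvariance(column) for column in zip(*points)]  # assumes no constant feature

    def d(p, q):
        return standardized_euclidean(p, q, variances)

    medoid = min(points, key=lambda c: sum(d(c, p) for p in points))
    distances = [d(p, medoid) for p in points]
    mu, sigma = mean(distances), pstdev(distances)
    if sigma == 0:
        return []
    return [i for i, value in enumerate(distances) if (value - mu) / sigma > threshold]


# Toy observations (assumption): eight similar points and one obvious outlier.
observations = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0),
                (0.8, 0.9), (1.1, 1.1), (0.9, 0.8), (8.0, 9.0)]
flagged = zscore_outliers(observations)
```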
- Published
- 2021
5. Ball K-Medoids: Faster and Exacter
- Author
-
Yuanyuan Huang, Jinquan Zhang, Qiao Peng, Haozhe Tang, Boyi Yao, and Shibin Zhang
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,Data point ,k-medoids ,Computer science ,Computation ,Outlier ,Cluster (physics) ,Ball (mathematics) ,Cluster analysis ,Algorithm ,Medoid - Abstract
Cluster analysis can be viewed as a natural response to the vast amount of data generated in daily life, and it can discover hidden feature information that contributes to analysis. The k-means algorithm is one of the most widely used data clustering methods in a variety of real-world applications thanks to its simplicity. However, k-means is sensitive to noise and outlier data points, because a small number of such points can substantially influence the mean value of a cluster. In light of this, the k-medoids algorithm selects as the new center the point that minimizes the sum of the dissimilarities in the cluster, which diminishes this sensitivity to outliers. Nevertheless, k-medoids algorithms are limited by their computational cost and do not handle data efficiently. To this end, we present a novel k-medoids algorithm, called ball k-medoids, motivated by the theory of ball clusters, the relationships between clusters, and cluster partitioning. It assigns samples to their nearest medoids efficiently and significantly reduces the number of sample-medoid distance calculations. Moreover, a threshold is inferred by the rollback method to reduce the computation of medoid-medoid distances and accelerate clustering. Experiments demonstrate that ball k-medoids is more efficient than other k-medoids algorithms and more accurate than k-means.
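The sensitivity argument in this abstract is easy to see on a toy cluster. The sketch below (hypothetical helper name, made-up 1-D data) compares the mean with the medoid when one outlier is present; it does not implement the ball k-medoids acceleration itself.

```python
from statistics import mean


def medoid(values):
    """The member of `values` with the smallest summed absolute difference to all members."""
    return min(values, key=lambda c: sum(abs(c - v) for v in values))


cluster = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
center_mean = mean(cluster)      # pulled far towards the outlier
center_medoid = medoid(cluster)  # stays inside the bulk of the data
```

Here the mean is 22.0, a value far from every regular point, while the medoid is 3.0, an actual member of the cluster.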
- Published
- 2021
6. NP-Hardness of 1-Mean and 1-Medoid 2-Clustering Problem with Arbitrary Clusters Sizes
- Author
-
Artem V. Pyatkin
- Subjects
Combinatorics ,Set (abstract data type) ,Euclidean space ,Cluster (physics) ,Centroid ,Partition (number theory) ,Center (group theory) ,Cluster analysis ,Medoid ,Mathematics - Abstract
We consider the following 2-clustering problem. Given n points in Euclidean space, partition them into two subsets (clusters) so that the sum of squared distances between the elements of the clusters and their centers is minimized. The center of the first cluster coincides with its centroid (mean), while the center of the second cluster must be chosen from the set of the initial points (medoid). It is known that this problem is NP-hard if the cardinalities of the clusters are given as part of the input. In this paper, we prove that the problem remains NP-hard in the case of arbitrary cluster sizes.
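The objective described in this abstract can be written out as below. The notation is an assumption introduced here for illustration (X for the input point set, C_1 and C_2 for the two clusters, m for the medoid, and \bar{x}(C_1) for the centroid); the abstract itself gives no symbols.

```latex
\[
\min_{\substack{C_1 \cup C_2 = X,\ C_1 \cap C_2 = \emptyset \\ m \in X}}
\;\sum_{x \in C_1} \lVert x - \bar{x}(C_1) \rVert^2
+ \sum_{x \in C_2} \lVert x - m \rVert^2,
\qquad
\bar{x}(C_1) = \frac{1}{|C_1|} \sum_{x \in C_1} x .
\]
```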
- Published
- 2021
7. Incorporating Historical Data and Past Analyses for Improved Tensile Property Prediction of 9% Cr Steel
- Author
-
Ram Devanathan, Kelly Rose, Osman Mamun, Jeffrey A. Hawk, and Madison Wenzlick
- Subjects
Creative visualization ,Computer science ,Property (programming) ,media_common.quotation_subject ,computer.software_genre ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,Principal component analysis ,Data analysis ,Domain knowledge ,Data mining ,Cluster analysis ,computer ,Curse of dimensionality ,media_common - Abstract
Data-driven analytical clustering and visualization techniques were applied to the dataset of 9% Cr experimental alloy data generated through the eXtremeMAT project. Techniques and results were compared with the resulting clusters obtained through similar analytical techniques on previous and reduced versions of the dataset. The principal components were generated in order to reduce the dimensionality of the complex dataset and to visualize the underlying trends in the data. Partitioning around medoids was performed on the resulting principal components to determine relevant clusters. Domain knowledge labels were further applied to the principal components to compare the labels with the trends identified through the clustering methods. The clusters can be used to compare the tensile properties of the alloys and to reduce the variation in the dataset.
- Published
- 2021
8. K Means Algorithm
- Author
-
Taeho Jo
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,Computer science ,Expectation–maximization algorithm ,k-means clustering ,Process (computing) ,Mean vector ,Cluster analysis ,Algorithm ,Fuzzy k means ,Medoid ,k-nearest neighbors algorithm - Abstract
This chapter is concerned with the k means algorithm as the most popular clustering algorithm. It begins with the unsupervised version of the KNN algorithm. With respect to the clustering process, we study the two main versions of the k means algorithm: the crisp k means algorithm and the fuzzy k means algorithm. The k medoid algorithm is mentioned as a variant of the k means algorithm, with a focus on strategies for selecting representative items. Note that the k means algorithm is the simplest version of the EM algorithm, which is covered in the next chapter.
- Published
- 2020
9. Analyzing and Enhancing Processing Speed of K-Medoid Algorithm Using Efficient Large Scale Processing Frameworks
- Author
-
Vijay Kumar Dwivedi, Ayshwarya Jaiswal, and Om. Prakash Yadav
- Subjects
Scale (ratio) ,Computer science ,Outlier ,Spark (mathematics) ,0207 environmental engineering ,0202 electrical engineering, electronic engineering, information engineering ,020207 software engineering ,02 engineering and technology ,020701 environmental engineering ,Algorithm ,Medoid - Abstract
The K-medoid algorithm has recently become a highly active and widely discussed topic. It is more robust than k-means and less sensitive to outliers, but it has drawbacks of its own: the number of medoids must be given in advance, which is hard to determine, and the initial k cluster centers must be chosen at random.
- Published
- 2020
10. Anomaly Detection Using Modified Differential Evolution: An Application to Banking and Insurance
- Author
-
Vadlamani Ravi and Gutha Jaya Krishna
- Subjects
Local outlier factor ,Computer science ,020101 civil engineering ,02 engineering and technology ,computer.software_genre ,Measure (mathematics) ,Medoid ,0201 civil engineering ,Constraint (information theory) ,Credit card ,ComputingMethodologies_PATTERNRECOGNITION ,Differential evolution ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,Data mining ,computer ,Subspace topology - Abstract
We propose two Modified Differential Evolution driven, subspace-based optimization models for anomaly detection, applied to customer credit card churn detection, automobile insurance fraud detection, and customer credit card default detection. The sparsity coefficient is chosen as the objective function for discovering anomalies. We also employ an external performance measure, namely precision multiplied by recall, as a selection constraint at every iteration after a pre-specified iteration count. In terms of precision and Area Under the ROC Curve (AUC), the proposed technique outperformed several baseline algorithms for anomaly detection, for example Local Outlier Factor, Angle-based Outlier Detection, K-means, and Partition Around Medoids, as well as the proposed model without the external performance measure, indicating that the proposed method is a viable alternative for anomaly detection.
- Published
- 2020
11. An Efficient Performance of Enhanced Bellman-Ford Algorithm in Wireless Sensor Network Using K-Medoid Clustering
- Author
-
Laxmi Shrivastava, Garima Sharma, and Praveen Kumar
- Subjects
Computer science ,Sensor node ,Shortest path problem ,Real-time computing ,Energy consumption ,Cluster analysis ,Dijkstra's algorithm ,Wireless sensor network ,Medoid ,MathematicsofComputing_DISCRETEMATHEMATICS ,Bellman–Ford algorithm - Abstract
A wireless sensor network (WSN) is a collection of smart sensor nodes with strictly limited control, computation ability, storage, and communication facilities. WSNs are among the most common services used in commercial and industrial applications such as military surveillance, animal monitoring, target tracking, forest fire detection, and industrial security. After being deployed manually, the sensor nodes (SNs) automatically construct a network connected to the sink node. Each SN is responsible for monitoring its surrounding environment and delivers its data to the sink node in a one-hop or multihop manner. The sink node transmits the collected data to a remote server through satellites or the internet. Hence, an energy optimization technique is used to reduce the actual power consumption of the SNs in place of the sink node. Here, the Bellman-Ford shortest path algorithm is used for efficient data transmission, which helps reduce the energy consumption of the sensor nodes. In this paper, the K-medoid clustering algorithm is used for cluster formation; it chooses as cluster head (CH) the sensor node that lies at the center of the cluster. MATLAB is used to simulate the Bellman-Ford shortest path algorithm. The simulated results show that the Bellman-Ford shortest path algorithm is better than the K-medoid algorithm in terms of energy consumption and network lifetime.
- Published
- 2020
12. Gender Prediction from Classified Indoor Customer Paths by Fuzzy C-Medoids Clustering
- Author
-
Onur Dogan and Basar Oztaysi
- Subjects
Fuzzy classification ,Data collection ,Computer science ,Fuzzy set ,Path (graph theory) ,Data mining ,Cluster analysis ,Levenshtein distance ,computer.software_genre ,Fuzzy logic ,computer ,Medoid - Abstract
Customer-oriented systems provide advantages to companies in a competitive environment. Understanding customers is fundamental to presenting individualized offers. Gender, one item of customer demographic information, usually cannot be obtained by data collection technologies. Therefore, various techniques have been developed to predict the unknown gender of customers. In this study, customer gender is predicted from customers' paths in a shopping mall using fuzzy set theory. A fuzzy classification method based on Levenshtein distance is developed for the string data that represent indoor customer paths. Although there are several ways to predict gender, no study has focused on path-based gender classification. The originality of the study lies in classifying customer data into gender classes using indoor paths.
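Levenshtein distance, the dissimilarity this abstract applies to path strings, counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is the standard dynamic-programming formulation; encoding each visited zone as one letter is an assumption made here for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning string `a` into string `b`."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical paths: each letter is a zone visited in order.
path_distance = levenshtein("ABCD", "ABD")
```

Because the distance is defined on strings rather than vectors, it pairs naturally with medoid-based clustering, which only needs pairwise dissimilarities.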
- Published
- 2019
13. Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
- Author
-
Peter J. Rousseeuw and Erich Schubert
- Subjects
ComputingMethodologies_PATTERNRECOGNITION ,k-medoids ,Computer science ,020204 information systems ,K medoids clustering ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,02 engineering and technology ,Cluster analysis ,Algorithm ,Medoid ,Hierarchical clustering - Abstract
Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is the popular algorithm Partitioning Around Medoids (PAM), also simply referred to as k-medoids.
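For readers unfamiliar with PAM, a naive swap search is sketched below. It is a deliberately simple illustration with hypothetical names: it starts from the first k points instead of PAM's BUILD phase, accepts the first improving swap, and contains none of the speed-ups this paper is about.

```python
from itertools import product


def total_cost(points, medoids, d):
    """Sum over all points of the dissimilarity to the nearest medoid."""
    return sum(min(d(p, m) for m in medoids) for p in points)


def naive_pam(points, k, d):
    """Swap medoids with non-medoids until no single swap lowers the total cost."""
    medoids = list(points[:k])  # naive initialisation (assumption)
    best = total_cost(points, medoids, d)
    improved = True
    while improved:
        improved = False
        for i, candidate in product(range(k), points):
            if candidate in medoids:
                continue
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            cost = total_cost(points, trial, d)
            if cost < best:
                medoids, best, improved = trial, cost, True
    return medoids, best


# Toy 1-D data with two obvious groups; any dissimilarity function can be passed in.
data = [1, 2, 3, 10, 11, 12]
found_medoids, found_cost = naive_pam(data, 2, lambda a, b: abs(a - b))
```

Only the dissimilarity function `d` touches the data, which is why this family of algorithms applies to non-Euclidean inputs.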
- Published
- 2019
14. A Fuzzy Clustering Algorithm with Multi-medoids for Multi-view Relational Data
- Author
-
Eduardo C. Simões and Francisco de A. T. de Carvalho
- Subjects
0301 basic medicine ,Fuzzy clustering ,Computer science ,Relational database ,media_common.quotation_subject ,02 engineering and technology ,Medoid ,Data set ,03 medical and health sciences ,030104 developmental biology ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Quality (business) ,Cluster analysis ,Algorithm ,media_common - Abstract
There is increasing interest in multi-view clustering due to its ability to manage data from several sources. The majority of multi-view clustering algorithms are suited to analysing vector data, but much less attention has been given to the analysis of relational data. This paper provides a fuzzy clustering algorithm with multi-medoids for multi-view relational data (MFMMdd). Experiments with real multi-view data sets show the good performance of MFMMdd in comparison with previous multi-view clustering algorithms for relational data, with respect to the quality of the partitions these algorithms provide.
- Published
- 2019
15. Clustering Approach for Data Lake Based on Medoid’s Ranking Strategy
- Author
-
Aicha Mokhtari, Farid Benhammadi, Omar Boussaid, and Redha Benaissa
- Subjects
Speedup ,Computer science ,Stability (learning theory) ,Centroid ,02 engineering and technology ,computer.software_genre ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,Ranking ,020204 information systems ,Spark (mathematics) ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,Cluster analysis ,computer - Abstract
A number of conventional clustering algorithms suffer from poor scalability, especially for data lakes. Thus, many modified clustering algorithms have been proposed to speed up these conventional algorithms by employing data sampling techniques. However, these representations require the number of clusters in order to proceed to centroid selection for final data clustering. To address this limitation, this paper develops a two-phase clustering-based methodology. In the first phase, rather than attempting to construct a random sample, we define a novel approach that computes plausible sample points and uses them as centroids for the final clusters. To speed up our clustering algorithm in the second phase, we propose a parallelization scheme in conjunction with a Spark parallel processing implementation. Computational experiments reveal that the Global sampling method is more effective in terms of both quality and stability than the popular K-means algorithm for the same parameter settings.
- Published
- 2018
16. Medoid-Shift for Noise Removal to Improve Clustering
- Author
-
Pasi Fränti and Jiawei Yang
- Subjects
business.industry ,Computer science ,020208 electrical & electronic engineering ,Process (computing) ,Pattern recognition ,02 engineering and technology ,Medoid ,Noise ,Iterated function ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Point (geometry) ,Anomaly detection ,Artificial intelligence ,business ,Cluster analysis ,Noise removal ,Computer Science::Databases - Abstract
We propose to use medoid-shift to reduce the noise in data prior to clustering. The method processes every point by calculating its k-nearest neighbors (k-NN), and then replacing the point by the medoid of its neighborhood. The process can be iterated. After the data cleaning process, any clustering algorithm can be applied that is suitable for the data.
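The procedure described in this abstract can be sketched in a few lines. The version below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, Euclidean distance is used, and each point is counted as part of its own neighbourhood.

```python
from math import dist  # Euclidean distance, Python 3.8+


def medoid(items):
    """The item with the smallest total distance to all items."""
    return min(items, key=lambda c: sum(dist(c, x) for x in items))


def medoid_shift(points, k, iterations=1):
    """Replace every point by the medoid of its k nearest neighbours (plus itself)."""
    current = list(points)
    for _ in range(iterations):
        shifted = []
        for p in current:
            # The point itself is at distance 0, so it is always among the first k + 1.
            neighbourhood = sorted(current, key=lambda q: dist(p, q))[:k + 1]
            shifted.append(medoid(neighbourhood))
        current = shifted
    return current


# Toy data (assumption): four clustered points and one noise point at (5, 5).
noisy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
cleaned = medoid_shift(noisy, k=3)
```

In this toy run the noise point is replaced by (1.0, 1.0), the medoid of its neighbourhood, so every output point is one of the original data points.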
- Published
- 2018
17. An Improved Ranked K-medoids Clustering Algorithm Based on a P System
- Author
-
Laisheng Xiang, Xiyu Liu, and Bao Zhang
- Subjects
0209 industrial biotechnology ,Computer science ,Computation ,K medoids clustering ,02 engineering and technology ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,020901 industrial engineering & automation ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Cluster analysis ,Membrane computing ,Algorithm ,Time complexity ,P system - Abstract
In this paper, an improved ranked K-medoids algorithm implemented by a specific cell-like P system is proposed, which extends the application of membrane computing. First, we use the maximum distance method to choose the initial clustering medoids; this method is based on the fact that initial medoids that are farthest apart are the least likely to be assigned to the same cluster. Then, we realize this algorithm with a specific P system. A P system is well suited to clustering problems because of its high parallelism and lower computational time complexity. Through computation of the designed system, one possible clustering result is obtained in a non-deterministic and maximally parallel way. Verification on examples shows that our algorithm can improve the quality of clustering.
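The abstract does not spell out its maximum distance method. One common reading is a farthest-first traversal, sketched below with hypothetical names; starting from the first point is an arbitrary choice made here, and the P system realisation is not modelled.

```python
from math import dist  # Euclidean distance, Python 3.8+


def farthest_first(points, k, d=dist):
    """Pick k initial medoids, each as far as possible from those already chosen."""
    chosen = [points[0]]  # arbitrary starting point (assumption)
    while len(chosen) < k:
        # Take the point whose distance to its nearest chosen medoid is largest.
        chosen.append(max(points, key=lambda p: min(d(p, c) for c in chosen)))
    return chosen


candidates = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0), (5.0, 8.0)]
initial_medoids = farthest_first(candidates, 3)
```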
- Published
- 2018
18. A Comparison of Knee Strategies for Hierarchical Spatial Clustering
- Author
-
Brian J. Ross
- Subjects
musculoskeletal diseases ,0301 basic medicine ,Computer science ,business.industry ,Dendrogram ,Centroid ,Pattern recognition ,02 engineering and technology ,musculoskeletal system ,Medoid ,Hierarchical clustering ,03 medical and health sciences ,030104 developmental biology ,0202 electrical engineering, electronic engineering, information engineering ,Spatial clustering ,Cluster (physics) ,020201 artificial intelligence & image processing ,Artificial intelligence ,F1 score ,business ,human activities ,Spatial analysis - Abstract
A comparative study of the performance of knee detection approaches for the hierarchical clustering of 2D spatial data is undertaken. Knee detection is usually performed on the dendrogram generated during cluster generation. For many problems, the knee is a natural indication of the ideal or optimal number of clusters for the given problem. This research compares the performance of various knee strategies on different spatial datasets. Two hierarchical clustering algorithms, single linkage and group average, are considered. Besides determining knees using conventional cluster distances, we also explore alternative metrics such as average global medoid and centroid distances, and F score metrics. Results show that knee determination is difficult and problem dependent.
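As a concrete example of what a knee strategy does, the sketch below picks the number of clusters at the largest jump between consecutive merge distances of an agglomerative clustering. This is one simple heuristic chosen for illustration, with a hypothetical function name; it is not claimed to be any of the strategies compared in the paper.

```python
def clusters_at_largest_gap(merge_distances):
    """Number of clusters left just before the largest jump in merge distance.

    `merge_distances[i]` is the distance at which the (i + 1)-th merge happens,
    in the order the merges are performed, for a dataset of len + 1 points.
    """
    n_points = len(merge_distances) + 1
    gaps = [later - earlier for earlier, later in zip(merge_distances, merge_distances[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)  # last merge before the jump
    return n_points - (i + 1)


# Hypothetical merge distances for 7 points: cheap merges, then a sharp rise.
knee_clusters = clusters_at_largest_gap([1.0, 1.1, 1.2, 1.3, 6.0, 9.0])
```

With these distances the largest jump follows the fourth merge, which leaves three clusters.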
- Published
- 2018
19. Fast Extraction Method of Functional Clusters from Large-Scale Spatial Networks Based on Transfer Learning
- Author
-
Kazumi Saito, Takayasu Fushimi, Kazuhiro Kazama, and Tetsuo Ikeda
- Subjects
Set (abstract data type) ,Computer science ,business.industry ,Node (networking) ,Cluster (physics) ,Scale (descriptive set theory) ,Pattern recognition ,Artificial intelligence ,Voronoi diagram ,Cluster analysis ,business ,Random walk ,Medoid - Abstract
In this paper, we treat the road network of each city as a network and attempt to accelerate the extraction of functional clusters, that is, areas that perform similar functions in the road network. As a method of extracting groups of nodes with similar functions from a network, we have previously proposed the Functional Cluster Extraction method. In this method, high-dimensional vectors based on random walks are clustered by the greedy solution of the K-medoids method, and K functional clusters are extracted. However, it is difficult to hold a similarity matrix of all node pairs for a network with a large number of nodes, such as a road network. On the other hand, it has been discovered that road networks have similar structures even when the areas differ. In this paper, we propose a fast clustering method that extracts approximate medoids from the target network using the medoid set of networks already clustered, and then executes a Voronoi tessellation based on them. Using actual road networks, we evaluate the proposed method in terms of the correct answer rate (accuracy) and the calculation speed of the approximate solution.
- Published
- 2017
20. An Obscure Method for Clustering in Android Using k-Medoid and Apriori Algorithm
- Author
-
Syed Zishan Ali, Sriparna Banerjee, Amar Lalwani, and Manisha Mouly Kindo
- Subjects
Apriori algorithm ,Computer science ,SUBCLU ,Data mining ,Android (operating system) ,Cluster analysis ,computer.software_genre ,computer ,Medoid ,FSA-Red Algorithm - Abstract
In today's scenario, every field is evolving quickly and contains large amounts of distinct kinds of information. To differentiate one sample of data from another, data mining techniques are combined with other useful algorithms. Android development is one of the major arenas where there is a tremendous need to execute these calculations. Combining frequent pattern calculation with clustering is highly effective for Android. In this paper, the work is done in two stages: the first stage concentrates on the generation of clusters, and the final stage deals with finding the frequent patterns.
- Published
- 2017
21. Improving the Efficiency of the K-medoids Clustering Algorithm by Getting Initial Medoids
- Author
-
Joaquin Perez-Ortega, Moisés González-Gárcia, Nelva Nely Almanza-Ortega, Adriana Mexicano, Socorro Saenz-Sanchez, Jessica Adams-López, and J. M. Rodríguez-Lelis
- Subjects
k-medoids ,Computer science ,k-means clustering ,Centroid ,02 engineering and technology ,01 natural sciences ,Hybrid algorithm ,Medoid ,Set (abstract data type) ,010104 statistics & probability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Sensitivity (control systems) ,0101 mathematics ,Cluster analysis ,Algorithm - Abstract
The conventional K-medoids algorithm is one of the most widely used clustering algorithms; however, one of its limitations is its sensitivity to the initial medoids. This work proposes generating optimized initial medoids, which increases the efficiency and effectiveness of K-medoids. The initial medoids are obtained in two steps. In the first step, the data are grouped with an efficient variant of the K-means algorithm called Early Classification. In the second step, the centroids generated by K-means are transformed into optimized initial medoids. The proposed approach was validated on a set of real data sets and compared with the solution of the K-medoids algorithm. Based on the results, our approach reduced the time by an average of 68%. The quality of the results was compared using several well-known validation indexes, and the values were very similar.
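The abstract does not say how centroids are transformed into medoids. A plausible minimal version, assumed here purely for illustration, is to snap each centroid to its nearest data point; the function name and the toy values are hypothetical, and the centroids are taken as given instead of being computed by K-means.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroids_to_medoids(points, centroids):
    """Replace each centroid by the data point closest to it."""
    return [min(points, key=lambda p: dist(p, c)) for c in centroids]


data_points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
kmeans_centroids = [(1.2, 0.0), (10.5, 10.0)]  # assumed output of a prior K-means run
start_medoids = centroids_to_medoids(data_points, kmeans_centroids)
```

Starting K-medoids from points that already sit near cluster centres is what would let it converge in fewer iterations than a random start.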
- Published
- 2017
22. Accelerating Greedy K-Medoids Clustering Algorithm with \(L_1\) Distance by Pivot Generation
- Author
-
Kazumi Saito, Tetsuo Ikeda, Kazuhiro Kazama, and Takayasu Fushimi
- Subjects
Computer science ,02 engineering and technology ,Object (computer science) ,Real image ,Medoid ,Euclidean distance ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Pruning (decision trees) ,Lazy evaluation ,Cluster analysis ,Algorithm ,Selection (genetic algorithm) - Abstract
With the explosive increase of multimedia objects represented as high-dimensional vectors, clustering techniques for these objects have received much attention in recent years. However, clustering methods usually require a large amount of computational cost when calculating the distances between these objects. In this paper, for accelerating the greedy K-medoids clustering algorithm with \(L_1\) distance, we propose a new method consisting of the fast first medoid selection, lazy evaluation, and pivot pruning techniques, where the efficiency of the pivot construction is enhanced by our new pivot generation method called PGM2. In our experiments using real image datasets where each object is represented as a high-dimensional vector and \(L_1\) distance is recommended as their dissimilarity, we show that our proposed method achieved a reasonably high acceleration performance.
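The baseline that this paper accelerates, greedy K-medoids with \(L_1\) distance, can be sketched as follows. The code is a plain, unaccelerated illustration with hypothetical names; it includes none of the fast first medoid selection, lazy evaluation, or pivot pruning techniques named in the abstract.

```python
def l1(x, y):
    """L1 (city block) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def greedy_k_medoids(points, k):
    """Add medoids one at a time, each time taking the point that lowers total cost most."""
    medoids = []
    nearest = [float("inf")] * len(points)  # distance of each point to its closest medoid
    for _ in range(k):
        best_candidate, best_cost = None, float("inf")
        for candidate in points:
            if candidate in medoids:
                continue
            cost = sum(min(n, l1(p, candidate)) for p, n in zip(points, nearest))
            if cost < best_cost:
                best_candidate, best_cost = candidate, cost
        medoids.append(best_candidate)
        nearest = [min(n, l1(p, best_candidate)) for p, n in zip(points, nearest)]
    return medoids


vectors = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
selected = greedy_k_medoids(vectors, 2)
```

Every round evaluates every candidate against every point, which is the quadratic cost that motivates pruning and lazy evaluation on large high-dimensional datasets.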
- Published
- 2017
23. Multivariate Time Series Clustering Analysis for Human Balance Data
- Author
-
Daphne Teck Ching Lai and Owais Ahmed Malik
- Subjects
Dynamic time warping ,Multivariate statistics ,Multivariate analysis ,Computer science ,business.industry ,Pattern recognition ,Balance test ,01 natural sciences ,Medoid ,010104 statistics & probability ,03 medical and health sciences ,0302 clinical medicine ,Center of pressure (terrestrial locomotion) ,Statistics ,Force platform ,Artificial intelligence ,0101 mathematics ,Cluster analysis ,business ,030217 neurology & neurosurgery - Abstract
The evaluation of human balance control patterns is an important tool for identifying underlying disorders in the postural control system of individuals and taking appropriate actions if required. This study presents the use of multivariate time-series clustering techniques for analyzing human balance patterns based on force platform data. Different multivariate time-series clustering techniques, including partitioning clustering with the Dynamic Time Warping (DTW) measure, Permutation Distribution Clustering (PDC), and k-means for longitudinal data (KmL3D), were investigated. The cluster solutions were generated using anterior-posterior and medial-lateral center of pressure (COP) displacement data for four balance evaluation conditions, namely eyes open on stable surface (EOS), eyes open on unstable surface (EOU), eyes closed on stable surface (ECS), and eyes closed on unstable surface (ECU). The resulting clusters were evaluated based on various cluster validity indexes. Further, suitable association measures were computed between the clustering solutions and demographic (age and body mass index) and qualitative balance test (BEST-T) parameters. The clusters generated by the Partition Around Medoid (PAM) DTW technique for the EOS, EOU, and ECS balance conditions demonstrated statistically significant association with all parameters, while for the ECU balance testing condition, significant associations were observed only for the age parameter of the participants.
- Published
- 2017
- Full Text
- View/download PDF
24. Terminological Cluster Trees for Disjointness Axiom Discovery
- Author
-
Floriana Esposito, Giuseppe Rizzo, Nicola Fanizzi, and Claudia d'Amato
- Subjects
Theoretical computer science ,Computer science ,business.industry ,02 engineering and technology ,Linked data ,Similarity measure ,computer.software_genre ,Medoid ,Description logic ,Knowledge base ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,Cluster analysis ,business ,Semantic Web ,computer ,Axiom - Abstract
Despite the benefits deriving from explicitly modeling concept disjointness to increase the quality of ontologies, the number of disjointness axioms in vocabularies for the Web of Data is still limited, which risks leaving important constraints underspecified. Automated methods for discovering these axioms may represent a powerful modeling tool for knowledge engineers. For this purpose, we propose a machine learning solution that combines (unsupervised) distance-based clustering and the divide-and-conquer strategy. The resulting terminological cluster trees can be used to detect candidate disjointness axioms from emerging concept descriptions. A comparative empirical evaluation on different types of ontologies shows the feasibility and the effectiveness of the proposed solution, which may be regarded as complementary to the current methods that require supervision or consider atomic concepts only.
- Published
- 2017
- Full Text
- View/download PDF
25. An Incremental Approach to Semantic Clustering Designed for Software Visualization
- Author
-
Juraj Vincur and Ivan Polasek
- Subjects
0301 basic medicine ,Structure (mathematical logic) ,Software visualization ,business.industry ,Computer science ,Covariance matrix ,05 social sciences ,050301 education ,Machine learning ,computer.software_genre ,Medoid ,Visualization ,03 medical and health sciences ,Identification (information) ,030104 developmental biology ,Semantic computing ,Software system ,Artificial intelligence ,business ,0503 education ,computer - Abstract
In this paper, we introduce an incremental approach to semantic clustering, designed for software visualization and inspired by the behavior of fire ant colonies. Our technique focuses on identifying equally sized but natural clusters that give development participants better insight into software system structure. We also address performance issues of existing approaches by maintaining similarities based on global weights incrementally, using subspaces and a covariance matrix. The effectiveness of the visualization is improved by representing multiple documents with a precise medoid approximation.
- Published
- 2016
- Full Text
- View/download PDF
26. Fuzzy Clustering of Series Using Quantile Autocovariances
- Author
-
Borja Lafuente-Rego and José A. Vilar
- Subjects
Fuzzy clustering ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,Fuzzy logic ,Medoid ,010104 statistics & probability ,Autocovariance ,ComputingMethodologies_PATTERNRECOGNITION ,0202 electrical engineering, electronic engineering, information engineering ,FLAME clustering ,020201 artificial intelligence & image processing ,Data mining ,0101 mathematics ,Cluster analysis ,computer ,k-medians clustering ,Quantile ,Mathematics - Abstract
Unlike conventional clustering, fuzzy cluster analysis allows data elements to belong to more than one cluster by assigning each element a degree of membership in each cluster. This work proposes a fuzzy C-medoids algorithm to cluster time series based on comparing their estimated quantile autocovariance functions. The behaviour of the proposed algorithm is studied on different simulated scenarios and its effectiveness is demonstrated by comparison with alternative approaches.
- Published
- 2016
- Full Text
- View/download PDF
27. Design of Computer Experiments Using Competing Distances Between Set-Valued Inputs
- Author
-
Jean Baccou, Clément Chevalier, Frédéric Perales, and David Ginsbourger
- Subjects
Set (abstract data type) ,Hausdorff distance ,Computer simulation ,Computer science ,Design of experiments ,Hausdorff space ,Uncertainty quantification ,Computer experiment ,Algorithm ,Simulation ,Medoid - Abstract
In many numerical simulation experiments from natural sciences and engineering, inputs depart from the classical moderate-dimensional vector set-up and include more complex objects such as parameter fields or maps. In this case, and when inputs are generated using stochastic methods or taken from a pre-existing large set of candidates, one often needs to choose a subset of “representative” elements because of practical restrictions. Here we tackle the design of experiments based on distances or dissimilarity measures between input maps, and more specifically between inputs of set-valued nature. We consider the problem of choosing experiments given dissimilarities such as the Hausdorff or Wasserstein distances but also of eliciting adequate dissimilarities not only based on practitioners’ expertise but also on quantitative and graphical diagnostics including nearest neighbour cross-validation and non-Euclidean structural analysis. The proposed approaches are illustrated on an original uncertainty quantification case study from mechanical engineering, where using partitioning around medoids with ad hoc distances gives promising results in terms of stratified sampling.
- Published
- 2016
- Full Text
- View/download PDF
28. Clustering of MRI Radiomics Features for Glioblastoma Multiforme: An Initial Study
- Author
-
Zhicheng Li, Yaoqin Xie, Song Bolin, Li Qihua, Lei Wang, Yinsheng Chen, and Qiuchang Sun
- Subjects
medicine.diagnostic_test ,Computer science ,business.industry ,Magnetic resonance imaging ,Pattern recognition ,Fluid-attenuated inversion recovery ,medicine.disease ,Medoid ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,Radiomics ,Feature (computer vision) ,030220 oncology & carcinogenesis ,Consensus clustering ,medicine ,Artificial intelligence ,business ,Cluster analysis ,Glioblastoma - Abstract
This paper proposes a radiomics model from magnetic resonance imaging (MRI) for Glioblastoma Multiforme (GBM) patients. One challenge of radiomics studies is to reduce the redundancy of the features. A total of 466 radiomics features were extracted from automatically segmented tumors from T1, T1 contrast, T2, and FLAIR MRIs. The consensus clustering method was used and 10 feature clusters were obtained. All clusters had a prognostic association with survival, where three clusters had a mean C-index \(\ge\) 0.60. The medoid features in each cluster with the highest C-index were selected as radiomics signature candidates. The maximum and mean C-indices of the medoids are 0.75 and 0.68. The results demonstrated that the clusters reduced the data redundancy as well as generated clinically relevant radiomics features.
- Published
- 2016
- Full Text
- View/download PDF
29. Advances in Rough and Soft Clustering: Meta-Clustering, Dynamic Clustering, Data-Stream Clustering
- Author
-
Pawan Lingras and Matt Triff
- Subjects
Soft computing ,Fuzzy clustering ,Computer science ,Fuzzy set ,02 engineering and technology ,computer.software_genre ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,Data stream clustering ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Unsupervised learning ,020201 artificial intelligence & image processing ,Rough set ,Data mining ,Cluster analysis ,computer - Abstract
Over the last five decades, clustering has established itself as a primary unsupervised learning technique. In most major data mining projects, clustering can serve as a first step in understanding the available data. Clustering is used for creating meaningful profiles of entities in an application. It can also be used to compress the dataset into more manageable granules. The initial methods of crisp clustering of objects represented using numeric attributes have evolved to address the demands of the real world. These extensions include the use of soft computing techniques such as fuzzy and rough set theory, the use of centroids and medoids for computational efficiency, modes to accommodate categorical attributes, dynamic and stream clustering for managing continuous accumulation of data, and meta-clustering for correlating parallel clustering processes. This paper uses applications in engineering, web usage, retail, finance, and social networks to illustrate some of the recent advances in clustering and their role in improved profiling, as well as augmenting prediction, classification, association mining, dimensionality reduction, and optimization tasks.
- Published
- 2016
- Full Text
- View/download PDF
30. Algorithmic Optimizations in the HMAX Model Targeted for Efficient Object Recognition
- Author
-
Ahmad W. Bitar, Ali Chehab, and Mohammad M. Mansour
- Subjects
Support vector machine ,Computational complexity theory ,Computer science ,Cognitive neuroscience of visual object recognition ,Embedding ,Cluster analysis ,Feature learning ,Algorithm ,Medoid ,k-nearest neighbors algorithm - Abstract
In this paper, we propose various approximations aimed at increasing the accuracy of the S1, C1 and S2 layers of the original Gray HMAX model of the visual cortex. At layer S1, an image is convolved with 64 separable Gabor filters in the spatial domain after removing some irrelevant information such as illumination and expression variations. At layer C1, some of the minimum scale values are exploited in addition to the maximum ones in order to increase the model’s accuracy. By applying the embedding space in the additive domain, the advantage of some of the minimum scale values is taken by embedding them into their corresponding maximum ones based on a weight value between 0 and 1. At layer S2, we apply clustering, which is considered one of the most interesting research areas in the field of data mining, in order to enhance the manner by which all the prototypes are selected during the feature learning stage. This is achieved by using the Partitioning Around Medoids (PAM) clustering algorithm. The impact of these approximations in terms of accuracy and computational complexity was evaluated on the Caltech101 dataset containing a total of 9,145 images split between 101 distinct object categories in addition to a background category, and compared with the baseline performance using support vector machine (SVM) and nearest neighbor (NN) classifiers. The results show that our model improves accuracy at the S1 layer by more than 10%, while the computational complexity is also reduced. The accuracy is slightly increased for both approximations at the C1 and S2 layers.
- Published
- 2016
- Full Text
- View/download PDF
31. BSO-CLARA: Bees Swarm Optimization for Clustering LARge Applications
- Author
-
Habiba Drias, Nadjet Kamel, and Yasmin Aboubi
- Subjects
Computer science ,business.industry ,Big data ,Swarm behaviour ,computer.software_genre ,Machine learning ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,CURE data clustering algorithm ,Scalability ,Canopy clustering algorithm ,Artificial intelligence ,Data mining ,Cluster analysis ,business ,Metaheuristic ,computer - Abstract
Clustering is an essential data mining tool for analyzing big data. In this article, an overview of literature methods is undertaken. Following this study, a new algorithm called BSO-CLARA is proposed for clustering large data sets. It is based on bee behavior and k-medoids partitioning. Criteria like effectiveness, efficiency, scalability and control of noise and outliers are discussed for the new method and compared to those of the previous techniques. Experimental results show that BSO-CLARA is more effective and more efficient than the well-known partitioning algorithms PAM, CLARA and CLARANS, and also than CLAM, a recent algorithm found in the literature.
- Published
- 2015
- Full Text
- View/download PDF
32. Clustering-Based Retrieval of Similar Outfits Based on Clothes Visual Characteristics
- Author
-
Piotr Czapiewski, Dariusz Frejlichowski, Paweł Forczmański, and Radosław Hofman
- Subjects
Information retrieval ,Computer science ,business.industry ,Cluster analysis ,Clothing ,business ,Medoid ,Domain (software engineering) - Abstract
The fashion domain has been one of the fastest-growing areas of e-commerce, hence the issue of facilitating clothing search in fashion-related websites has become an important topic of research. The paper deals with searching for similar outfits in a database of clothing images, using information extracted from unconstrained images containing human silhouettes. Medoids-based clustering is introduced in order to detect groups of similar outfits and speed up the retrieval procedure. Exemplary results of experiments performed on real clothing datasets are presented.
- Published
- 2015
- Full Text
- View/download PDF
33. Tackling Curse of Dimensionality for Efficient Content Based Image Retrieval
- Author
-
Minakshi Banerjee and Seikh Mazharul Islam
- Subjects
business.industry ,Feature vector ,Dimensionality reduction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Content-based image retrieval ,computer.software_genre ,Kernel principal component analysis ,Medoid ,ComputingMethodologies_PATTERNRECOGNITION ,Feature (computer vision) ,Artificial intelligence ,Data mining ,business ,Cluster analysis ,computer ,Curse of dimensionality ,Mathematics - Abstract
This paper proposes a content based image retrieval (CBIR) technique for tackling the curse of dimensionality arising from high-dimensional feature representation of database images, and for search space reduction by clustering. Kernel principal component analysis (KPCA) is applied to the MPEG-7 Color Structure Descriptor (CSD) (64 bins) to obtain a low-dimensional nonlinear subspace. The reduced feature space is clustered using the Partitioning Around Medoids (PAM) algorithm, with the number of clusters chosen by optimum average silhouette width. The clusters are refined to remove possible outliers to enhance retrieval accuracy. The training samples for a query are marked manually and fed to a One-Class Support Vector Machine (OCSVM) to search the refined cluster containing the query image. Images are ranked and retrieved from the positively labeled outcome of the belonging cluster. The effectiveness of the proposed method is supported with comparative results obtained from (i) MPEG-7 CSD features directly and (ii) other dimensionality reduction techniques.
- Published
- 2015
- Full Text
- View/download PDF
34. Fuzzy c-Medoid Graph Clustering
- Author
-
Ágnes Vathy-Fogarassy, János Abonyi, and András Király
- Subjects
Fuzzy clustering ,Correlation clustering ,FLAME clustering ,Cluster analysis ,Fuzzy logic ,Algorithm ,Dijkstra's algorithm ,Medoid ,Clustering coefficient ,Mathematics - Abstract
We present a modified fuzzy c-medoid algorithm to find central objects in graphs. Initial cluster centres are determined by graph centrality measures. Cluster centres are fine-tuned by minimizing fuzzy-weighted geodesic distances calculated by Dijkstra’s algorithm. Cluster validity indices show significant improvement over fuzzy c-medoid clustering.
- Published
- 2014
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library