Descriptor: "Medoid" / Publisher: springer berlin heidelberg - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Medoid"' showing total 33 results

Start Over Descriptor "Medoid" Publisher springer berlin heidelberg

33 results on '"Medoid"'

1. The Impact of Triangular Inequality Violations on Medoid-Based Clustering

Author: Baraty, Saaid, Simovici, Dan A., Zara, Catalin, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Kryszkiewicz, Marzena, editor, Rybinski, Henryk, editor, Skowron, Andrzej, editor, and Raś, Zbigniew W., editor
Published: 2011
Full Text: View/download PDF

2. Incorporating Fuzziness to CLARANS

Author: Ghosh, Sampreeti, Mitra, Sushmita, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Chaudhury, Santanu, editor, Mitra, Sushmita, editor, Murthy, C. A., editor, Sastry, P. S., editor, and Pal, Sankar K., editor
Published: 2009
Full Text: View/download PDF

3. MACOC: A medoid-based ACO clustering algorithm

Author: Héctor D. Menéndez, David Camacho, Fernando E. B. Otero, UAM. Departamento de Ingeniería Informática, and Análisis de Datos e Inteligencia Aplicada (ING EPS-012)
Subjects: Informática, Computer science, Centroid, computer.software_genre, Partition (database), Medoid, Ant Colony Optimization, Clustering, Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, Unsupervised learning, Data Mining, Data mining, Q335, Cluster analysis, computer
Abstract: Proceedings of 9th International Conference, ANTS 2014, Brussels, Belgium, September 10-12, 2014., he final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-09952-1_11, The application of ACO-based algorithms in data mining is growing over the last few years and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent works concerning unsupervised learning have been focused on clustering, showing great potential of ACO-based techniques. This work presents an ACO-based clustering algorithm inspired by the ACO Clustering (ACOC) algorithm. The proposed approach restructures ACOC from a centroid-based technique to a medoid-based technique, where the properties of the search space are not necessarily known. Instead, it only relies on the information about the distances amongst data. The new algorithm, called MACOC, has been compared against well-known algorithms (K-means and Partition Around Medoids) and with ACOC. The experiments measure the accuracy of the algorithm for both synthetic datasets and real-world datasets extracted from the UCI Machine Learning Repository., This work has been partly supported by: Spanish Ministry of Science and Education under project TIN2010-19872 and Savier – an Airbus Defense & Space project (FUAM-076914 and FUAM-076915).
Published: 2014
Full Text: View/download PDF

4. Cluster Analysis by a Class of Splicing P Systems

Author: Junli Xu, Jie Xue, and Xiyu Liu
Subjects: ComputingMethodologies_PATTERNRECOGNITION, Theoretical computer science, Computer science, RNA splicing, Cluster (physics), Process (computing), Cluster analysis, Class (biology), P system, Medoid
Abstract: This paper describes a splicing P system for clustering analysis. We use the thought of partitioning around medoids algorithm to design clustering by splicing P systems and implement it in clustering. Then we give an example with 10 points to indicate the feasibility of the provided algorithm. All the processes are conducted in membranes. The process of the proposed algorithm is provided and an instance to prove its feasibility is presented.
Published: 2014
Full Text: View/download PDF

5. Beyond Kmedoids: Sparse Model Based Medoids Algorithm for Representative Selection

Author: Yalin Zhang, Feidie Liang, Yu Wang, Jintao Li, and Sheng Tang
Subjects: Computer science, business.industry, Boundary (topology), Pattern recognition, computer.software_genre, Column (database), Automatic summarization, Medoid, Image (mathematics), Data point, Point (geometry), Data mining, Artificial intelligence, Linear combination, business, computer, Algorithm
Abstract: We consider the problem of seeking representative subset of dataset, which can efficiently serve as the condensed view of the entire dataset. The Kmedoids algorithm is a commonly used unsupervised method, which selects center points as representatives. Those center points are mainly located in high density areas and surrounded by other data points. However, boundary points in the low density areas, which are useful for classification problem, are usually overlooked. In this paper we propose a sparse model based medoids algorithm (Smedoids) which aims to learn a special dictionary. Each column of this dictionary is a representative data point from the dataset, and each data point of the dataset can be described well by a linear combination of the columns of this dictionary. In this way, center and boundary points are all selected as representatives. Experiments evaluate the performances of our method for finding representatives of real datasets on the image and video summarization problem and the multi-class classification problem, and our method is shown to out-perform state-of-the-art in accuracy.
Published: 2013
Full Text: View/download PDF

6. Weighting Features for Partition around Medoids Using the Minkowski Metric

Author: Renato Cordeiro de Amorim and Trevor Fenner
Subjects: Combinatorics, ComputingMethodologies_PATTERNRECOGNITION, business.industry, Minkowski space, Partition (number theory), Pattern recognition, Artificial intelligence, business, Medoid, Mathematics, Weighting
Abstract: In this paper we introduce the Minkowski weighted partition around medoids algorithm (MW-PAM). This extends the popular partition around medoids algorithm (PAM) by automatically assigning K weights to each feature in a dataset, where K is the number of clusters. Our approach utilizes the within-cluster variance of features to calculate the weights and uses the Minkowski metric. We show through many experiments that MW-PAM, particularly when initialized with the Build algorithm (also using the Minkowski metric), is superior to other medoid-based algorithms in terms of both accuracy and identification of irrelevant features.
Published: 2012
Full Text: View/download PDF

7. Sentic Medoids: Organizing Affective Common Sense Knowledge in a Multi-Dimensional Vector Space

Author: Chris Eckl, Thomas Mazzocco, Amir Hussain, and Erik Cambria
Subjects: Commonsense knowledge, Computer science, business.industry, Sentiment analysis, Context (language use), Space (commercial competition), computer.software_genre, Medoid, Artificial intelligence, business, computer, Semantic Web, Natural language, Natural language processing
Abstract: Existing approaches to opinion mining and sentiment analysis mainly rely on parts of text in which opinions and sentiments are explicitly expressed such as polarity terms and affect words. However, opinions and sentiments are often conveyed implicitly through context and domain dependent concepts, which make purely syntactical approaches ineffective. To overcome this problem, we have recently proposed Sentic Computing, a multi-disciplinary approach to opinion mining and sentiment analysis that exploits both computer and social sciences to better recognize and process opinions and sentiments over the Web. Among other tools, Sentic Computing includes AffectiveSpace, a language visualization system that transforms natural language from a linguistic form into a multi-dimensional space. In this work, we present a new technique to better cluster this vector space and, hence, better organize and reason on the affective common sense knowledge in it contained.
Published: 2011
Full Text: View/download PDF

8. Face Recognition Using Multi-exemplar Discriminative Power Analysis

Author: K. K. Achary and Ganesh Bhat
Subjects: business.industry, Computer science, Feature vector, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Nonparametric statistics, Pattern recognition, computer.software_genre, Facial recognition system, Medoid, ComputingMethodologies_PATTERNRECOGNITION, Discriminative model, Computer Science::Computer Vision and Pattern Recognition, Artificial intelligence, Data mining, Mean-shift, Cluster analysis, business, computer, Parametric statistics
Abstract: Face recognition systems based on holistic approach consider facial images as high-dimensional data points generated from a multi modal distribution. Recent efforts in methods based on this assumption have been in deriving multi-descriptor nonparametric classification systems. However, for problems such as face recognition, identification of multiple descriptors for class representation is not trivial. Most of the recent nonparametric classification techniques solve this problem in a low dimensional feature space using supervised parametric density based clustering methods. In this paper, we investigate two unsupervised non parametric clustering methods namely, Mean Shift technique and Partition Around Medoid clustering, to obtain multiple descriptors in the feature space derived using Discriminant Power Analysis (DPA). Simulation results of proposed nonparametric multi descriptor extension of DPA on the AR and Yale database indicate improvement in recognition rate of DPA over parametric multi descriptor and single descriptor classifiers.
Published: 2011
Full Text: View/download PDF

9. The Impact of Triangular Inequality Violations on Medoid-Based Clustering

Author: Saaid Baraty, Dan A. Simovici, and Catalin Zara
Subjects: Metric space, ComputingMethodologies_PATTERNRECOGNITION, Dissimilarity space, Triangle inequality, Computer science, Metric (mathematics), Point (geometry), Rectifier (neural networks), Topology, Cluster analysis, Medoid
Abstract: We evaluate the extent to which a dissimilarity space differs from a metric space by introducing the notion of metric point and metricity in a dissimilarity space. The effect of triangular inequality violations on medoid-based clustering of objects in a dissimilarity space is examined and the notion of rectifier is introduced to transform a dissimilarity space into a metric space.
Published: 2011
Full Text: View/download PDF

10. Non-user-Specific Multivariate Biometric Discretization with Medoid-Based Segmentation

Author: Andrew Beng Jin Teoh and Meng-Hui Lim
Subjects: Discretization, Biometrics, Feature (computer vision), Computer science, business.industry, Feature extraction, Univariate, Segmentation, Pattern recognition, Artificial intelligence, business, Medoid, Discretization of continuous features
Abstract: Univariate discretization approach that transforms continuous attributes into discrete elements/binary string based on discrete/binary feature extraction on a single dimensional basis have been attracting much attention in the biometric community mainly to derive biometric-based cryptographic key derivation for security purpose. However, since components of biometric feature are interdependent, univariate approach may destroy important interactions with such attributes and thus very likely to cause features being discretized suboptimally. In this paper, we introduce a multivariate discretization approach encompassing a medoid-based segmentation with effective segmentation encoding technique. Promising empirical results on two benchmark face datasets significantly justify the superiority of our approach with reference to several non-user-specific univariate biometric discretization schemes.
Published: 2011
Full Text: View/download PDF

11. Data Driven Bandwidth for Medoid Shift Algorithm

Author: Syed Zulqarnain Gilani and Naveed Iqbal Rao
Subjects: Set (abstract data type), Pixel, Mean squared error, Computer Science::Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Bandwidth (computing), Estimator, Point (geometry), Cluster analysis, Algorithm, Medoid, Mathematics
Abstract: Adaptive data driven bandwidth for medoidshift algorithm has been proposed in this work. The proposed method has made it possible to perform clustering on a variety of high resolution statistically different images. Experiments are performed on natural images as well as daily life images. The images have been chosen such that a comparison analysis between fixed sample point estimator k and adaptive k can be carried out in detail. The results show that a fixed value of k=10 is good for statistically compact images but gives undesirable results in dispersed images. Data driven bandwidth is proposed for each data point/ pixel as well as the complete data set/ image. Experimental results have shown our algorithm to be robust. Performance is evaluated on the basis of root mean square error for the quality of clusters.
Published: 2011
Full Text: View/download PDF

12. Clustering Categorical Data in a Multiobjective Framework

Author: Ujjwal Maulik, Sanghamitra Bandyopadhyay, and Anirban Mukhopadhyay
Subjects: business.industry, Computer science, Feature vector, Pattern recognition, Fuzzy logic, Medoid, Euclidean distance, Set (abstract data type), ComputingMethodologies_PATTERNRECOGNITION, Attribute domain, Artificial intelligence, business, Cluster analysis, Categorical variable
Abstract: Most of the clustering algorithms are designed for such datasets where the dissimilarity between any two points of the dataset can be computed using standard distance measures such as the Euclidean distance. However, many real-life datasets are categorical in nature, where no natural ordering can be found among the elements in the attribute domain. In such situations, the clustering algorithms, such as K-means [238] and fuzzy C-means (FCM) [62], cannot be applied. The K-means algorithm computes the center of a cluster by computing the mean of the set of feature vectors belonging to that cluster. However, as categorical datasets do not have any inherent distance measure, computing the mean of a set of feature vectors is meaningless. A variation of the K-means algorithm, namely Partitioning Around Medoids (PAM) or K-medoids [243], has been proposed for such kinds of datasets.
Published: 2011
Full Text: View/download PDF

13. Consensus Multiobjective Differential Crisp Clustering for Categorical Data Analysis

Author: Ujjwal Maulik, Soumyendu Sekhar Bandyopadhyay, Dariusz Plewczynski, and Indrajit Saha
Subjects: Set (abstract data type), Euclidean distance, Differential evolution, Consensus clustering, Data mining, Cluster analysis, computer.software_genre, Categorical variable, computer, Multi-objective optimization, Medoid, Mathematics
Abstract: In this article, an evolutionary crisp clustering technique is described that uses a new consensus multiobjective differential evolution. The algorithm is therefore able to optimize two conflicting cluster validity measures simultaneously and provides resultant Pareto optimal set of nondominated solutions. Thereafter the problem of choosing the best solution from resultant Pareto optimal set is resolved by creation of consensus clusters using voting procedure. The proposed method is used for analyzing the categorical data where no such natural ordering can be found among the elements in categorical domain. Hence no inherent distance measure, like the Euclidean distance, would work to compute the distance between two categorical objects. Index-coded encoding of the cluster medoids (centres) is used for this purpose. The effectiveness of the proposed technique is provided for artificial and real life categorical data sets. Also statistical significance test has been carried out to establish the statistical significance of the clustering results. Matlab version of the software is available at http://bio.icm.edu.pl/~darman/CMODECC.
Published: 2010
Full Text: View/download PDF

14. Median Topographic Maps for Biomedical Data Sets

Author: Hammer, Barbara, Hasenfuss, A., Rossi, F., Biehl, M., Hammer, B., Verleysen, M., Villmann, T., Institute of Computer Science, Institute of Computer Science-Clausthal University of Technology (TU Clausthal), Laboratoire Traitement et Communication de l'Information (LTCI), Télécom ParisTech-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS), Villmann, Th., Biehl, M., Hammer, B., and Verleysen
Subjects: Neural gas, Computer science, Pairwise data, Data structure, computer.software_genre, [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM], Clustering, Medoid, Matrix (mathematics), Neural Gas, ComputingMethodologies_PATTERNRECOGNITION, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Biomedical data, Data analysis, Data mining, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Cluster analysis, Focus (optics), Dissimilarity data, computer
Abstract: Median clustering extends popular neural data analysis methods such as the self-organizing map or neural gas to general data structures given by a dissimilarity matrix only. This offers flexible and robust global data inspection methods which are particularly suited for a variety of data as occurs in biomedical domains. In this chapter, we give an overview about median clustering and its properties and extensions, with a particular focus on efficient implementations adapted to large scale data analysis.
Published: 2009
Full Text: View/download PDF

15. Subgroup Discovery in Data Sets with Multi–dimensional Responses: A Method and a Case Study in Traumatology

Author: Lan Umek, Jean-Hugues Chauchat, Blaž Zupan, Marko Toplak, Dragica Smrke, Annie Morin, and Gregor Makovec
Subjects: Contingency table, Dependency (UML), Computer science, business.industry, Experimental data, Machine learning, computer.software_genre, Outcome (probability), Medoid, Subject-matter expert, Feature (machine learning), Artificial intelligence, Data mining, Cluster analysis, business, computer
Abstract: Biomedical experimental data sets may often include many features both at input (description of cases, treatments, or experimental parameters) and output (outcome description). State-of-the-art data mining techniques can deal with such data, but would consider only one output feature at the time, disregarding any dependencies among them. In the paper, we propose the technique that can treat many output features simultaneously, aiming at finding subgroups of cases that are similar both in input and output space. The method is based on k -medoids clustering and analysis of contingency tables, and reports on case subgroups with significant dependency in input and output space. We have used this technique in explorative analysis of clinical data on femoral neck fractures. The subgroups discovered in our study were considered meaningful by the participating domain expert, and sparked a number of ideas for hypothesis to be further experimentally tested.
Published: 2009
Full Text: View/download PDF

16. Partitional Conceptual Clustering of Web Resources Annotated with Ontology Languages

Author: Claudia d'Amato, Floriana Esposito, and Nicola Fanizzi
Subjects: Information retrieval, business.industry, Computer science, Conceptual clustering, Ontology language, Semantics, computer.software_genre, Medoid, Description logic, Knowledge base, Artificial intelligence, business, Cluster analysis, Semantic Web, computer, Natural language processing
Abstract: The paper deals with the problem of cluster discovery in the context of Semantic Web knowledge bases. A partitional clustering algorithm is presented. It is applied for grouping resources contained in knowledge bases and expressed in the standard ontology languages. The method exploits a language-independent semi-distance measure for individuals that is based on the semantics of the resources w.r.t. a context represented by a set of concept descriptions (discriminating features). The clustering algorithm adapts Bisecting k-Means method to work with medoids. Besides, we propose simple mechanisms to assign each cluster an intensional definition that may suggest new concepts for the knowledge base (vivification). A final experiment demonstrates the validity of the approach through absolute quality indices for clustering results.
Published: 2009
Full Text: View/download PDF

17. Are Model-Based Clustering and Neural Clustering Consistent? A Case Study from Bioinformatics

Author: Antonina Starita, Elia Biganzoli, Davide Bacciu, and Paulo J. G. Lisboa
Subjects: Artificial neural network, Computer science, business.industry, Gaussian, Bayesian probability, Pattern recognition, Mixture model, Medoid, Data set, Determining the number of clusters in a data set, symbols.namesake, symbols, Artificial intelligence, Cluster analysis, business
Abstract: A novel neural network clustering algorithm, CoRe, is benchmarked against previously published results on a breast cancer data set and applying the method of Partition Around Medoids (PAM). The data serve to compare the samples partitions obtained with the neural network, PAM and model-based algorithms, namely Gaussian Mixture Model (GMM), Variational Bayesian Gaussian Mixture (VBG) and Variational Bayesian Mixtures with Splitting (VBS). It is found that CoRe, on the one hand, agrees with the previously published partitions; on the other hand, it supports the existence of a supplementary cluster that we hypothesize to be an additional tumor subgroup with respect to those previously identified by PAM.
Published: 2008
Full Text: View/download PDF

18. Evolutionary Clustering in Description Logics: Controlling Concept Formation and Drift in Ontologies

Author: Floriana Esposito, Claudia d'Amato, and Nicola Fanizzi
Subjects: Fitness function, Concept drift, Description logic, Computer science, Concept learning, Metric (mathematics), Supervised learning, Novelty, Data mining, Cluster analysis, computer.software_genre, computer, Medoid
Abstract: We present a method based on clustering techniques to detect concept drift or novelty in a knowledge base expressed in Description Logics. The method exploits an effective and language-independent semi-distance measure defined for the space of individuals, that is based on a finite number of dimensions corresponding to a committee of discriminating features (represented by concept descriptions). In the algorithm, the possible clusterings are represented as strings of central elements (medoids, w.r.t. the given metric) of variable length. The number of clusters is not required as a parameter; the method is able to find an optimal choice by means of the evolutionary operators and of a fitness function. An experimentation with some ontologies proves the feasibility of our method and its effectiveness in terms of clustering validity indices. Then, with a supervised learning phase, each cluster can be assigned with a refined or newly constructed intensional definition expressed in the adopted language.
Published: 2008
Full Text: View/download PDF

19. Evolutionary Rough k-Medoid Clustering

Author: Richard Weber, Martin Lampart, and Georg Peters
Subjects: Computer science, business.industry, Davies–Bouldin index, Boundary (topology), Pattern recognition, Medoid, Set (abstract data type), Core (graph theory), Cluster (physics), Rough set, Artificial intelligence, Cluster analysis, business, Algorithm
Abstract: Recently, clustering algorithms based on rough set theory have gained increasing attention. For example, Lingras et al. introduced a rough k-means that assigns objects to lower and upper approximations of clusters. The objects in the lower approximation surely belong to a cluster while the membership of the objects in an upper approximation is uncertain. Therefore, the core cluster, defined by the objects in the lower approximation is surrounded by a buffer or boundary set with objects with unclear membership status. In this paper, we introduce an evolutionary rough k-medoid clustering algorithm. Evolutionary rough k-medoid clustering belongs to the families of Lingras' rough k-means and classic k-medoids algorithms. We apply the evolutionary rough k-medoids to synthetic as well as to real data sets and compare the results to Lingras' rough k-means. We also introduce a rough version of the Davies-Bouldin-Index as a cluster validity index for the family of rough clustering algorithms.
Published: 2008
Full Text: View/download PDF

20. Conceptual Clustering and Its Application to Concept Drift and Novelty Detection

Author: Floriana Esposito, Nicola Fanizzi, and Claudia d'Amato
Subjects: Fuzzy clustering, Concept drift, Computer science, business.industry, Correlation clustering, Constrained clustering, Conceptual clustering, computer.software_genre, Machine learning, Medoid, Canopy clustering algorithm, Data mining, Artificial intelligence, Cluster analysis, business, computer
Abstract: The paper presents a clustering method which can be applied to populated ontologies for discovering interesting groupings of resources therein. The method exploits a simple, yet effective and language-independent, semi-distance measure for individuals, that is based on their underlying semantics along with a number of dimensions corresponding to a set of concept descriptions (discriminating features committee). The clustering algorithm is a partitional method and it is based on the notion of medoids w.r.t. the adopted semi-distance measure. Eventually, it produces a hierarchical organization of groups of individuals. A final experiment demonstrates the validity of the approach using absolute quality indices. We propose two possible exploitations of these clusterings: concept formation and detecting concept drift or novelty.
Published: 2008
Full Text: View/download PDF

21. Conceptual Clustering Applied to Ontologies

Author: Floriana Esposito, Nicola Fanizzi, and Claudia d'Amato
Subjects: Brown clustering, Fuzzy clustering, business.industry, Correlation clustering, Constrained clustering, Conceptual clustering, Machine learning, computer.software_genre, Medoid, Canopy clustering algorithm, Artificial intelligence, Cluster analysis, business, computer, Mathematics
Abstract: A clustering method is presented which can be applied to semantically annotated resources in the context of ontological knowledge bases. This method can be used to discover emerging groupings of resources expressed in the standard ontology languages. The method exploits a language-independent semi-distance measure over the space of resources, that is based on their semantics w.r.t. a number of dimensions corresponding to a committee of discriminating features represented by concept descriptions. A maximally discriminating group of features can be constructed through a feature construction method based on genetic programming. The evolutionary clustering algorithm proposed is based on the notion of medoids applied to relational representations. It is able to induce a set of clusters by means of a fitness function based on a discernibility criterion. An experimentation with some ontologies proves the feasibility of our method.
Published: 2008
Full Text: View/download PDF

22. A Multi-relational Hierarchical Clustering Method for Datalog Knowledge Bases

Author: Nicola Fanizzi, Claudia d'Amato, and Floriana Esposito
Subjects: Theoretical computer science, Fuzzy clustering, business.industry, Computer science, Correlation clustering, Semantics, Machine learning, computer.software_genre, Medoid, Datalog, Hierarchical clustering, Inductive logic programming, Artificial intelligence, business, Cluster analysis, computer, computer.programming_language
Abstract: A clustering method is presented which can be applied to relational knowledge bases (e.g. DATALOG deductive databases). It can be used to discover interesting groups of resources through their (semantic) annotations expressed in the standard logic programming languages. The method exploits an effective and language-independent semi-distance measure for individuals., that is based on the resource semantics w.r.t. a number of dimensions corresponding to a committee of features represented by a group of concept descriptions (discriminating features). The algorithm is a fusion of the classic BISECTING K-MEANS with approaches based on medoids that are typically applied to relational representations. We discuss its complexity and potential applications to several tasks.
Published: 2008
Full Text: View/download PDF

23. Quick Shift and Kernel Methods for Mode Seeking

Author: Stefano Soatto and Andrea Vedaldi
Subjects: Mathematical optimization, ComputingMethodologies_PATTERNRECOGNITION, Kernel method, Variable kernel density estimation, Kernel embedding of distributions, Computer Science::Computer Vision and Pattern Recognition, Radial basis function kernel, Mean-shift, Cluster analysis, Algorithm, Kernel principal component analysis, Medoid, Mathematics
Abstract: We show that the complexity of the recently introduced medoid-shift algorithm in clustering N points is O(N 2), with a small constant, if the underlying distance is Euclidean. This makes medoid shift considerably faster than mean shift, contrarily to what previously believed. We then exploit kernel methods to extend both mean shift and the improved medoid shift to a large family of distances, with complexity bounded by the effective rank of the resulting kernel matrix, and with explicit regularization constraints. Finally, we show that, under certain conditions, medoid shift fails to cluster data points belonging to the same mode, resulting in over-fragmentation. We propose remedies for this problem, by introducing a novel, simple and extremely efficient clustering algorithm, called quick shift, that explicitly trades off under- and over-fragmentation. Like medoid shift, quick shift operates in non-Euclidean spaces in a straightforward manner. We also show that the accelerated medoid shift can be used to initialize mean shift for increased efficiency. We illustrate our algorithms to clustering data on manifolds, image segmentation, and the automatic discovery of visual categories. © 2008 Springer Berlin Heidelberg.
Published: 2008
Full Text: View/download PDF

24. A Hierarchical Clustering Procedure for Semantically Annotated Resources

Author: Nicola Fanizzi, Floriana Esposito, and Claudia d'Amato
Subjects: Information retrieval, Inductive logic programming, Description logic, business.industry, Computer science, Cluster analysis, Semantics, business, Semantic Web, Social Semantic Web, Medoid, Hierarchical clustering
Abstract: A clustering method is presented which can be applied to relational knowledge bases. It can be used to discover interesting groupings of resources through their (semantic) annotations expressed in the standard languages employed for modeling concepts in the Semantic Web. The method exploits a simple (yet effective and language-independent) semi-distance measure for individuals, that is based on the resource semantics w.r.t. a number of dimensions corresponding to a committee of features represented by a group of concept descriptions (discriminating features). The algorithm is an fusion of the classic Bisecting k-Means with approaches based on medoids since they are intended to be applied to relational representations. We discuss its complexity and the potential applications to a variety of important tasks.
Published: 2007
Full Text: View/download PDF

25. Continuous Medoid Queries over Moving Objects

Author: Kyriakos Mouratidis, Stavros Papadopoulos, and Dimitris Sacharidis
Subjects: Euclidean distance, Set (abstract data type), Task (computing), ComputingMethodologies_PATTERNRECOGNITION, Theoretical computer science, Current (mathematics), k-medoids, Computer science, Medoid
Abstract: In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points in P and their closest medoid. Finding the optimal k medoids is NP hard, and existing algorithms aim at approximate answers, i.e., they compute medoids that achieve a small, yet not minimal, average distance. Similarly in this paper, we also aim at approximate solutions. We consider, however, the continuous version of the problem, where the points in P move and our task is to maintain the medoid set on-the-fly (trying to keep the average distance small). To the best of our knowledge, this work constitutes the first attempt on continuous medoid queries. First, we consider centralized monitoring, where the points issue location updates whenever they move. A server processes the stream of generated updates and constantly reports the current medoid set. Next, we address distributed monitoring, where we assume that the data points have some computational capabilities, and they take over part of the monitoring task. In particular, the server installs adaptive filters (i.e., permissible spatial ranges, called safe regions) to the points, which report their location only when they move outside their filters. The distributed techniques reduce the frequency of location updates (and, thus, the network overhead and the server load), at the cost of a slightly higher average distance, compared to the centralized methods. Both our centralized and distributed methods do not make any assumption about the data moving patterns (e.g., velocity vectors, trajectories, etc) and can be applied to an arbitrary number of medoids k. We demonstrate the efficiency and efficacy of our techniques through extensive experiments.
Published: 2007
Full Text: View/download PDF

26. Hybrid O( $n \sqrt{n}$ ) Clustering for Sequential Web Usage Mining

Author: Ickjai Lee and Jianhua Yang
Subjects: Clustering high-dimensional data, Theoretical computer science, Brown clustering, Fuzzy clustering, Computer science, Correlation clustering, Single-linkage clustering, computer.software_genre, Medoid, Hierarchical clustering, Metric space, ComputingMethodologies_PATTERNRECOGNITION, Data stream clustering, CURE data clustering algorithm, Canopy clustering algorithm, FLAME clustering, Data mining, Cluster analysis, computer, k-medians clustering
Abstract: We propose a natural neighbor inspired O($n \sqrt{n}$) hybrid clustering algorithm that combines medoid-based partitioning and agglomerative hierarchial clustering. This algorithm works efficiently by inheriting partitioning clustering strategy and operates effectively by following hierarchial clustering. More importantly, the algorithm is designed by taking into account the specific features of sequential data modeled in metric space. Experimental results demonstrate the virtue of our approach.
Published: 2006
Full Text: View/download PDF

27. Medoid Queries in Large Spatial Databases

Author: Kyriakos Mouratidis, Dimitris Papadias, and Spiros Papadimitriou
Subjects: Set (abstract data type), Database, Computer science, Spatial database, Fraction (mathematics), Node (circuits), Data mining, computer.software_genre, computer, Algorithm, Medoid, Block (data storage)
Abstract: Assume that a franchise plans to open k branches in a city, so that the average distance from each residential block to the closest branch is minimized. This is an instance of the k-medoids problem, where residential blocks constitute the input dataset and the k branch locations correspond to the medoids. Since the problem is NP-hard, research has focused on approximate solutions. Despite an avalanche of methods for small and moderate size datasets, currently there exists no technique applicable to very large databases. In this paper, we provide efficient algorithms that utilize an existing data-partition index to achieve low CPU and I/O cost. In particular, we exploit the intrinsic grouping properties of the index in order to avoid reading the entire dataset. Furthermore, we apply our framework to solve medoid-aggregate queries, where k is not known in advance; instead, we are asked to compute a medoid set that leads to an average distance close to a user-specified parameter T. Compared to previous approaches, we achieve results of comparable or better quality at a small fraction of the CPU and I/O costs (seconds as opposed to hours, and tens of node accesses instead of thousands).
Published: 2005
Full Text: View/download PDF

28. A New and Efficient K-Medoid Algorithm for Spatial Clustering

Author: Isabelle Couloigner and Qiaoping Zhang
Subjects: ComputingMethodologies_PATTERNRECOGNITION, Computer science, CURE data clustering algorithm, Multispectral image, Single-linkage clustering, Canopy clustering algorithm, Data mining, computer.software_genre, Cluster analysis, Algorithm, computer, k-medians clustering, Medoid
Abstract: A new k-medoids algorithm is presented for spatial clustering in large applications. The new algorithm utilizes the TIN of medoids to facilitate local computation when searching for the optimal medoids. It is more efficient than most existing k-medoids methods while retaining the exact the same clustering quality of the basic k-medoids algorithm. The application of the new algorithm to road network extraction from classified imagery is also discussed and the preliminary results are encouraging.
Published: 2005
Full Text: View/download PDF

29. Clustering of Web Sessions Using Levenshtein Metric

Author: Sergey V. Kuznetsov and Andrei Scherbina
Subjects: Association rule learning, Computer science, business.industry, String searching algorithm, computer.software_genre, Fuzzy logic, Medoid, Entropy (information theory), The Internet, Edit distance, Data mining, Entropy (energy dispersal), Cluster analysis, business, computer
Abstract: Various commercial and scientific applications require analysis of user behaviour in the Internet. For example, web marketing or network technical support can benefit from web users classification. This is achievable by tracking pages visited by the user during one session (one visit to the particular site). For automated user sessions classification we propose distance that compares sessions judging by the sequence of pages in them and by categories of these pages. Proposed distance is based on Levenshtein metric. Fuzzy C Medoids algorithm was used for clustering, since it has almost linear complexity. Davies-Bouldin, Entropy, and Bezdek validity indices were used to assess the qualities of proposed method. As testing shows, our distance outperforms in this domain both Euclidian and Edit distances.
Published: 2004
Full Text: View/download PDF

30. An Efficient K-Medoids-Based Algorithm Using Previous Medoid Index, Triangular Inequality Elimination Criteria, and Partial Distance Search

Author: John F. Roddick, Shu-Chuan Chu, and Jeng-Shyang Pan
Subjects: Information extraction, ComputingMethodologies_PATTERNRECOGNITION, Triangle inequality, k-medoids, Computational complexity theory, Computer science, Computation, Vector quantization, Cluster analysis, computer.software_genre, Algorithm, computer, Medoid
Abstract: Clustering in data mining is a discovery process that groups similar objects into the same cluster. Various clustering algorithms have been designed to fit various requirements and constraints of application. In this paper, we study several k-medoids-based algorithms including the PAM, CLARA and CLARANS algorithms. A novel and efficient approach is proposed to reduce the computational complexity of such k-medoids-based algorithms by using previous medoid index, triangular inequality elimination criteria and partial distance search. Experimental results based on elliptic, curve and Gauss-Markov databases demonstrate that the proposed algorithm applied to CLARANS may reduce the number of distance calculations by 67% to 92% while retaining the same average distance per object. In terms of the running time, the proposed algorithm may reduce computation time by 38% to 65% compared with the CLARANS algorithm.
Published: 2002
Full Text: View/download PDF

31. Categorizing Visitors Dynamically by Fast and Robust Clustering of Access Logs

Author: Jianhua Yang and Vladimir Estivill-Castro
Subjects: Iterative method, Computer science, business.industry, Correlation clustering, computer.software_genre, Machine learning, Medoid, CURE data clustering algorithm, Outlier, Expectation–maximization algorithm, Canopy clustering algorithm, Artificial intelligence, Data mining, Cluster analysis, business, Time complexity, computer
Abstract: Clustering plays a central role in segmenting markets. The identification of categories of visitors to a Web-site is very useful towards improved Web applications. However, the large volume involved in mining visitation paths, demands efficient clustering algorithms that are also resistant to noise and outliers. Also, dissimilarity between visitation paths involves sophisticated evaluation and results in large dimension of attribute-vectors. We present a randomized, iterative algorithm (a la Expectation Maximization or k-means) but based on discrete medoids. We prove that our algorithm converges and that has subquadratic complexity. We compare to the implementation of the fastest version of matrix-based clustering for visitor paths and show that our algorithm outperforms dramatically matrix-based methods.
Published: 2001
Full Text: View/download PDF

32. Robust Clusterin of Large Geo-referenced Data Sets

Author: Vladimir Estivill-Castro and Michael E. Houle
Subjects: Computer science, Delaunay triangulation, Spatial database, Cluster (physics), Data mining, computer.software_genre, Cluster analysis, Voronoi diagram, Time complexity, computer, Medoid
Abstract: Clustering geo-referenced data with the medoid method is related to k-Means, with the restriction that cluster representatives are chosen from the data. Although the medoid method in general produces clusters of high quality, it is often criticised for the Ω(n2) time that it requires. Our method incorporates both proximity and density information to achieve high-quality clusters in O(n log n) expected time. This is achieved by fast approximation to the medoid objective function using proximity information from Delaunay triangulations.
Published: 1999
Full Text: View/download PDF

33. Discovering associations in spatial data — An efficient medoid based approach

Author: Vladimir Estivill-Castro and Alan Murray
Subjects: Geographic information system, Heuristic, Computer science, business.industry, Spatial database, computer.software_genre, Medoid, Search algorithm, Data mining, Cluster analysis, Heuristics, business, Spatial analysis, Hill climbing, computer
Abstract: Spatial data mining is the discovery of novel and interesting relationships and characteristics that may exist implicitly in spatial databases. The identification of clusters coupled with Geographical Information System provides a means of information generalization. A variety of clustering approaches exists. A non-hierarchical method in data mining applications is the medoid approach. Many heuristics have been developed for this approach. This paper carefully analyses the complexity of hill-climbing heuristics for medoid based spatial clustering. Improvements to recently suggested heuristics like CLARANS are identified. We propose a novel idea, the stopping early of the heuristic search, and demonstrate that this provides large savings in computational time while the quality of the partition remains unaffected.
Published: 1998
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

33 results on '"Medoid"'

1. The Impact of Triangular Inequality Violations on Medoid-Based Clustering

2. Incorporating Fuzziness to CLARANS

3. MACOC: A medoid-based ACO clustering algorithm

4. Cluster Analysis by a Class of Splicing P Systems

5. Beyond Kmedoids: Sparse Model Based Medoids Algorithm for Representative Selection

6. Weighting Features for Partition around Medoids Using the Minkowski Metric

7. Sentic Medoids: Organizing Affective Common Sense Knowledge in a Multi-Dimensional Vector Space

8. Face Recognition Using Multi-exemplar Discriminative Power Analysis

9. The Impact of Triangular Inequality Violations on Medoid-Based Clustering

10. Non-user-Specific Multivariate Biometric Discretization with Medoid-Based Segmentation

11. Data Driven Bandwidth for Medoid Shift Algorithm

12. Clustering Categorical Data in a Multiobjective Framework

13. Consensus Multiobjective Differential Crisp Clustering for Categorical Data Analysis

14. Median Topographic Maps for Biomedical Data Sets

15. Subgroup Discovery in Data Sets with Multi–dimensional Responses: A Method and a Case Study in Traumatology

16. Partitional Conceptual Clustering of Web Resources Annotated with Ontology Languages

17. Are Model-Based Clustering and Neural Clustering Consistent? A Case Study from Bioinformatics

18. Evolutionary Clustering in Description Logics: Controlling Concept Formation and Drift in Ontologies

19. Evolutionary Rough k-Medoid Clustering

20. Conceptual Clustering and Its Application to Concept Drift and Novelty Detection

21. Conceptual Clustering Applied to Ontologies

22. A Multi-relational Hierarchical Clustering Method for Datalog Knowledge Bases

23. Quick Shift and Kernel Methods for Mode Seeking

24. A Hierarchical Clustering Procedure for Semantically Annotated Resources

25. Continuous Medoid Queries over Moving Objects

26. Hybrid O( $n \sqrt{n}$ ) Clustering for Sequential Web Usage Mining

27. Medoid Queries in Large Spatial Databases

28. A New and Efficient K-Medoid Algorithm for Spatial Clustering

29. Clustering of Web Sessions Using Levenshtein Metric

30. An Efficient K-Medoids-Based Algorithm Using Previous Medoid Index, Triangular Inequality Elimination Criteria, and Partial Distance Search

31. Categorizing Visitors Dynamically by Fast and Robust Clustering of Access Logs

32. Robust Clusterin of Large Geo-referenced Data Sets

33. Discovering associations in spatial data — An efficient medoid based approach

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

33 results on '"Medoid"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources