Descriptor: "deep clustering" / Topic: unsupervised learning - Searchworks@Jio Institute Digital Library Search Results

45 results on '"deep clustering"'

1. Deep Online Probability Aggregation Clustering

Author: Yan, Yuxuan, Lu, Na, Yan, Ruofan, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

2. High-Order Structure Enhanced Graph Clustering Network

Author: Zhang, Yangfan, Guo, Bing, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Hadfi, Rafik, editor, Anthony, Patricia, editor, Sharma, Alok, editor, Ito, Takayuki, editor, and Bai, Quan, editor
Published: 2025
Full Text: View/download PDF

3. ULDC: uncertainty-based learning for deep clustering

Author: Chang, Luyao, Niu, Xinzheng, Li, Zhenghua, Zhang, Zhiheng, Li, Shenshen, and Fournier-Viger, Philippe
Abstract: Deep clustering has gained prominence due to its impressive capability to handle high-dimensional real-world data. However, in the absence of ground-truth labels, existing clustering methods struggle to discern false positives that resemble the target cluster and false negatives that visually differ but maintain semantic consistency. The unreliable projections caused by visual ambiguity disrupt representation learning, leading to sub-optimal clustering outcomes. To address this challenge, we propose a novel method called uncertainty-based learning for deep clustering (ULDC), which aims to discover more optimal cluster structures within data from an uncertainty perspective. Specifically, we utilize the Dirichlet distribution to quantify the uncertainty of feature projections in the latent space, providing a probabilistic framework for modeling uncertainty during the clustering process. We then develop uncertainty-based learning to mitigate the interference caused by false positives and negatives in the clustering tasks. Additionally, a semantic calibration module is introduced to achieve a global alignment of cross-instance semantics, facilitating the learning of clustering-favorite representations. Extensive experiments on five widely-used benchmarks demonstrate the effectiveness of ULDC. The source code is available from . [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

4. A deep embedded clustering technique using dip test and unique neighbourhood set.

Author: Rahman, Md Anisur, Ang, Li-minn, Sun, Yuan, and Seng, Kah Phooi
Subjects: *SELECTION (Plant breeding), *DETERMINISTIC processes, *NEIGHBORHOODS, *SEEDS, *ALGORITHMS, *DEEP learning
Abstract: In recent years, there has been a growing interest in deep learning-based clustering. A recently introduced technique called DipDECK has shown effective performance on large and high-dimensional datasets. DipDECK utilises Hartigan's dip test, a statistical test, to merge small non-viable clusters. Notably, DipDECK was the first deep learning-based clustering technique to incorporate the dip test. However, the number of initial clusters of DipDECK is overestimated and the algorithm then randomly selects the initial seeds to produce the final clusters for a dataset. Therefore, in this paper, we presented a technique called UNSDipDECK, which is an improved version of DipDECK and does not require user input for datasets with an unknown number of clusters. UNSDipDECK produces high-quality initial seeds and the initial number of clusters through a deterministic process. UNSDipDECK uses the unique closest neighbourhood and unique neighbourhood set approaches to determine high-quality initial seeds for a dataset. In our study, we compared the performance of UNSDipDECK with fifteen baseline clustering techniques, including DipDECK, using NMI and ARI metrics. The experimental results indicate that UNSDipDECK outperforms the baseline techniques, including DipDECK. Additionally, we demonstrated that the initial seed selection process significantly contributes to UNSDipDECK's ability to produce high-quality clusters. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

5. Self-supervised multi-view clustering guided by multi-angle semantic labels.

Author: 柳源, 安俊秀, and 杨林旺
Subjects: *DEEP learning
Abstract: Multi-view clustering aims to explore the feature information of objects from multiple perspectives to obtain accurate clustering results. However, existing research often fails to handle the information conflicts that arise during view fusion and does not fully utilize the complementary information between multiple views. To address these issues, this paper proposed a self-supervised multi-view clustering model guided by multi-angle semantic labels. The model first mapped the latent representations of each view to independent low-dimensional feature spaces, focusing on optimizing the consistency between views in one space to maintain the local structure of the feature space and the relative relationships between samples. At the same time, in another space, clustering information was directly extracted from the view level to capture richer and more diverse semantic features. Finally, pseudo-labels generated from multi-angle semantic features guided the clustering assignment at the object level, achieving collaborative optimization of the two representations. Extensive experimental results demonstrate that this approach can comprehensively explore both common and complementary information in multi-view data and exhibit good clustering performance. Moreover, compared to other methods, this approach has advantages in scenarios with a larger number of views. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. Unsupervised deep learning framework for data‐driven gating in positron emission tomography

Author: Li, Tiantian, Xie, Zhaoheng, Qi, Wenyuan, Asma, Evren, and Qi, Jinyi
Subjects: Medical and Biological Physics, Physical Sciences, Cancer, Biomedical Imaging, Bioengineering, 4.1 Discovery and preclinical testing of markers and technologies, 4.2 Evaluation of markers and technologies, data-driven, deep clustering, respiratory gating, unsupervised learning, Other Physical Sciences, Biomedical Engineering, Oncology and Carcinogenesis, Nuclear Medicine & Medical Imaging, Biomedical engineering, Medical and biological physics
Abstract: Background: Physiological motion, such as respiratory motion, has become a limiting factor in the spatial resolution of positron emission tomography (PET) imaging as the resolution of PET detectors continues to improve. Motion-induced misregistration between PET and CT images can also cause attenuation correction artifacts. Respiratory gating can be used to freeze the motion and to reduce motion-induced artifacts. Purpose: In this study, we propose a robust data-driven approach using an unsupervised deep clustering network that employs an autoencoder (AE) to extract latent features for respiratory gating. Methods: We first divide list-mode PET data into short-time frames. The short-time frame images are reconstructed without attenuation, scatter, or randoms correction to avoid attenuation mismatch artifacts and to reduce image reconstruction time. The deep AE is then trained using reconstructed short-time frame images to extract latent features for respiratory gating. No additional data are required for the AE training. K-means clustering is subsequently used to perform respiratory gating based on the latent features extracted by the deep AE. The effectiveness of our proposed Deep Clustering method was evaluated using physical phantom and real patient datasets. The performance was compared against phase gating based on an external signal (External) and image-based principal component analysis (PCA) with K-means clustering (Image PCA). Results: The proposed method produced gated images with higher contrast and sharper myocardium boundaries than those obtained using the External gating method and Image PCA. Quantitatively, the gated images generated by the proposed Deep Clustering method showed larger center of mass (COM) displacement and higher lesion contrast than those obtained using the other two methods. Conclusions: The effectiveness of our proposed method was validated using physical phantom and real patient data. The results showed our proposed framework could provide superior gating compared with the conventional External method and Image PCA.
Published: 2023

7. Leveraging Hierarchical Similarities for Contrastive Clustering

Author: Li, Yuanshu, Xiao, Yubin, Wu, Xuan, Song, Lei, Liang, Yanchun, Zhou, You, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

8. A survey on deep clustering: from the prior perspective

Author: Lu, Yiding, Li, Haobin, Li, Yunfan, Lin, Yijie, and Peng, Xi
Published: 2024
Full Text: View/download PDF

9. Semantic Spectral Clustering with Contrastive Learning and Neighbor Mining.

Author: Wang, Nongxiao, Ye, Xulun, Zhao, Jieyu, and Wang, Qing
Abstract: Deep spectral clustering techniques are considered among the most efficient clustering algorithms in the data mining field. The similarity between instances and the disparity among classes are two critical factors in clustering. However, most current deep spectral clustering approaches do not sufficiently take them both into consideration. To tackle the above issue, we propose the Semantic Spectral clustering with Contrastive learning and Neighbor mining (SSCN) framework, which performs instance-level pulling and cluster-level pushing cooperatively. Specifically, we obtain the semantic feature embedding using an unsupervised contrastive learning model. Next, we obtain the nearest neighbors partially and globally, and the neighbors along with data augmentation information enhance their effectiveness collaboratively on the instance level as well as the cluster level. The spectral constraint is applied by orthogonal layers to satisfy conventional spectral clustering. Extensive experiments demonstrate the superiority of our proposed spectral clustering framework. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. Gradient-Based Competitive Learning: Theory.

Author: Cirrincione, Giansalvo, Randazzo, Vincenzo, Barbiero, Pietro, Ciravegna, Gabriele, and Pasero, Eros
Abstract: Deep learning has been recently used to extract the relevant features for representing input data also in the unsupervised setting. However, state-of-the-art techniques focus mostly on algorithmic efficiency and accuracy rather than mimicking the input manifold. On the contrary, competitive learning is a powerful tool for replicating the input distribution topology. It is cognitive/biologically inspired as it is founded on Hebbian learning, a neuropsychological theory claiming that neurons can increase their specialization by competing for the right to respond to/represent a subset of the input data. This paper introduces a novel perspective by combining these two techniques: unsupervised gradient-based and competitive learning. The theory is based on the intuition that neural networks can learn topological structures by working directly on the transpose of the input matrix. At this purpose, the vanilla competitive layer and its dual are presented. The former is representative of a standard competitive layer for deep clustering, while the latter is trained on the transposed matrix. The equivalence of the layers is extensively proven both theoretically and experimentally. The dual competitive layer has better properties. Unlike the vanilla layer, it directly outputs the prototypes of the data inputs, while still allowing learning by backpropagation. More importantly, this paper proves theoretically that the dual layer is better suited for handling high-dimensional data (e.g., for biological applications), because the estimation of the weights is driven by a constraining subspace which does not depend on the input dimensionality, but only on the dataset cardinality. This paper has introduced a novel approach for unsupervised gradient-based competitive learning. 
This approach is very promising both in the case of small datasets of high-dimensional data and for better exploiting the advantages of a deep architecture: the dual layer perfectly integrates with the deep layers. A theoretical justification is also given by using the analysis of the gradient flow for both vanilla and dual layers. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. Dual Deep Clustering

Author: Cirrincione, Giansalvo, Randazzo, Vincenzo, Barbiero, Pietro, Ciravegna, Gabriele, Pasero, Eros, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Esposito, Anna, editor, Faundez-Zanuy, Marcos, editor, Morabito, Francesco Carlo, editor, and Pasero, Eros, editor
Published: 2023
Full Text: View/download PDF

12. BYOL Network Based Contrastive Clustering

Author: Chen, Xuehao, Zhou, Weidong, Zhou, Jin, Wang, Yingxu, Han, Shiyuan, Du, Tao, Yang, Cheng, Liu, Bowen, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Premaratne, Prashan, editor, Jin, Baohua, editor, Qu, Boyang, editor, Jo, Kang-Hyun, editor, and Hussain, Abir, editor
Published: 2023
Full Text: View/download PDF

13. Unsupervised multimodal learning for image-text relation classification in tweets.

Author: Sun, Lin, Li, Qingyuan, Liu, Long, and Su, Yindu
Subjects: *SUPERVISED learning, *MULTIMODAL user interfaces, *SOCIAL media, *CLASSIFICATION, *DATA modeling, *USER-generated content
Abstract: Recent studies show that the use of multimodality can effectively enhance the understanding of social media content. The relations between texts and images become an important basis for developing multimodal data and models. Some studies have attempted to label image-text relation (ITR) and build supervised learning models. However, manually labeling ITR is a challenging task and incurs many controversial labels because of disagreements among the annotators. In this paper, we present a novel unsupervised multimodal method called ITR pseudo-labeling (ITRp) that learns multimodal representations for various ITR types using different finetuning strategies. Our ITRp method generates pseudo-labels by clustering and uses them as supervision to train the classifier and encoders. We evaluate the ITRp method on the ITR dataset and the effects of the samples with incorrect labels on both the supervised and unsupervised models. The code and data are available on the website https://github.com/SuYindu/ITRp. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

14. Intelligent Unsupervised Network Traffic Classification Method Using Adversarial Training and Deep Clustering for Secure Internet of Things.

Author: Zhang, Weijie, Zhang, Lanping, Zhang, Xixi, Wang, Yu, Liu, Pengfei, and Gui, Guan
Subjects: COMPUTER network traffic, INTELLIGENT networks, SUPERVISED learning, INTERNET of things, COMPUTATIONAL complexity, CLASSIFICATION
Abstract: Network traffic classification (NTC) has attracted great attention in many applications such as secure communications, intrusion detection systems. The existing NTC methods based on supervised learning rely on sufficient labeled datasets in the training phase, but for most traffic datasets, it is difficult to obtain label information in practical applications. Although unsupervised learning does not rely on labels, its classification accuracy is not high, and the number of data classes is difficult to determine. This paper proposes an unsupervised NTC method based on adversarial training and deep clustering with improved network traffic classification (NTC) and lower computational complexity in comparison with the traditional clustering algorithms. Here, the training process does not require data labels, which greatly reduce the computational complexity of the network traffic classification through pretraining. In the pretraining stage, an autoencoder (AE) is used to reduce the dimension of features and reduce the complexity of the initial high-dimensional network traffic data features. Moreover, we employ the adversarial training model and a deep clustering structure to further optimize the extracted features. The experimental results show that our proposed method has robust performance, with a multiclassification accuracy of 92.2%, which is suitable for classification with a large number of unlabeled data in actual application scenarios. This paper only focuses on breakthroughs in the algorithm stage, and future work can be focused on the deployment and adaptation in practical environments. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

15. Breast Ultrasound Images Clustering Analysis Using Deep Clustering Method

Author: Huang, Cheng, Cui, Jinrong, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin, Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Wang, Shuihua, editor, Zhang, Zheng, editor, and Xu, Yuan, editor
Published: 2022
Full Text: View/download PDF

16. Deep Convolutional Embedded Fuzzy Clustering with Wasserstein Loss

Author: Chen, Tianzhen, Sun, Wei, Xhafa, Fatos, Series Editor, Dang, Ngoc Hoang Thanh, editor, Zhang, Yu-Dong, editor, Tavares, João Manuel R. S., editor, and Chen, Bo-Hao, editor
Published: 2022
Full Text: View/download PDF

17. Wearable Fall-Detection Using Deep Embedded Clustering Algorithm

Author: Jothi, R., Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Mathur, Garima, editor, Bundele, Mahesh, editor, Lalwani, Mahendra, editor, and Paprzycki, Marcin, editor
Published: 2022
Full Text: View/download PDF

18. High-Confidence Sample Labelling for Unsupervised Person Re-identification

Author: Wang, Lei, Zhao, Qingjie, Wang, Shihao, Lu, Jialin, Zhao, Ying, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Sun, Fuchun, editor, Hu, Dewen, editor, Wermter, Stefan, editor, Yang, Lei, editor, Liu, Huaping, editor, and Fang, Bin, editor
Published: 2022
Full Text: View/download PDF

19. Semi-Supervised Medical Image Classification Combined with Unsupervised Deep Clustering.

Author: Xiao, Bang and Lu, Chunyue
Subjects: IMAGE recognition (Computer vision), ARTIFICIAL neural networks, MEDICAL coding, DIAGNOSTIC imaging, NEURAL computers, SUPERVISED learning
Abstract: An effective way to improve the performance of deep neural networks in most computer vision tasks is to improve the quantity of labeled data and the quality of labels. However, in the analysis and processing of medical images, high-quality annotation depends on the experience and professional knowledge of experts, which makes it very difficult to obtain a large number of high-quality annotations. Therefore, we propose a new semi-supervised framework for medical image classification. It combines semi-supervised classification with unsupervised deep clustering. Spreading label information to unlabeled data by alternately running two tasks helps the model to extract semantic information from unlabeled data, and prevents the model from overfitting to a small amount of labeled data. Compared with current methods, our framework enhances the robustness of the model and reduces the influence of outliers. We conducted a comparative experiment on the public benchmark medical image dataset to verify our method. On the ISIC 2018 Dataset, our method surpasses other methods by more than 0.85% on AUC and 1.08% on Sensitivity. On the ICIAR BACH 2018 dataset, our method achieved 94.12% AUC, 77.92% F1-score, 77.69% Recall, and 78.16% Precision. The error rate is at least 1.76% lower than that of other methods. The result shows the effectiveness of our method in medical image classification. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

20. Datacube segmentation via deep spectral clustering

Author: Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi, and Chiara Ruberto
Subjects: unsupervised learning, deep clustering, deep learning, nuclear computer vision, x-ray fluorescence macro-mapping (MA-XRF), Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
Abstract: Extended vision techniques are ubiquitous in physics. However, the data cubes stemming from such analysis often pose a challenge in their interpretation, due to the intrinsic difficulty in discerning the relevant information from the spectra composing the data cube. Furthermore, the huge dimensionality of data cube spectra poses a complex task in its statistical interpretation; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of the data cube's spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in encoded space, i.e. perform deep clustering on the spectral properties of datacube pixels. A statistical dimensional reduction is performed by an ad hoc trained (variational) AutoEncoder, in charge of mapping spectra into lower dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two different use cases, of different physical origins: a set of macro mapping x-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.
Published: 2024
Full Text: View/download PDF

21. Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data

Author: Wei Shao, Xiao Luo, Zuoyi Zhang, Zhi Han, Vasu Chandrasekaran, Vladimir Turzhitsky, Vishal Bali, Anna R. Roberts, Megan Metzger, Jarod Baker, Carmen La Rosa, Jessica Weaver, Paul Dexter, and Kun Huang
Subjects: Chronic cough, Unsupervised learning, Deep clustering, EMR data, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Chronic cough affects approximately 10% of adults. The lack of ICD codes for chronic cough makes it challenging to apply supervised learning methods to predict the characteristics of chronic cough patients, thereby requiring the identification of chronic cough patients by other mechanisms. We developed a deep clustering algorithm with auto-encoder embedding (DCAE) to identify clusters of chronic cough patients based on data from a large cohort of 264,146 patients from the Electronic Medical Records (EMR) system. We constructed features using the diagnosis within the EMR, then built a clustering-oriented loss function directly on embedded features of the deep autoencoder to jointly perform feature refinement and cluster assignment. Lastly, we performed statistical analysis on the identified clusters to characterize the chronic cough patients compared to the non-chronic cough patients. Results: The experimental results show that the DCAE model generated three chronic cough clusters and one non-chronic cough patient cluster. We found various diagnoses, medications, and lab tests highly associated with chronic cough patients by comparing the chronic cough cluster with the non-chronic cough cluster. Comparison of chronic cough clusters demonstrated that certain combinations of medications and diagnoses characterize some chronic cough clusters. Conclusions: To the best of our knowledge, this study is the first to test the potential of unsupervised deep learning methods for chronic cough investigation, which also shows a great advantage over existing algorithms for patient data clustering.
Published: 2022
Full Text: View/download PDF

22. DeepMCAT: Large-Scale Deep Clustering for Medical Image Categorization

Author: Kart, Turkay, Bai, Wenjia, Glocker, Ben, Rueckert, Daniel, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Engelhardt, Sandy, editor, Oksuz, Ilkay, editor, Zhu, Dajiang, editor, Yuan, Yixuan, editor, Mukhopadhyay, Anirban, editor, Heller, Nicholas, editor, Huang, Sharon Xiaolei, editor, Nguyen, Hien, editor, Sznitman, Raphael, editor, and Xue, Yuan, editor
Published: 2021
Full Text: View/download PDF

23. Intelligent Unsupervised Network Traffic Classification Method Using Adversarial Training and Deep Clustering for Secure Internet of Things

Author: Weijie Zhang, Lanping Zhang, Xixi Zhang, Yu Wang, Pengfei Liu, and Guan Gui
Subjects: network traffic classification, convolutional adversarial autoencoder, Internet of things, unsupervised learning, deep clustering, Information technology, T58.5-58.64
Abstract: Network traffic classification (NTC) has attracted great attention in many applications such as secure communications, intrusion detection systems. The existing NTC methods based on supervised learning rely on sufficient labeled datasets in the training phase, but for most traffic datasets, it is difficult to obtain label information in practical applications. Although unsupervised learning does not rely on labels, its classification accuracy is not high, and the number of data classes is difficult to determine. This paper proposes an unsupervised NTC method based on adversarial training and deep clustering with improved network traffic classification (NTC) and lower computational complexity in comparison with the traditional clustering algorithms. Here, the training process does not require data labels, which greatly reduce the computational complexity of the network traffic classification through pretraining. In the pretraining stage, an autoencoder (AE) is used to reduce the dimension of features and reduce the complexity of the initial high-dimensional network traffic data features. Moreover, we employ the adversarial training model and a deep clustering structure to further optimize the extracted features. The experimental results show that our proposed method has robust performance, with a multiclassification accuracy of 92.2%, which is suitable for classification with a large number of unlabeled data in actual application scenarios. This paper only focuses on breakthroughs in the algorithm stage, and future work can be focused on the deployment and adaptation in practical environments.
Published: 2023
Full Text: View/download PDF

24. Twin Contrastive Learning for Online Clustering.

Author: Li, Yunfan, Yang, Mouxing, Peng, Dezhong, Li, Taihao, Huang, Jiantao, and Peng, Xi
Subjects: *ONLINE education, *BOOSTING algorithms, *DATA augmentation, *COLUMNS
Abstract: This paper proposes to perform online clustering by conducting twin contrastive learning (TCL) at the instance and cluster level. Specifically, we find that when the data is projected into a feature space with a dimensionality of the target cluster number, the rows and columns of its feature matrix correspond to the instance and cluster representation, respectively. Based on the observation, for a given dataset, the proposed TCL first constructs positive and negative pairs through data augmentations. Thereafter, in the row and column space of the feature matrix, instance- and cluster-level contrastive learning are respectively conducted by pulling together positive pairs while pushing apart the negatives. To alleviate the influence of intrinsic false-negative pairs and rectify cluster assignments, we adopt a confidence-based criterion to select pseudo-labels for boosting both the instance- and cluster-level contrastive learning. As a result, the clustering performance is further improved. Besides the elegant idea of twin contrastive learning, another advantage of TCL is that it could independently predict the cluster assignment for each instance, thus effortlessly fitting online scenarios. Extensive experiments on six widely-used image and text benchmarks demonstrate the effectiveness of TCL. The code is released on https://pengxi.me. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

25. Automatic deep sparse clustering with a dynamic population-based evolutionary algorithm using reinforcement learning and transfer learning.

Author: Hadikhani, Parham, Lai, Daphne Teck Ching, Ong, Wee-Hong, and Nadimi-Shahraki, Mohammad H.
Subjects: *REINFORCEMENT learning, *GENERATIVE adversarial networks, *MACHINE learning, *EVOLUTIONARY algorithms, *FEATURE extraction, *DIFFERENTIAL evolution
Abstract: Clustering data effectively remains a significant challenge in machine learning, particularly when the optimal number of clusters is unknown. Traditional deep clustering methods often struggle with balancing local and global search, leading to premature convergence and inefficiency. To address these issues, we introduce ADSC-DPE-RT (Automatic Deep Sparse Clustering with a Dynamic Population-based Evolutionary Algorithm using Reinforcement Learning and Transfer Learning), a novel deep clustering approach. ADSC-DPE-RT builds on Multi-Trial Vector-based Differential Evolution (MTDE), an algorithm that integrates sparse auto-encoding and manifold learning to enable automatic clustering without prior knowledge of cluster count. However, MTDE's fixed population size can lead to either prolonged computation or premature convergence. Our approach introduces a dynamic population generation technique guided by Reinforcement Learning (RL) and Markov Decision Process (MDP) principles. This allows for flexible adjustment of population size, preventing premature convergence and reducing computation time. Additionally, we incorporate Generative Adversarial Networks (GANs) to facilitate dynamic knowledge transfer between MTDE strategies, enhancing diversity and accelerating convergence towards the global optimum. This is the first work to address the dynamic population issue in deep clustering through RL, combined with Transfer Learning to optimize evolutionary algorithms. Our results demonstrate significant improvements in clustering performance, positioning ADSC-DPE-RT as a competitive alternative to state-of-the-art deep clustering methods.
• Introduces a novel clustering method integrating Auto-Encoders, Evolutionary Algorithms, and Reinforcement Learning.
• Adjusts population size and strategy dynamically with Reinforcement Learning to enhance exploration.
• Uses Generative Adversarial Networks to share elite populations among strategies, boosting diversity and solution quality.
• Addresses feature extraction inefficiencies and local optima issues through dynamic, adaptive techniques.
• Outperforms existing methods and works effectively without needing prior knowledge of the number of clusters. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Deep image clustering: A survey.

Author: Huang, Huajuan, Wang, Chen, Wei, Xiuxi, and Zhou, Yongquan
Subjects: *FEATURE extraction, *DATA analysis, *ALGORITHMS, *CUSTOMIZATION, *DEEP learning
Abstract: Deep image clustering networks can categorize unlabeled images, making effective use of them. This paper synthesizes recent research on deep image clustering networks and summarizes it within a general framework. The Image Preprocessing part transforms collected images into a dataset accepted by the network; the Image Embedding part then embeds images from the dataset into vectors that represent image features. After the Feature Processing part further reduces the dimensionality of these features or enhances them, the Feature Clustering part divides the images into several categories. The Cluster Result Processing part treats the clustering outcomes to implement weighted, multi-view, subspace, multimodal, or self-supervised methods. The Downstream Applications part employs the clustering results to address various real-world problems. The performance of common baseline clustering methods and several feature extraction architectures is also compared and analyzed. The results indicate that within deep image clustering networks, the choice of clustering algorithm has a relatively minor impact on clustering performance, whereas the selection of the feature extraction network architecture is decisive for clustering metrics. The choice of architecture should be customized based on the characteristics of the dataset. Finally, we provide suggestions for potential research directions in deep image clustering networks. • Analyzes deep image clustering theory, its performance benefits, and limitations. • Summarizes recent deep image clustering improvements with experimental data analysis. • Introduces deep image clustering applications and future improvement prospects. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Learning to Cluster Under Domain Shift

Author: Menapace, Willi, Lathuilière, Stéphane, Ricci, Elisa, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Vedaldi, Andrea, editor, Bischof, Horst, editor, Brox, Thomas, editor, and Frahm, Jan-Michael, editor
Published: 2020
Full Text: View/download PDF

28. Unsupervised Outlier Detection Using Memory and Contrastive Learning.

Author: Huyan, Ning, Quan, Dou, Zhang, Xiangrong, Liang, Xuefeng, Chanussot, Jocelyn, and Jiao, Licheng
Subjects: *OUTLIER detection, *DEEP learning, *LEARNING modules, *MEMORY, *LEARNING, *IMAGE reconstruction, *FEATURE extraction
Abstract: Outlier detection aims to separate anomalous data from inliers in a dataset. Most recent deep learning methods for outlier detection leverage an auxiliary reconstruction task, assuming that outliers are more difficult to recover than normal samples (inliers). However, this assumption does not always hold for deep auto-encoder (AE) based models. Auto-encoder based detectors may recover certain outliers even if outliers are not in the training data, because they do not constrain the feature learning. Instead, we argue that outlier detection can be done in the feature space by measuring the distance between outliers’ features and the consistency feature of inliers. To achieve this, we propose an unsupervised outlier detection method using a memory module and a contrastive learning module (MCOD). The memory module constrains the consistency of features, which merely represent the normal data. The contrastive learning module learns more discriminative features, which boosts the distinction between outliers and inliers. Extensive experiments on four benchmark datasets show that our proposed MCOD performs well and outperforms eleven state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

29. Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data.

Author: Shao, Wei, Luo, Xiao, Zhang, Zuoyi, Han, Zhi, Chandrasekaran, Vasu, Turzhitsky, Vladimir, Bali, Vishal, Roberts, Anna R., Metzger, Megan, Baker, Jarod, La Rosa, Carmen, Weaver, Jessica, Dexter, Paul, and Huang, Kun
Subjects: DEEP learning, MACHINE learning, COUGH, ELECTRONIC health records, SUPERVISED learning
Abstract: Background: Chronic cough affects approximately 10% of adults. The lack of ICD codes for chronic cough makes it challenging to apply supervised learning methods to predict the characteristics of chronic cough patients, thereby requiring the identification of chronic cough patients by other mechanisms. We developed a deep clustering algorithm with auto-encoder embedding (DCAE) to identify clusters of chronic cough patients based on data from a large cohort of 264,146 patients from the Electronic Medical Records (EMR) system. We constructed features using the diagnosis within the EMR, then built a clustering-oriented loss function directly on embedded features of the deep autoencoder to jointly perform feature refinement and cluster assignment. Lastly, we performed statistical analysis on the identified clusters to characterize the chronic cough patients compared to the non-chronic cough patients. Results: The experimental results show that the DCAE model generated three chronic cough clusters and one non-chronic cough patient cluster. We found various diagnoses, medications, and lab tests highly associated with chronic cough patients by comparing the chronic cough cluster with the non-chronic cough cluster. Comparison of chronic cough clusters demonstrated that certain combinations of medications and diagnoses characterize some chronic cough clusters. Conclusions: To the best of our knowledge, this study is the first to test the potential of unsupervised deep learning methods for chronic cough investigation, which also shows a great advantage over existing algorithms for patient data clustering. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

30. Semi-Supervised Medical Image Classification Combined with Unsupervised Deep Clustering

Author: Bang Xiao and Chunyue Lu
Subjects: medical image classification, semi-supervised learning, unsupervised learning, deep clustering, overclustering, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: An effective way to improve the performance of deep neural networks in most computer vision tasks is to improve the quantity of labeled data and the quality of labels. However, in the analysis and processing of medical images, high-quality annotation depends on the experience and professional knowledge of experts, which makes it very difficult to obtain a large number of high-quality annotations. Therefore, we propose a new semi-supervised framework for medical image classification. It combines semi-supervised classification with unsupervised deep clustering. Spreading label information to unlabeled data by alternately running two tasks helps the model to extract semantic information from unlabeled data, and prevents the model from overfitting to a small amount of labeled data. Compared with current methods, our framework enhances the robustness of the model and reduces the influence of outliers. We conducted a comparative experiment on the public benchmark medical image dataset to verify our method. On the ISIC 2018 Dataset, our method surpasses other methods by more than 0.85% on AUC and 1.08% on Sensitivity. On the ICIAR BACH 2018 dataset, our method achieved 94.12% AUC, 77.92% F1-score, 77.69% Recall, and 78.16% Precision. The error rate is at least 1.76% lower than that of other methods. The result shows the effectiveness of our method in medical image classification.
Published: 2023
Full Text: View/download PDF

31. Deep Convolutional Center-Based Clustering

Author: Yan, Qinhong, Tang, Meihan, Chen, Weifu, Feng, Guocan, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lin, Zhouchen, editor, Wang, Liang, editor, Yang, Jian, editor, Shi, Guangming, editor, Tan, Tieniu, editor, Zheng, Nanning, editor, Chen, Xilin, editor, and Zhang, Yanning, editor
Published: 2019
Full Text: View/download PDF

32. Deep fair clustering with multi-level decorrelation.

Author: Wang, Xiang, Jing, Liping, Liu, Huafeng, Yu, Jian, Geng, Weifeng, and Ye, Gencheng
Subjects: *RACE, *STATISTICAL correlation, *GENDER
Abstract: Fair clustering aims to prevent sensitive attributes (e.g., race or gender) from dominating the clustering process. However, real-world datasets, often characterized by low quality and high dimensionality, restrict existing fair clustering methods from achieving satisfactory outcomes. Typically, these sensitive attributes are intricately intertwined with other attributes in high-dimensional continuous data, forming backgrounds or entities within the data. The integration results in a significant correlation of features and samples across different clusters, thereby hindering the model's ability to explore the intrinsic structure. To address these issues, we propose a novel fair clustering method that incorporates multi-level decorrelation constraints. Our goal is to extract inherent fair structural information under the interference of sensitive attributes, enhancing both the validity and fairness of the model. Specifically, we introduce a new cluster-wise similarity metric based on the partition correlation coefficient, which facilitates cluster-level decorrelation and captures cluster-discriminative properties. Furthermore, by incorporating softmax-formulated decorrelation at the sample-level and feature-level, we concurrently explore representations that favor fairness. These three components are seamlessly integrated into our clustering framework, yielding a more robust and confident data partition. Experiments conducted on six commonly-used datasets demonstrate the effectiveness of our proposed method. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Image deep clustering based on local-topology embedding.

Author: Pan, Jing, Qian, Yuhua, Li, Feijiang, and Guo, Qian
Subjects: *DATA augmentation, *ALGORITHMS, *GENERALIZATION
Abstract: • The feature representation consists of two parts: data and its local-topology information. • We propose a replacement strategy to find local-topology representation of data. • Data augmentation is applied to improve the generalization performance of the model. • A two-stage image deep clustering algorithm is presented based on local-topology embedding called ITEC. Reasonable feature representation plays an important role in improving the performance of clustering algorithms. However, recent deep clustering studies focus only on feature representation at the pixel level, which leads to feature representations with low discrimination. Our key insight is that considering local-topology information between images helps to obtain a highly discriminative representation. We therefore design a replacement strategy to find the local-topology representation of data, and propose a two-stage image deep clustering algorithm based on local-topology embedding called ITEC. Specifically, we take advantage of data augmentation to improve the generalization performance of the learning models; the local-topology representation of the data is then embedded into the representation of the data itself, so as to better complete image clustering tasks. Extensive experiments demonstrate that local-topology information significantly improves the performance of deep clustering. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

34. An overview on deep clustering.

Author: Wei, Xiuxi, Zhang, Zhihui, Huang, Huajuan, and Zhou, Yongquan
Subjects: *DEEP learning, *IMAGE processing
Abstract: In recent years, with the great success of deep learning and especially deep unsupervised learning, many deep architectural clustering methods, collectively known as deep clustering, have emerged. Deep clustering shows the potential to outperform traditional methods, especially in handling complex high-dimensional data, taking full advantage of deep learning. To achieve a comprehensive overview of the field of deep clustering, this review systematically explores deep clustering methods and their various applications. First, the basic network architecture of deep clustering is described in detail, including the common network frameworks and loss functions. Subsequently, deep clustering is divided into several categories based on the network architecture, and benchmark datasets and evaluation metrics in the field are introduced. Next, the real-world applications of deep clustering are explored in depth, providing successful cases in the fields of bioinformatics, medicine, anomaly detection, and image processing, highlighting the broad applicability of deep clustering in solving real-world challenges. Finally, the paper summarizes its contributions and explores potential directions for future research in deep clustering. • Analyzes the theory of deep clustering and its performance advantages and disadvantages. • Summarizes the improvements in deep clustering in recent years, analyzes their effect, and illustrates it with experimental data. • Introduces related application research on deep clustering in different fields, and summarizes and looks ahead to the improvement and development of deep clustering. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. An End-to-end Deep Clustering Method with Consistency and Complementarity Attention Mechanism for Multisensor Fault Diagnosis.

Author: Wu, Zhangjun, Fang, Gang, Wang, Yifei, and Xu, Renli
Subjects: PATTERN recognition systems, DEEP learning, FAULT diagnosis, DIAGNOSIS methods
Abstract: Deep clustering methods have found successful applications in single-sensor data fault diagnosis. However, most of these methods employ separate optimization strategies that overlook the interaction between feature learning and clustering. Moreover, conventional deep learning methods for fault diagnosis often disregard the consistent and complementary information inherent in the multisensor data, resulting in unsatisfactory multisensor fault diagnosis performance. In this study, we introduce a novel End-to-end Deep Clustering Method with Consistency and Complementarity Attention Mechanism, termed EDCM-CCAM, tailored for multisensor fault diagnosis. Firstly, multiple deep autoencoder networks are utilized to concurrently extract the deep representation features from various sensor inputs. Secondly, we introduce a Consistency and Complementarity Attention Mechanism (CCAM) to facilitate multisensor feature fusion, accompanied by the design of two distinct loss functions to fully exploit the consistent and complementary information within multisensor data. Finally, fault pattern recognition in multisensor data is accomplished through Kullback-Leibler (KL) divergence-based clustering, while a joint optimization strategy is employed to simultaneously optimize all components of the EDCM-CCAM. The efficacy of the proposed method is validated on a gearbox dataset, demonstrating superior performance in multisensor fault diagnosis compared to alternative methods. • The deep clustering method with joint optimization strategy is presented. • The new attention CCAM and two losses are designed for feature fusion. • The proposed method outperforms other comparative fault diagnosis methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
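
Note on entry 35: the abstract mentions Kullback-Leibler divergence-based clustering without giving the formula. The sketch below shows one common formulation (the DEC-style objective with Student's t soft assignments and a sharpened target distribution). Whether EDCM-CCAM uses exactly this form is an assumption; the function names and the toy embeddings are hypothetical.

```python
import math

def soft_assign(z, centers):
    """Student's t soft assignment q_ij of each embedding z_i to each center j."""
    q = []
    for zi in z:
        row = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, c))) for c in centers]
        total = sum(row)
        q.append([v / total for v in row])
    return q

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    f = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        w = [v * v / fj for v, fj in zip(row, f)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_clustering_loss(p, q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(pij * math.log(pij / qij)
               for pr, qr in zip(p, q) for pij, qij in zip(pr, qr))

# Toy 2-D embeddings forming two groups (illustrative values only).
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centers = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(embeddings, centers)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a joint optimization scheme of this kind, the loss would be minimized with respect to both the encoder weights and the cluster centers, so that confident assignments are reinforced over training.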

36. Deep Clustering with Convolutional Autoencoders

Author: Guo, Xifeng, Liu, Xinwang, Zhu, En, Yin, Jianping, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Liu, Derong, editor, Xie, Shengli, editor, Li, Yuanqing, editor, Zhao, Dongbin, editor, and El-Alfy, El-Sayed M., editor
Published: 2017
Full Text: View/download PDF

37. Deep subspace clustering to achieve jointly latent feature extraction and discriminative learning.

Author: Huang, Qinjian, Zhang, Yue, Peng, Hong, Dan, Tingting, Weng, Wanlin, and Cai, Hongmin
Subjects: *FEATURE extraction, *LEARNING modules, *ARTIFICIAL neural networks, *SUPERVISED learning, *SELF-expression
Abstract: Deep subspace clustering based on an autoencoder and a self-expression layer has become a popular clustering method. However, these models focus only on feature extraction by minimizing reconstruction error rather than on the specific clustering task, leading to unsatisfactory performance. To overcome this shortcoming, we propose a deep subspace clustering framework which jointly extracts features via the embedding neural network and performs subspace learning. The model contains three modules: an autoencoder, a self-expression layer and a supervised competitive feature learning module. The proposed model is highly capable of capturing characteristic features to guide the clustering task by using relative entropy to minimize the discrepancy between the probabilistic cluster assignments and the target variables. The three modules are consolidated to be jointly trained and optimized competitively. Experimental results on five benchmark datasets demonstrate the effectiveness of the proposed deep subspace clustering in comparison with eleven baseline methods. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

38. Contrastive learning for unsupervised medical image clustering and reconstruction

Author: Ferrante, Matteo, Boccato, Tommaso, Spasov, Simeon, Duggento, Andrea, and Toschi, Nicola
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, contrastive learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, General Medicine, deep clustering, patient stratification, unsupervised learning, Machine Learning (cs.LG)
Abstract: The lack of large labeled medical imaging datasets, along with significant inter-individual variability compared to clinically established disease classes, poses significant challenges in exploiting medical imaging information in a precision medicine paradigm, where in principle dense patient-specific data can be employed to formulate individual predictions and/or stratify patients into finer-grained groups which may follow more homogeneous trajectories and therefore empower clinical trials. In order to efficiently explore the effective degrees of freedom underlying variability in medical images in an unsupervised manner, in this work we propose an unsupervised autoencoder framework which is augmented with a contrastive loss to encourage high separability in the latent space. The model is validated on (medical) benchmark datasets. As cluster labels are assigned to each example according to cluster assignments, we compare performance with a supervised transfer learning baseline. Our method achieves similar performance to the supervised architecture, indicating that separation in the latent space reproduces expert medical observer-assigned labels. The proposed method could be beneficial for patient stratification, exploring new subdivisions of larger classes or pathological continua or, due to its sampling abilities in a variation setting, data augmentation in medical image processing.
Published: 2022

39. Automatic Deep Sparse Multi-Trial Vector-based Differential Evolution clustering with manifold learning and incremental technique.

Author: Hadikhani, Parham, Lai, Daphne Teck Ching, Ong, Wee-Hong, and Nadimi-Shahraki, Mohammad H.
Subjects: *MACHINE learning, *EVOLUTIONARY algorithms, *DIFFERENTIAL evolution, *FEATURE extraction, *ALGORITHMS
Abstract: • A novel deep evolutionary clustering method (ADSMTDE) to overcome clustering drawbacks. • Improving the auto-encoder by applying a sparsity constraint and manifold learning. • To enhance clustering, an evolutionary algorithm is adopted to optimize solutions. • Employing an incremental clustering technique to perform clustering dynamically. • ADSMTDE is competitive with or superior to the latest deep clustering methods. Most deep clustering methods, despite utilizing complex networks to learn better from data, use a shallow clustering method. These methods have difficulty finding good clusters because they cannot balance local search and global search to prevent premature convergence. In other words, they do not consider different aspects of the search, which causes them to get stuck in a local optimum. In addition, the majority of existing deep clustering approaches perform clustering with knowledge of the number of clusters, which is not practical in most real scenarios where such information is not available. To address these problems, this paper presents a novel automatic deep sparse clustering approach based on an evolutionary algorithm called Multi-Trial Vector-based Differential Evolution (MTDE). A sparse auto-encoder is first applied to extract embedded features. Manifold learning is then adopted to obtain a representation and extract the spatial structure of the features. Afterward, MTDE clustering is performed without prior information on the number of clusters to find the optimal clustering solution. The proposed approach was evaluated on various datasets, including images and time-series. The results demonstrate that the proposed method improved on MTDE by 18.94% on average and that, compared to the most recent deep clustering algorithms, it is consistently among the top three on the majority of datasets. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

40. A Classification of Arab Ethnicity Based on Face Image Using Deep Learning Approach

Author: Norah A. Al-Humaidan and Master Prince
Subjects: General Computer Science, Computer science, Feature extraction, 0211 other engineering and technologies, Arab, deep clustering, 02 engineering and technology, Machine learning, computer.software_genre, Convolutional neural network, Facial recognition system, 0202 electrical engineering, electronic engineering, information engineering, Profiling (information science), General Materials Science, Electrical and Electronic Engineering, Cluster analysis, convolutional neural network (CNN), 021110 strategic, defence & security studies, business.industry, Deep learning, General Engineering, deep learning, Support vector machine, ComputingMethodologies_PATTERNRECOGNITION, ethnicity, Unsupervised learning, 020201 artificial intelligence & image processing, lcsh:Electrical engineering. Electronics. Nuclear engineering, Artificial intelligence, business, lcsh:TK1-9971, computer
Abstract: Human faces and facial features have attracted a lot of attention from researchers and have recently become one of the most popular research topics. Features and information extracted from a person are known as soft biometrics; they have been used to improve recognition performance and enhance search engines for face images, which can be further applied in various fields such as law enforcement, surveillance videos, advertisement, and social media profiling. By observing relevant studies in the field, we noted a lack of mention of the Arab world and the absence of an Arab dataset as well. Therefore, our aim in this paper is to create an Arab dataset with proper labeling of Arab sub-ethnic groups, then classify these labels using deep learning approaches. The Arab image dataset that was created consists of three labels: Gulf Cooperation Council countries (GCC), the Levant, and Egyptian. Two types of learning were used to solve the problem. The first type is supervised deep learning (classification); a pre-trained Convolutional Neural Network (CNN) model was used, as CNN models have achieved state-of-the-art results in computer vision classification problems. The second type is unsupervised deep learning (deep clustering). The aim of using unsupervised learning is to explore the ability of such models in classifying ethnicities. To our knowledge, this is the first time deep clustering is used for ethnicity classification problems. For this, three methods were chosen. The best result of training a pre-trained CNN on the full Arab dataset then evaluating on a different dataset was 56.97%, and 52.12% when Arab dataset labels were balanced. The deep clustering methods, applied to different datasets, showed an ACC from 32% to 59%, and NMI and ARI results from zero to 0.2714 and 0.2543, respectively.
Published: 2021
Full Text: View/download PDF

41. Unsupervised discriminative feature learning via finding a clustering-friendly embedding space.

Author: Cao, Wenming, Zhang, Zhongfan, Liu, Cheng, Li, Rui, Jiao, Qianfen, Yu, Zhiwen, and Wong, Hau-San
Abstract: • We exploit the Siamese Network to find a clustering-friendly embedding space to mine highly-reliable pseudo-supervised information for the application of VAT and Conditional-GAN to synthesize cluster-specific samples in the setting of unsupervised learning. • We proposed adopting VAT to synthesize samples with different levels of perturbations that can enhance the robustness of Feature Extractor to noise and improve the lower-dimensional latent coding space discovered by the Feature Extractor. • We conducted experiments to verify that the latent space discovered by the Feature Extractor can facilitate the Siamese Network to find a clustering-friendly embedding space and extract pseudo-supervised information for VAT and Conditional-GAN. • The training of our EDCN involves the adversarial gaming between three players, which not only boosts performance improvement of the clustering but also preserves the cluster-specific information from the Siamese Network in synthesizing samples. In this paper, we propose an enhanced deep clustering network (EDCN), which is composed of a Feature Extractor, a Conditional Generator, a Discriminator and a Siamese Network. Specifically, we will utilize two kinds of generated data based on adversarial training, as well as the original data, to train the Feature Extractor for learning effective latent representations. In addition, we adopt the Siamese network to find an embedding space, where a better affinity similarity matrix is obtained as the key to success of spectral clustering in providing reliable pseudo-labels. Particularly, the obtained pseudo-labels will be used to generate realistic data by the Generator. Finally, the discriminator is used to model the real joint distribution of data and corresponding latent representations for Feature Extractor enhancement. 
To evaluate our proposed EDCN, we conduct extensive experiments on multiple data sets including MNIST, USPS, FRGC, CIFAR-10, STL-10, and Fashion-MNIST by comparing our method with a number of state-of-the-art deep clustering methods, and experimental results demonstrate its effectiveness and superiority. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

42. Clustering adverts using deep neural networks (Gručenje oglasov s pomočjo globokih nevronskih mrež)

Author: Džubur, Benjamin and Demšar, Jure
Subjects: image clustering, image segmentation, deep clustering, unsupervised learning, convolutional neural network
Abstract: In the advertising industry, understanding the parameters of different advertising campaigns is key for workflow optimization. One of these parameters is the number of master designs used to prepare image-based adverts, which is crucial for determining the complexity of a campaign. Adverts which originate from the same master design typically use similar typographies and graphical elements, and sometimes similar compositions. In the experimental part of this thesis, we develop two predictive pipelines for the task of predicting the number of master designs in sets of image-based adverts. Both pipelines use convolutional neural networks for feature extraction and clustering algorithms. The main difference between the two is in the way that the similarity between individual adverts is computed. Both developed models achieve better results than our baseline approach which, based on the distribution of data, randomly predicts the number of master designs. Our predictive models achieve a 5.2% and 1.2% classification accuracy improvement respectively over the baseline when tested on a sample of 50 campaigns. Our second model, which is based on the similarity of regions between adverts, achieves qualitatively better results than our first model, which is based on simple comparisons of the adverts' features.
Published: 2020

43. Deep Clustering with Concrete K-Means

Author: Timothy M. Hospedales, Yongxin Yang, Henry Gouk, and Boyan Gao
Subjects: business.industry, Computer science, Feature extraction, k-means clustering, Unsupervised Learning, Estimator, Contrast (statistics), Pattern recognition, Gradient Estimation, Deep Clustering, Feature (computer vision), Unsupervised learning, Artificial intelligence, business, Representation (mathematics), Cluster analysis
Abstract: We address the problem of simultaneously learning a k-means clustering and deep feature representation from unlabelled data, which is of interest due to the potential for deep k-means to outperform traditional two-step feature extraction and shallow clustering strategies. We achieve this by developing a gradient estimator for the non-differentiable k-means objective via the Gumbel-Softmax reparameterisation trick. In contrast to previous attempts at deep clustering, our concrete k-means model can be optimised with respect to the canonical k-means objective and is easily trained end-to-end without resorting to time consuming alternating optimisation techniques. We demonstrate the efficacy of our method on standard clustering benchmarks.
Published: 2020
Full Text: View/download PDF
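
Note on record 43: the abstract describes relaxing the hard k-means assignment with the Gumbel-Softmax trick so that the objective can be optimised end-to-end. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the use of negative squared distances as logits, and the temperature parameter are assumptions made for illustration, and the sketch only evaluates the relaxed objective without computing gradients.

```python
import math
import random


def squared_distances(x, centroids):
    """Squared Euclidean distance from point x to every centroid."""
    return [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]


def gumbel_softmax(logits, temperature, rng):
    """Draw a soft one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logarithms are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_kmeans_loss(points, centroids, temperature, rng):
    """Mean k-means cost with hard assignments replaced by Gumbel-Softmax samples.

    Assumption for this sketch: the assignment logits are the negative
    squared distances to the centroids.
    """
    loss = 0.0
    for x in points:
        d = squared_distances(x, centroids)
        a = gumbel_softmax([-v for v in d], temperature, rng)
        loss += sum(ai * di for ai, di in zip(a, d))
    return loss / len(points)


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    centroids = [(0.0, 0.0), (5.0, 5.0)]
    print(relaxed_kmeans_loss(points, centroids, temperature=0.1, rng=rng))
```

As the temperature approaches zero the sampled assignments approach one-hot vectors and the relaxed cost approaches the ordinary k-means cost. In an automatic-differentiation framework the same expression would be differentiable with respect to both the centroids and the features that produce the points.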

44. Point Symmetry-based Deep Clustering

Author: Jose G. Moreno, Recherche d’Information et Synthèse d’Information (IRIT-IRIS), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), and Université Fédérale Toulouse Midi-Pyrénées
Subjects: symmetry-based distances, Computer science, Point symmetry, 02 engineering and technology, deep clustering, Ellipse, unsupervised learning, 01 natural sciences, Euclidean distance, 0103 physical sciences, Euclidean geometry, 0202 electrical engineering, electronic engineering, information engineering, Unsupervised learning, 020201 artificial intelligence & image processing, [INFO]Computer Science [cs], Symmetry (geometry), 010306 general physics, Cluster analysis, Algorithm
Abstract: Clustering is a central task in unsupervised learning. Recent advances that perform clustering on learned deep features (such as DEC [14], IDEC [6] or VaDe [10]) have shown improvements over classical algorithms, but most of them are based on the Euclidean distance. Moreover, symmetry-based distances have been shown to be a powerful tool for distinguishing symmetric shapes such as circles, ellipses and squares. This paper presents an adaptation of symmetry-based distances to deep clustering algorithms, named SymDEC. Our results show that the proposed strategy significantly outperforms existing Euclidean-based deep clustering as well as recent symmetry-based algorithms on several of the synthetic symmetric and UCI datasets studied.
Published: 2018
Full Text: View/download PDF

45. Deep clustering by maximizing mutual information in variational auto-encoder.

Author: Xu, Chaoyang, Dai, Yuanfei, Lin, Renjie, and Wang, Shiping
Subjects: *COMPUTER vision, *KEY performance indicators (Management), *MACHINE learning, *DEEP learning
Abstract: Unsupervised clustering, which is extensively employed in deep learning and computer vision as a fundamental technique, has attracted much attention in recent years. Deep embedding clustering often uses auto-encoders to learn representations for clustering. However, auto-encoders tend to corrupt the learned representations when simultaneously learning embedded representations and performing clustering. In this paper, we propose Deep Clustering via Variational Auto-Encoder (DC-VAE), a model based on mutual information maximization. First, we formulate the deep clustering problem as learning soft cluster assignments within the framework of the variational auto-encoder. Second, we impose mutual information maximization on the observed data and the representations to prevent soft cluster assignments from distorting the learned representations. Third, we derive a new generalized evidence lower bound objective related to several previous models and introduce parameters to balance learning informative representations and clustering. It is shown that the proposed model can significantly boost clustering performance by learning effective and reliable representations for downstream machine learning tasks. Through experimental results on several datasets, we demonstrate that the proposed model is competitive with existing state-of-the-art methods on multiple performance metrics. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
