Sc-GPE: A Graph Partitioning-Based Cluster Ensemble Method for Single-Cell

Authors :
Xiaoshu Zhu
Jian Li
Hong-Dong Li
Miao Xie
Jianxin Wang
Source :
Frontiers in Genetics, Vol 11 (2020)
Publication Year :
2020
Publisher :
Frontiers Media S.A., 2020.

Abstract

Clustering is an efficient way to analyze single-cell RNA sequencing data. It is commonly used to identify cell types, which helps in understanding cell differentiation processes. However, different single-cell clustering methods can produce different, sometimes conflicting, results, so biologists often struggle to obtain the correct clustering and to interpret its biological significance. A cluster ensemble strategy can be an effective solution to this problem. Because graph partitioning-based clustering methods are good at clustering single-cell data, we developed Sc-GPE, a novel cluster ensemble method that combines five single-cell graph partitioning-based clustering methods: SNN-cliq, PhenoGraph, SC3, SSNN-Louvain, and MPGS-Louvain. In Sc-GPE, a consensus matrix is constructed from the five clustering solutions by calculating the probability that each pair of cells is assigned to the same cluster. This avoids a problem of the hypergraph-based ensemble approach: each individual clustering method assigns its own cluster labels, and it is difficult to find the corresponding cluster labels across all methods. Then, to reflect the different importance of each method in the ensemble, a weighted consensus matrix is constructed using an importance score strategy. Finally, hierarchical clustering is performed on the weighted consensus matrix to cluster the cells. To evaluate performance, we compared Sc-GPE with the individual clustering methods and the state-of-the-art SAME-clustering on 12 single-cell RNA-seq datasets. The results show that Sc-GPE obtained the best average performance and achieved the highest NMI and ARI values on five datasets.
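
The pipeline described in the abstract (pairwise consensus matrix, per-method weighting, hierarchical clustering) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names `weighted_consensus` and `average_linkage`, the choice of average linkage, and the toy labelings are all assumptions made for illustration. The abstract does not specify the importance score, so the per-method weights are simply a caller-supplied list (uniform by default).

```python
from itertools import combinations


def weighted_consensus(labelings, weights=None):
    """Weighted co-association matrix.

    Entry [i][j] is the weighted share of labelings that place cells i and j
    in the same cluster. Only label equality within one labeling is used, so
    cluster label names never need to be matched across methods.
    `weights` is a hypothetical stand-in for the paper's importance scores.
    """
    n = len(labelings[0])
    if weights is None:
        weights = [1.0] * len(labelings)  # assumption: uniform importance
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(labelings, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def average_linkage(consensus, k):
    """Agglomerative clustering on the distance 1 - consensus.

    Repeatedly merges the two clusters with the smallest average pairwise
    distance until k clusters remain. Average linkage is an assumption here.
    """
    n = len(consensus)
    clusters = [[i] for i in range(n)]

    def dist(a, b):
        return sum(1.0 - consensus[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


# Toy input: three hypothetical clusterings of six cells. The second uses
# different label names for the same partition as the first.
labelings = [
    [0, 0, 0, 1, 1, 1],
    ["b", "b", "b", "a", "a", "a"],
    [0, 0, 1, 1, 2, 2],
]
consensus = weighted_consensus(labelings)
labels = average_linkage(consensus, k=2)
```

Because the consensus matrix depends only on whether two cells share a label within each individual clustering, the mismatch between `0`/`1` and `"b"`/`"a"` in the toy input has no effect on the result.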

Details

Language :
English
ISSN :
1664-8021
Volume :
11
Database :
Directory of Open Access Journals
Journal :
Frontiers in Genetics
Publication Type :
Academic Journal
Accession number :
edsdoj.ba71f5cd43364e55a274875463d97571
Document Type :
article
Full Text :
https://doi.org/10.3389/fgene.2020.604790