
Triku: a feature selection method based on nearest neighbors for single-cell data.

Authors :
Ascensión, Alex M
Ibáñez-Solé, Olga
Inza, Iñaki
Izeta, Ander
Araúzo-Bravo, Marcos J
Source :
GigaScience; 2022, Vol. 11, p1-16, 16p
Publication Year :
2022

Abstract

Background: Feature selection is a relevant step in the analysis of single-cell RNA sequencing datasets. Most of the current feature selection methods are based on general univariate descriptors of the data such as the dispersion or the percentage of zeros. Despite the use of correction methods, the generality of these feature selection methods biases the genes selected towards highly expressed genes, instead of the genes defining the cell populations of the dataset.

Results: Triku is a feature selection method that favors genes defining the main cell populations. It does so by selecting genes expressed by groups of cells that are close in the k-nearest neighbor graph. The expression of these genes is higher than the expected expression if the k cells were chosen at random. Triku efficiently recovers cell populations present in artificial and biological benchmarking datasets, based on adjusted Rand index, normalized mutual information, supervised classification, and silhouette coefficient measurements. Additionally, gene sets selected by triku are more likely to be related to relevant Gene Ontology terms and contain fewer ribosomal and mitochondrial genes.

Conclusion: Triku is developed in Python 3 and is available at https://github.com/alexmascension/triku. [ABSTRACT FROM AUTHOR]
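
The idea stated in the Results (a gene is informative when its expression concentrates in cells that are neighbors in the k-nearest neighbor graph, more than it would if the neighbors were picked at random) can be illustrated with a small sketch. This is not triku's implementation or API: the function names (`knn_indices`, `neighborhood_sums`, `locality_score`), the toy coordinates, the variance-based score, and the shuffle-based null are all assumptions made for illustration, using only the Python standard library.

```python
import random
from statistics import mean, pvariance


def knn_indices(coords, k):
    """Brute-force k nearest neighbors (squared Euclidean), excluding the cell itself."""
    neighbors = []
    for i, p in enumerate(coords):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(coords)
            if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def neighborhood_sums(expr, neighbors):
    """Expression of each cell plus the expression of its k neighbors."""
    return [expr[i] + sum(expr[j] for j in nbrs) for i, nbrs in enumerate(neighbors)]


def locality_score(expr, neighbors, n_shuffles=200, seed=0):
    """Hypothetical score: variance of the observed neighborhood sums minus the
    average variance obtained when expression values are shuffled across cells
    (a stand-in for 'neighbors chosen at random')."""
    rng = random.Random(seed)
    observed = pvariance(neighborhood_sums(expr, neighbors))
    shuffled = list(expr)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(pvariance(neighborhood_sums(shuffled, neighbors)))
    return observed - mean(null)


# Toy data (assumed): two well-separated groups of 10 cells each.
coords = [(i * 0.1, 0.0) for i in range(10)] + [(10 + i * 0.1, 0.0) for i in range(10)]
neighbors = knn_indices(coords, k=3)

marker = [5] * 10 + [0] * 10   # expressed only in the first group
scattered = [5, 0] * 10        # same total expression, spread over both groups

marker_score = locality_score(marker, neighbors)
scattered_score = locality_score(scattered, neighbors)
print(marker_score > scattered_score)
```

In this toy setting both genes have the same total expression, so a purely univariate descriptor would not separate them; only the neighborhood-aware score ranks the group-specific gene higher.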

Details

Language :
English
ISSN :
2047-217X
Volume :
11
Database :
Complementary Index
Journal :
GigaScience
Publication Type :
Academic Journal
Accession number :
170084451
Full Text :
https://doi.org/10.1093/gigascience/giac017