scClassify: sample size estimation and multiscale classification of cells using single and multiple reference
- Source :
- Molecular Systems Biology, Vol 16, Iss 6 (2020)
- Publication Year :
- 2020
- Publisher :
- John Wiley and Sons Inc., 2020.
Abstract
- Automated cell type identification is a key computational challenge in single‐cell RNA‐sequencing (scRNA‐seq) data. To capitalise on the large collection of well‐annotated scRNA‐seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single‐cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state‐of‐the‐art methodology in automated cell type identification from scRNA‐seq data.
- scClassify is a multiscale classification framework based on ensemble learning and cell type hierarchies, enabling sample size estimation required for accurate cell type classification and joint classification of cells using multiple references.
- Subjects :
- Medicine (General)
QH301-705.5
Cells
Method
Methods & Resources
Biology
single‐cell
Chromatin, Epigenetics, Genomics & Functional Genomics
General Biochemistry, Genetics and Molecular Biology
Machine Learning
03 medical and health sciences
Mice
R5-920
0302 clinical medicine
cell type identification
Methods
Animals
Cluster Analysis
Humans
multiscale classification
Biology (General)
Pancreas
030304 developmental biology
0303 health sciences
Hierarchy
General Immunology and Microbiology
sample size estimation
business.industry
Applied Mathematics
Computational Biology
Pattern recognition
Ensemble learning
Visualization
Identification (information)
Computational Theory and Mathematics
Databases as Topic
Sample size determination
Sample Size
Scalability
Key (cryptography)
cell type hierarchy
Leukocytes, Mononuclear
Artificial intelligence
General Agricultural and Biological Sciences
business
030217 neurology & neurosurgery
Software
Information Systems
Test data
Details
- Language :
- English
- ISSN :
- 17444292
- Volume :
- 16
- Issue :
- 6
- Database :
- OpenAIRE
- Journal :
- Molecular Systems Biology
- Accession number :
- edsair.doi.dedup.....7fdc0a70945211c0d19fff24594b4a38