1. Leveraging the Cell Ontology to classify unseen cell types
- Author
Angela Oliveira Pisco, Russ B. Altman, Marinka Zitnik, Jure Leskovec, Aaron McGeever, Spyros Darmanis, Jim Karkanias, Sheng Wang, and Maria Brbic
- Subjects
Cell type; Computer science; Science; Cell; Datasets as Topic; Gene Expression; General Physics and Astronomy; Ontology (information science); Machine learning; computer.software_genre; Article; General Biochemistry, Genetics and Molecular Biology; Terminology; Annotation; Software; Terminology as Topic; Controlled vocabulary; medicine; Animals; Humans; Cell Lineage; natural sciences; Network topology; Multidisciplinary; business.industry; General Chemistry; Eukaryotic Cells; medicine.anatomical_structure; Vocabulary, Controlled; Key (cryptography); Artificial intelligence; business; computer; Algorithms; Biomarkers; psychological phenomena and processes
- Abstract
Single-cell technologies are rapidly generating large amounts of data that enable us to understand biological systems at single-cell resolution. However, joint analysis of datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types that are part of the controlled vocabulary that forms the Cell Ontology. A key advantage of OnClass is its capability to classify cells into cell types not present in the training data because it uses the Cell Ontology graph to infer cell type relationships. Furthermore, OnClass can be used to identify marker genes for all the Cell Ontology categories, regardless of whether the cell types are present or absent in the training data. This suggests that OnClass goes beyond a simple annotation tool for single-cell datasets: it is the first algorithm capable of identifying marker genes specific to all terms of the Cell Ontology, and it offers the possibility of refining the Cell Ontology using a data-centric approach.
Classifying cells into unseen cell types remains challenging in scRNA-seq analysis. Here we show that the Cell Ontology enables accurate classification of unseen cell types by considering the cell type relationships in the Cell Ontology graph.
- Published
- 2021