1. Integrated analysis of multimodal single-cell data
- Author
Marlon Stoeckius, Shiwei Zheng, Stephanie Hao, Rahul Satija, Lamar M. Fleming, Maddie Jane Lee, Eleni P. Mimitou, Raphael Gottardo, Erica Andersen-Nissen, Jaison Jain, Peter Smibert, Juliana M. McElrath, Andrew Butler, Avi Srivastava, Catherine A. Blish, William M. Mauck, Yuhan Hao, Aaron J. Wilk, Bertrand Z. Yeung, Michael Zager, Efthymia Papalexi, Paul Hoffman, Angela J. Rogers, Charlotte A. Darby, and Tim Stuart
- Subjects
Resource; Coronavirus disease 2019 (COVID-19); Computer science; Genomics; Computational biology; Machine learning; computer.software_genre; Data type; General Biochemistry, Genetics and Molecular Biology; Cell Line; 03 medical and health sciences; Mice; 0302 clinical medicine; Immune system; Single-cell analysis; Multimodal analysis; Leverage (statistics); Animals; Humans; Lymphocytes; single cell genomics; 030304 developmental biology; 0303 health sciences; multimodal analysis; biology; business.industry; SARS-CoV-2; Sequence Analysis, RNA; Gene Expression Profiling; Vaccination; Immunity; COVID-19; T cell; reference mapping; Construct (python library); 3T3 Cells; immune system; CITE-seq; Identity (object-oriented programming); biology.protein; Leukocytes, Mononuclear; Artificial intelligence; Antibody; Single-Cell Analysis; business; Transcriptome; computer; 030217 neurology & neurosurgery
- Abstract
Summary: The simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce “weighted-nearest neighbor” analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.

Highlights
• “Weighted nearest neighbor” analysis integrates multimodal single-cell data
• A multimodal reference “atlas” of the circulating human immune system
• Identification and validation of novel sources of lymphoid heterogeneity
• “Reference-based” mapping of query datasets onto a multimodal atlas

A framework that allows for the integration of multiple data types using single cells is applied to understand distinct immune cell states, previously unidentified immune populations, and to interpret immune responses to vaccinations.
- Published
- 2020