NEBULA is a fast negative binomial mixed model for differential or co-expression analysis of large-scale multi-subject single-cell data
- Source :
- Communications Biology, Vol 4, Iss 1, Pp 1-17 (2021), Communications Biology
- Publication Year :
- 2021
- Publisher :
- Nature Portfolio, 2021.
Abstract
- The increasing availability of single-cell data revolutionizes the understanding of biological mechanisms at cellular resolution. For differential expression analysis in multi-subject single-cell data, negative binomial mixed models account for both subject-level and cell-level overdispersions, but are computationally demanding. Here, we propose an efficient NEgative Binomial mixed model Using a Large-sample Approximation (NEBULA). The speed gain is achieved by analytically solving high-dimensional integrals instead of using the Laplace approximation. We demonstrate that NEBULA is orders of magnitude faster than existing tools and controls false-positive errors in marker gene identification and co-expression analysis. Using NEBULA in Alzheimer’s disease cohort data sets, we found that the cell-level expression of APOE correlated with that of other genetic risk factors (including CLU, CST3, TREM2, C1q, and ITM2B) in a cell-type-specific pattern and an isoform-dependent manner in microglia. NEBULA opens up a new avenue for the broad application of mixed models to large-scale multi-subject single-cell data.
- The application of negative binomial mixed models (NBMMs) to single-cell data is computationally demanding. To address this issue, Liang He et al. have developed NEBULA, an efficient algorithm that can analyze differential gene expression or co-expression networks in multi-subject single-cell data sets, and validate it on snRNA-seq and scRNA-seq data sets comprising ~200k cells from cohorts of Alzheimer’s disease and multiple sclerosis patients.
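For orientation, a generic negative binomial mixed model of the kind the abstract refers to can be sketched as below. The notation is assumed for illustration and does not come from this record: $y_{ij}$ is the count for cell $j$ of subject $i$, $s_{ij}$ a library-size offset, $x_{ij}$ the covariates, $u_i$ a subject-level random intercept, and $\phi$ the cell-level dispersion. A normal random intercept is used here for simplicity; this is not necessarily the parametrization NEBULA itself adopts.

```latex
\begin{align}
y_{ij} \mid u_i &\sim \mathrm{NB}(\mu_{ij}, \phi), \\
\log \mu_{ij} &= \log s_{ij} + x_{ij}^{\top}\beta + u_i,
  \qquad u_i \sim \mathcal{N}(0, \sigma^2), \\
\operatorname{Var}(y_{ij} \mid u_i) &= \mu_{ij} + \mu_{ij}^{2} / \phi, \\
L(\beta, \sigma^2, \phi) &= \prod_{i} \int \prod_{j}
  f_{\mathrm{NB}}(y_{ij};\, \mu_{ij}, \phi)\,
  \varphi(u_i;\, 0, \sigma^2)\, \mathrm{d}u_i .
\end{align}
```

In this sketch, $\sigma^2$ carries the subject-level overdispersion and $1/\phi$ the cell-level overdispersion. The integral over $u_i$ in the marginal likelihood generally has no closed form and must be evaluated for every gene and subject, which is the computational burden the abstract describes.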
- Subjects :
- 0301 basic medicine
Mixed model
Computer science
QH301-705.5
genetic processes
Negative binomial distribution
Gene Expression
Medicine (miscellaneous)
Scale (descriptive set theory)
Article
General Biochemistry, Genetics and Molecular Biology
03 medical and health sciences
Apolipoproteins E
0302 clinical medicine
Alzheimer Disease
Humans
Computational models
Biology (General)
Transcriptomics
Computational model
Nebula
Models, Statistical
Gene Expression Profiling
Computational Biology
Alzheimer's disease
Expression (mathematics)
Binomial Distribution
030104 developmental biology
Orders of magnitude (time)
Laplace's method
Microglia
Single-Cell Analysis
General Agricultural and Biological Sciences
Algorithm
030217 neurology & neurosurgery
Details
- Language :
- English
- ISSN :
- 23993642
- Volume :
- 4
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- Communications Biology
- Accession number :
- edsair.doi.dedup.....ff4dac10de7fd6e88490469bf37d6080