Back to Search Start Over

scaDA: A novel statistical method for differential analysis of single-cell chromatin accessibility sequencing data.

Authors :
Zhao, Fengdi
Ma, Xin
Yao, Bing
Lu, Qing
Chen, Li
Source :
PLoS Computational Biology; 8/2/2024, Vol. 20 Issue 8, p1-28, 28p
Publication Year :
2024

Abstract

Single-cell ATAC-seq sequencing data (scATAC-seq) has been widely used to investigate chromatin accessibility on the single-cell level. One important application of scATAC-seq data analysis is differential chromatin accessibility (DA) analysis. However, the data characteristics of scATAC-seq such as excessive zeros and large variability of chromatin accessibility across cells impose a unique challenge for DA analysis. Existing statistical methods focus on detecting the mean difference of the chromatin accessible regions while overlooking the distribution difference. Motivated by real data exploration that distribution difference exists among cell types, we introduce a novel composite statistical test named "scaDA", which is based on zero-inflated negative binomial model (ZINB), for performing differential distribution analysis of chromatin accessibility by jointly testing the abundance, prevalence and dispersion simultaneously. Benefiting from both dispersion shrinkage and iterative refinement of mean and prevalence parameter estimates, scaDA demonstrates its superiority to both ZINB-based likelihood ratio tests and published methods by achieving the highest power and best FDR control in a comprehensive simulation study. In addition to demonstrating the highest power in three real sc-multiome data analyses, scaDA successfully identifies differentially accessible regions in microglia from sc-multiome data for an Alzheimer's disease (AD) study that are most enriched in GO terms related to neurogenesis and the clinical phenotype of AD, and AD-associated GWAS SNPs. Author summary: Understanding the cis-regulatory elements that control the fundamental gene regulatory process is important to basic biology. scATAC-seq data offers an unprecedented opportunity to investigate chromatin accessibility on the single-cell level and explore cell heterogeneity to reveal the dynamic changes of cis-regulatory elements among different cell types. To understand the dynamic change of gene regulation using scATAC-seq data, differential chromatin accessibility (DA) analysis, which is one of the most fundamental analyses for scATAC-seq data, can enable the identification of differentially accessible regions between cell types or between multiple conditions. Subsequently, DA analysis has many applications such as identifying cell type-specific chromatin accessible regions to reveal the cell type-specific gene regulatory program, assessing disease-associated changes in chromatin accessibility to detect potential biomarkers, and linking differentially accessible regions to differentially expressed genes for building a comprehensive gene regulatory map. This paper proposes a novel statistical method named "scaDA" to improve the detection of differentially accessible regions by performing differential distribution analysis. scaDA is believed to benefit the research community of single-cell genomics. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1553734X
Volume :
20
Issue :
8
Database :
Complementary Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
178815339
Full Text :
https://doi.org/10.1371/journal.pcbi.1011854