1. RankCompV3: a differential expression analysis algorithm based on relative expression orderings and applications in single-cell RNA transcriptomics.
- Author
- Yan, Jing; Zeng, Qiuhong; Wang, Xianlong
- Subjects
- *GENE expression, *TRANSCRIPTOMES, *RNA sequencing, *CONTINGENCY tables, *SOURCE code
- Abstract
Background: Effective identification of differentially expressed genes (DEGs) has been challenging for single-cell RNA sequencing (scRNA-seq) profiles. Many existing algorithms have high false positive rates (FPRs) and often fail to identify weak biological signals.

Results: We present a novel method for identifying DEGs in scRNA-seq data called RankCompV3. It is based on the comparison of relative expression orderings (REOs) of gene pairs, which are determined by comparing the expression levels of a pair of genes in a set of single-cell profiles. For each gene of interest, the numbers of genes with consistently higher or lower expression levels than that gene are counted in each of the two groups being compared, and the result is tabulated in a 3 × 3 contingency table, which is tested by McCullagh's method to determine whether the gene is dysregulated. In both simulated and real scRNA-seq data, RankCompV3 tightly controlled the FPR and demonstrated high accuracy, outperforming 11 other common single-cell DEG detection algorithms. Analysis with either regular single-cell or synthetic pseudo-bulk profiles produced DEGs that were highly concordant with the ground truth. In addition, RankCompV3 demonstrates higher sensitivity to weak biological signals than other methods. The algorithm was implemented in Julia and can be called from R. The source code is available at https://github.com/pathint/RankCompV3.jl.

Conclusions: The REOs-based algorithm is a valuable tool for analyzing single-cell RNA profiles and identifying DEGs with high accuracy and sensitivity.

Key points: RankCompV3 is a method for identifying differentially expressed genes (DEGs) in either bulk or single-cell RNA transcriptomics. It is based on the counts of relative expression orderings (REOs) of gene pairs in the two groups. The contingency tables are tested using McCullagh's method. RankCompV3 has comparable or better performance than other conventional methods. It has been shown to be effective in identifying DEGs in both single-cell and pseudo-bulk profiles. A pseudo-bulk method is implemented in RankCompV3, which allows the method to achieve higher computational efficiency and improves the concordance with the bulk ground truth. RankCompV3 is effective in identifying functionally relevant DEGs in weak-signal datasets. The method is not biased towards highly expressed genes. [ABSTRACT FROM AUTHOR]
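
The REO counting step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Julia implementation: the function names `reo_state` and `contingency_table`, the genes-by-cells list layout, and the rule that an ordering is "consistent" when it holds in at least 90% of cells are all assumptions made for illustration. McCullagh's test on the resulting table is not implemented here.

```python
from typing import List, Sequence

def reo_state(target: Sequence[float], partner: Sequence[float],
              threshold: float = 0.9) -> int:
    """Classify the ordering of a partner gene relative to a target gene.

    Returns 0 if the partner is higher than the target in at least
    `threshold` of the cells, 2 if it is lower in at least `threshold`
    of the cells, and 1 otherwise (no stable ordering).
    The threshold rule is an assumption, not the published criterion.
    """
    n = len(target)
    higher = sum(1 for t, p in zip(target, partner) if p > t)
    lower = sum(1 for t, p in zip(target, partner) if p < t)
    if higher / n >= threshold:
        return 0
    if lower / n >= threshold:
        return 2
    return 1

def contingency_table(group_a: List[Sequence[float]],
                      group_b: List[Sequence[float]],
                      gene: int, threshold: float = 0.9) -> List[List[int]]:
    """Build a 3 x 3 table for one gene of interest.

    Rows index the REO state in group A, columns the state in group B.
    Both inputs are lists of per-gene expression vectors (genes x cells).
    """
    table = [[0] * 3 for _ in range(3)]
    for j in range(len(group_a)):
        if j == gene:
            continue  # a gene is not paired with itself
        row = reo_state(group_a[gene], group_a[j], threshold)
        col = reo_state(group_b[gene], group_b[j], threshold)
        table[row][col] += 1
    return table

# Toy data: 4 genes, 3 cells per group (values are made up).
group_a = [[5, 5, 5], [9, 8, 7], [1, 2, 1], [4, 6, 5]]
group_b = [[5, 5, 5], [9, 9, 9], [7, 8, 9], [1, 1, 1]]
table = contingency_table(group_a, group_b, gene=0)
```

In this sketch, off-diagonal cells count partner genes whose ordering relative to the gene of interest differs between the two groups; according to the abstract, it is this table that RankCompV3 tests with McCullagh's method.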
- Published
- 2024