1. A rank-based algorithm of differential expression analysis for small cell line data with statistical control
- Author
- Xu Lin, Lu Ao, Zheng Guo, Xiangyu Li, Jun He, Xianlong Wang, Hao Cai, Qingzhou Guan, Yunyan Gu, Lishuang Qi, and You Guo
- Subjects
0301 basic medicine ,Paper ,differentially expressed genes ,technical replicates ,Statistical power ,within-sample relative expression orderings ,03 medical and health sciences ,0302 clinical medicine ,Cell Line, Tumor ,Neoplasms ,Humans ,Molecular Biology ,Gene ,Mathematics ,Oligonucleotide Array Sequence Analysis ,Gene knockdown ,Gene Expression Profiling ,Rank (computer programming) ,Statistical process control ,Fold change ,Expression (mathematics) ,Gene Expression Regulation, Neoplastic ,small-scale cell line data ,030104 developmental biology ,030220 oncology & carcinogenesis ,Data Interpretation, Statistical ,Significance analysis of microarrays ,Algorithm ,Algorithms ,Information Systems
- Abstract
Small-scale cell line experiments usually have only two or three technical replicates for each state. For detecting differentially expressed genes (DEGs) in such data, commonly used statistical methods such as significance analysis of microarrays (SAM), limma and RankProd (RP) lack statistical power, while the fold change method lacks any statistical control. In this study, we demonstrated that the within-sample relative expression orderings (REOs) of gene pairs were highly stable among technical replicates of a cell line but often widely disrupted after certain treatments such as gene knockdown, gene transfection and drug treatment. Based on this finding, we customized the RankComp algorithm, previously designed for individualized differential expression analysis through REO comparison, to identify DEGs with a degree of statistical control in small-scale cell line data. In both simulated and real data, the new algorithm, named CellComp, exhibited high precision with much higher sensitivity than the original RankComp, SAM, limma and RP methods. Therefore, CellComp provides an efficient tool for analyzing small-scale cell line data.
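The REO idea in the abstract can be illustrated with a minimal sketch. It is not the CellComp or RankComp algorithm and applies no statistical control; it only finds gene pairs whose ordering is consistent across all technical replicates of each state and reports the pairs whose ordering flips after treatment. The function names `stable_reos` and `reversed_pairs` and the toy expression values are assumptions made for illustration.

```python
from itertools import combinations


def stable_reos(replicates):
    """Map each gene pair (i, j) to +1 or -1 if its ordering is the same in
    every replicate (+1: gene i > gene j, -1: gene i < gene j).
    Pairs with ties or inconsistent orderings are left out."""
    n_genes = len(replicates[0])
    stable = {}
    for i, j in combinations(range(n_genes), 2):
        # Sign of (expression_i - expression_j) in each replicate.
        signs = {(r[i] > r[j]) - (r[i] < r[j]) for r in replicates}
        if len(signs) == 1:
            sign = signs.pop()
            if sign != 0:
                stable[(i, j)] = sign
    return stable


def reversed_pairs(control, treated):
    """Gene pairs that are stable in both states but in opposite directions."""
    before = stable_reos(control)
    after = stable_reos(treated)
    return sorted(
        pair for pair, sign in before.items()
        if pair in after and after[pair] == -sign
    )


# Toy data (hypothetical): rows are technical replicates, columns are genes.
control = [[10.0, 5.0, 1.0], [11.0, 6.0, 2.0]]
treated = [[10.0, 0.5, 1.0], [12.0, 0.4, 2.0]]
flipped = reversed_pairs(control, treated)
```

In this toy example only the pair of genes 1 and 2 changes its ordering, which is consistent with gene 1 being down-regulated. Deciding which gene in a reversed pair is actually differentially expressed, and with what confidence, is the part the published method handles and this sketch does not.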
- Published
- 2017