1. Assessing RNA-Seq Workflow Methodologies Using Shannon Entropy.
- Author
- Carels, Nicolas
- Subjects
- *UNCERTAINTY (Information theory), *OVERALL survival, *BIOLOGICAL systems, *ARRAY processing, *STATISTICAL correlation
- Abstract
Simple Summary: We show how the relationship between 5-year overall survival and the sub-network entropy of malignant up-regulated genes, measured across twelve cancer types spanning the entire spectrum of survival rates, can serve as a benchmark for optimizing RNA-seq workflows. We assessed the Shannon entropy of sub-networks formed by malignant up-regulated genes under several RNA-seq workflow approaches, including DESeq2 and edgeR, and evaluated nine normalization methods on paired TCGA RNA-seq samples. The pipeline combining TPM normalization with log2 fold change yielded the best correlation coefficient between cancer aggressiveness and tumor entropy.
RNA-seq faces a persistent challenge: the array of data processing workflows keeps expanding, and none has yet been standardized. It is therefore imperative to determine which method most effectively preserves biological facts. Here, we used Shannon entropy as a tool for depicting the biological status of a system. We measured Shannon entropy under several RNA-seq workflow approaches, including DESeq2 and edgeR, and by combining nine normalization methods with log2 fold change on paired TCGA RNA-seq samples from 515 patients, spanning 12 cancer types with 5-year overall survival rates ranging from 20% to 98%. Our analysis revealed that TPM, RLE, and TMM normalization, coupled with a log2 fold change threshold of ≥1 for identifying differentially expressed genes, yielded the best results. We propose that Shannon entropy can serve as an objective metric for optimizing RNA-seq workflows and mRNA sequencing technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
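The workflow summarized in the abstract (TPM normalization, a log2 fold change threshold of ≥1 on paired samples, and the Shannon entropy of the sub-network of up-regulated genes) can be illustrated with a minimal Python sketch. It is not the author's pipeline. The function names, the pseudocount of 1, the `edges` input (a hypothetical list of gene-gene interactions), and the choice to compute entropy over the degree distribution of the induced sub-network are all assumptions made for illustration; the abstract does not state how the sub-network or its entropy is constructed.

```python
import math
from collections import Counter


def tpm(counts, lengths_kb):
    """Transcripts per million: divide counts by gene length, rescale to 1e6."""
    rates = {g: counts[g] / lengths_kb[g] for g in counts}
    total = sum(rates.values())
    return {g: r / total * 1e6 for g, r in rates.items()}


def upregulated(tumor_tpm, normal_tpm, threshold=1.0, pseudo=1.0):
    """Genes whose paired log2 fold change (tumor vs. normal) is >= threshold.

    The pseudocount is an assumption; it only avoids division by zero.
    """
    return {
        g
        for g in tumor_tpm
        if math.log2((tumor_tpm[g] + pseudo) / (normal_tpm[g] + pseudo)) >= threshold
    }


def degree_entropy(genes, edges):
    """Shannon entropy (bits) of the degree distribution of the sub-network
    induced by `genes`. Genes with no neighbor inside the set are ignored
    (an assumption of this sketch).
    """
    degree = Counter()
    for a, b in edges:
        if a in genes and b in genes:
            degree[a] += 1
            degree[b] += 1
    if not degree:
        return 0.0
    histogram = Counter(degree.values())  # degree k -> number of nodes with k
    n = sum(histogram.values())
    return -sum((c / n) * math.log2(c / n) for c in histogram.values())
```

For one tumor/normal pair, the three steps would be chained as `degree_entropy(upregulated(tpm(tumor, lengths), tpm(normal, lengths)), edges)`; correlating the resulting entropies with 5-year overall survival across cancer types is the benchmarking step the abstract describes.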