Additional file 6 of MethCORR infers gene expression from DNA methylation and allows molecular analysis of ten common cancer types using fresh-frozen and formalin-fixed paraffin-embedded tumor samples
- Publication Year :
- 2021
- Publisher :
- figshare, 2021.
Abstract
- Additional file 6: Figure S2. Molecular subtyping with MethCORR-inferred RNA expression. a) Scatterplot showing the correlation between AUC values from a tumor vs. normal analysis performed with RNA expression (x-axis) or iRNA expression (y-axis). b) Scatterplots of the first principal component (PC1; x-axis) and the second principal component (PC2; y-axis) from a PCA performed with (left) RNA expression from TCGA BRCA samples, (middle) iRNA expression calculated in an independent fresh-frozen cohort (GSE84207), and (right) iRNA expression calculated in an independent FFPE cohort (GSE117439). Samples are colored according to their estrogen receptor (ER) status. c) Cluster dendrograms from hierarchical bootstrap clustering (1000 repetitions) performed with BRCA RNA expression, BRCA iRNA expression, iRNA expression from the fresh-frozen GSE84207 cohort, and iRNA expression from the FFPE GSE117439 cohort. Samples with a “long id name” are ER-negative. Approximately unbiased (AU) p-values are given for each cluster node, and clusters with AU>0.9 are highlighted by pink rectangles. d+e) Caleydo StratomeX [40] plots showing the concordance between TCGA BRCA microarray-based PAM50 subtypes and RNA (d) or iRNA (e) expression-based PAM50 subtypes (confidence=1). f) Scatterplot of regression model performance, R² (in independent validation samples), for the 50 genes that constitute the PAM50 subtype classifier. The three genes with the highest centroid values are marked for each PAM50 subtype. g) Kaplan–Meier plots showing the overall survival of AJCC stage I-IV patients from the TCGA BRCA cohort stratified according to microarray-based PAM50 subtypes (left panel), RNA-based PAM50 subtypes with confidence call=1 (middle panel), and iRNA-based PAM50 subtypes with confidence call=1 (right panel). Significance was evaluated by the log-rank test. Bonferroni-adjusted P values are given in parentheses (two comparisons, i.e., LumA vs. HER2 and LumA vs. Basal).
h) Consensus cumulative distribution function (CDF) plots for ConsensusClusterPlus analysis performed with (left) iRNA expression and (right) RNA expression for 497 TCGA PRAD tumor samples. The number of clusters, k, is chosen where the CDF first approaches its maximum [41, 48]. Here, a large increase is seen between k=2 and k=3, and further increases in k do not improve consensus substantially, i.e., k=3 for both iRNA expression and RNA expression. i) Scatterplots of the first principal component (PC1; x-axis) and the third principal component (PC3; y-axis) from a PCA performed with TCGA PRAD RNA expression, TCGA PRAD iRNA expression, and iRNA expression calculated in an independent prostate cancer FFPE cohort (GSE73549). Samples are colored according to their predicted subtype.
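The choice of k described for panel h can be illustrated with a minimal Python sketch. It is not the authors' pipeline and not the ConsensusClusterPlus code: the function names (`consensus_matrix`, `cdf_area`, `relative_area_change`) and the toy inputs are hypothetical, and the sketch assumes only that cluster labels from repeated resampling runs are already available. It builds a consensus matrix, computes the empirical CDF of its pairwise consensus values, and reports the relative change in area under the CDF as k grows, which is one common way to see where further increases in k stop improving consensus.

```python
import numpy as np


def consensus_matrix(label_runs):
    """Fraction of runs in which each pair of samples shares a cluster.

    label_runs: one row of integer cluster labels per resampling run;
    -1 marks a sample that was not drawn in that run (an assumed convention).
    """
    runs = np.asarray(label_runs)
    n = runs.shape[1]
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for labels in runs:
        present = labels >= 0
        both = np.outer(present, present)
        same = (labels[:, None] == labels[None, :]) & both
        together += same
        sampled += both
    # Pairs never drawn together keep a consensus value of 0.
    return np.divide(together, sampled, out=np.zeros_like(together), where=sampled > 0)


def cdf_area(consensus, grid=None):
    """Empirical CDF of the off-diagonal consensus values and its area."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    values = consensus[np.triu_indices_from(consensus, k=1)]
    cdf = np.array([(values <= c).mean() for c in grid])
    # Trapezoidal rule over the consensus-index grid.
    area = float(np.sum((cdf[1:] + cdf[:-1]) / 2.0 * np.diff(grid)))
    return cdf, area


def relative_area_change(areas_by_k):
    """Relative increase in CDF area from each k to the next larger k."""
    ks = sorted(areas_by_k)
    deltas = {ks[0]: areas_by_k[ks[0]]}
    for prev, cur in zip(ks, ks[1:]):
        deltas[cur] = (areas_by_k[cur] - areas_by_k[prev]) / areas_by_k[prev]
    return deltas


if __name__ == "__main__":
    # Toy input: two runs that agree perfectly on two clusters of two samples.
    cm = consensus_matrix([[0, 0, 1, 1], [0, 0, 1, 1]])
    _, area = cdf_area(cm)
    print(round(area, 4))
    # Made-up areas for k = 2..4, for illustration only.
    print(relative_area_change({2: 0.5, 3: 0.75, 4: 0.78}))
```

In this kind of analysis, a large relative change followed by small ones corresponds to the pattern the legend describes, where the gain is large between k=2 and k=3 and marginal afterwards.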
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....0eb555f07463e78e4cc70ee217501911
- Full Text :
- https://doi.org/10.6084/m9.figshare.13663296.v1