Additional file 1 of MEDALT: single-cell copy number lineage tracing enabling gene discovery

Authors :
Wang, Fang
Qihan Wang
Vakul Mohanty
Shaoheng Liang
Jinzhuang Dou
Jincheng Han
Darlan Conterno Minussi
Ruli Gao
Ding, Li
Navin, Nicholas
Chen, Ken
Publication Year :
2021
Publisher :
figshare, 2021.

Abstract

Additional file 1:

Fig. S1. Methodology of the framework. a. Illustration of the minimal event distance (MED) calculation. b. Average lineage partitioning accuracy (LPA) on 100 simulation datasets without noise. c. Estimating the lineage-specific cumulative fold level (CFL). d. Estimating the significance of CFL in an individual sample. e. AUC of non-random fitness-associated alteration (FAA) detection based on LSA, permutation of the SCCN matrix rather than tree reconstruction, the GISTIC test and the one-sided Wilcoxon signed-rank test on 100 simulation datasets without noise. f. Identification of non-random fitness-associated CNAs in a cohort of samples. g. Identification of parallel evolution CNAs in an individual sample.

Fig. S2. Efficiency of MEDALT on 9 × 3 × 20 simulation datasets, with population sizes from 400 to 2000 and genome sizes from 100 to 1000.

Fig. S3. Simulation and evaluation of the CNA evolution model. a. Illustration of simulated genomic structural rearrangements in the evolution of a tumor. K represents the number of CNAs during the ∆t period. r represents the number of adjacent regions affected by a CNA. TD: tandem duplication. TER: terminal deletion. DEL: interstitial deletion. BFB: breakage fusion bridge. b. Simulated and inferred copy number evolution distance between two genomes. MED is compared with the commonly used Hamming, Euclidean and Manhattan distance metrics. c. The AUC for identifying FAAs based on different combinations of models. Wilcox represents the one-sided Wilcoxon signed-rank test. d. The effects of noise on FAA detection.

Fig. S4. SCCN profile of TNBC patient KTN102. Each row represents a cell from pre-, mid-, or post-treatment.

Fig. S5. Average distance between the root node and cells from pre-, mid- or post-treatment based on MEDALT, maximum parsimony (MP), neighbor-joining (NJ) and maximum likelihood trees. FC refers to the fold change between the average distance to root of the mid-/post-treatment cells and that of the pre-treatment cells.

Fig. S6. Stratified average CNA rates and fractions of DDR gene loss among lineages (distinguished by colors) in 6 primary TNBC samples.

Fig. S7. Gene set enrichment analysis (GSEA) for genes identified by LSA in patient t1. Colors correspond to branches.

Fig. S8. Significant genes identified through cohort LSA from the TNBC scDNA-seq data. a. Venn diagram of the genes identified by MEDALT, MP and GISTIC but not reported in OncoKB, COSMIC and IntOGen. b. Overall survival (OS) analysis of breast cancer patients in TCGA. c. Progression-free survival (PFS) analysis of breast cancer patients in TCGA. d. Overall survival analysis of breast cancer patients in METABRIC. e. The fraction of cancer genes overlapping with events which were significant in a single lineage (#Lineage = 1), multiple lineages (#Lineage > 1), parallel evolution test ((#Lineage > 1& PLSA
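
Fig. S3b names Hamming, Euclidean and Manhattan as the baseline distance metrics compared with MED. As a rough illustration only, the Python sketch below computes these three baselines on two hypothetical integer copy number profiles. The function names and the profile values are assumptions made for this example and are not taken from the supplement; MED itself is not implemented here.

```python
import math


def _check_same_length(a, b):
    # Both profiles must cover the same genomic bins.
    if len(a) != len(b):
        raise ValueError("copy number profiles must have the same length")


def hamming(a, b):
    """Number of bins whose copy number differs between the two profiles."""
    _check_same_length(a, b)
    return sum(x != y for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of absolute copy number differences across bins."""
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared copy number differences."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical profiles: one integer copy number per genomic bin.
cell_a = [2, 2, 3, 3, 2, 1]
cell_b = [2, 2, 5, 5, 2, 2]

print(hamming(cell_a, cell_b))    # 3 bins differ
print(manhattan(cell_a, cell_b))  # 2 + 2 + 1 = 5
print(euclidean(cell_a, cell_b))  # sqrt(4 + 4 + 1) = 3.0
```

All three metrics treat each bin independently, so a single event that spans several adjacent bins is counted once per bin rather than once overall.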

Details

Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....87d1d74c9b4a5eb71b801e03b3b4f29e
Full Text :
https://doi.org/10.6084/m9.figshare.14100249.v1