1. Quantifying Markov Chain Monte Carlo Exploration of Tumour Progression Tree Spaces: Initialisation Strategies, Convergence Diagnostics & Multi-modalities
- Author
- Köhn, Gordon (ORCID: 0000-0003-3397-7769)
- Subjects
- Cancer genomics, Tumor progression, Markov chain Monte Carlo (MCMC), Convergence diagnostics, Bayesian Inference, trees (mathematics), Data processing, computer science, Natural sciences, Mathematics, Life sciences, Technology (applied sciences)
- Abstract
Understanding mutational intra-tumour heterogeneity is crucial to developing effective personalised cancer therapies. Bayesian Markov chain Monte Carlo (MCMC) sampling schemes have proven successful and are trusted for reconstructing tumour progression histories, particularly mutation trees. To assess the effectiveness of mutation tree MCMC methods and their required runtimes, it is crucial to understand how quickly the empirical distribution of the MCMC samples converges to the posterior distribution. We quantify the MCMC exploration of the mutation tree space for the landmark inference scheme SCITE using tree similarity measures. In this simulation study, the tree similarities map features informative of a tumour’s clonal expansion from the mutation tree space to a scalar space, allowing the study of the MCMC exploration. The exploration is quantified by a novel application of convergence diagnostics, established for continuous spaces, to the discrete space of mutation trees via tree similarities. From this, we estimate the required runtime of SCITE for simulated data; the estimates may imply significantly reduced runtimes for real-world datasets. Further, we find that the dependence on the initial state of the MCMC vanishes quickly. We therefore recommend trialling a significantly shorter warm-up period for real-world datasets, which would reduce the required runtime further. In the process of exploring initialisation strategies, we validated the performance of the fast heuristic inference method HUNTRESS. Lastly, we investigate the topology of the Bayesian tree posterior, which is thought to potentially contain multiple modes. For simulated data, we did not find evidence of any multi-modality, which supports the design of SCITE as a single-chain MCMC scheme.
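To illustrate the idea of applying a continuous-space convergence diagnostic to scalar tree similarities, the sketch below computes the Gelman-Rubin potential scale reduction factor over several chains. It is a minimal sketch, not the thesis's implementation: the abstract does not say which diagnostics were used, the function name `gelman_rubin` is hypothetical, and each chain is assumed to have already been reduced to a sequence of scalar similarities (for example, the similarity of each sampled tree to a fixed reference tree).

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Gelman-Rubin potential scale reduction factor for scalar chains.

    `chains` is a list of equal-length sequences of scalars. Here each scalar
    is assumed to be a tree similarity computed for one MCMC sample.
    """
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)   # W: mean within-chain variance
    between = n * variance(chain_means)          # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero; diagnostic undefined")

    # Pooled estimate of the marginal variance of the scalar summary.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Hypothetical similarity traces: two chains that agree, two that do not.
agreeing = [[0.80, 0.82, 0.81, 0.83], [0.81, 0.83, 0.80, 0.82]]
disagreeing = [[0.20, 0.22, 0.21, 0.23], [0.80, 0.82, 0.81, 0.83]]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Values close to 1 indicate that the chains are indistinguishable with respect to the chosen similarity, while values well above 1 indicate that they are still exploring different regions. A value near 1 is only evidence of convergence for that particular scalar summary, not for the full tree posterior.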
- Published
- 2023