1. STIC: Predicting Single Nucleotide Variants and Tumor Purity in Cancer Genome
- Author
Xiguo Yuan, Jianing Xi, Chao Ma, Yang Liying, Shuzhen Wang, and Haiyong Zhao
- Subjects
Genome, Human, Somatic cell, Applied Mathematics, 0206 medical engineering, High-Throughput Nucleotide Sequencing, Genomics, Sequence Analysis, DNA, 02 engineering and technology, Computational biology, Biology, Polymorphism, Single Nucleotide, Germline, Neoplasms, Cancer genome, Genetics, Humans, Allele frequency, Algorithms, 020602 bioinformatics, Human cancer, Biotechnology
- Abstract
Single nucleotide variants (SNVs) play an important role in cellular proliferation and tumorigenesis in various types of human cancer. Next-generation sequencing (NGS) provides high-throughput data at an unprecedented resolution for predicting SNVs. Many computational methods exist for either germline or somatic SNV discovery from NGS data, but very few of them are versatile enough to adapt to every situation. In the absence of matched normal samples, predicting somatic SNVs from single-tumor samples becomes considerably more challenging, especially when the tumor purity is unknown. Here, we propose a new approach, STIC, to predict somatic SNVs and estimate tumor purity from NGS data without matched normal samples. The main features of STIC include: (1) extracting a set of SNV-relevant features at each site and training a back-propagation (BP) neural network on these features to predict SNVs; (2) creating an iterative process to distinguish somatic SNVs from germline ones by disturbing the allele frequency; and (3) establishing a reasonable relationship between tumor purity and the allele frequencies of somatic SNVs to accurately estimate the purity. We quantitatively evaluate the performance of STIC on both simulated and real sequencing datasets; the results indicate that STIC outperforms competing methods.
- Published
- 2021
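
The abstract does not state the exact relationship STIC uses between tumor purity and somatic allele frequencies. As a rough illustration only, the sketch below uses a common textbook simplification, not the STIC model: for clonal, heterozygous somatic SNVs in copy-neutral diploid regions, the expected variant allele frequency is purity / 2, while a heterozygous germline variant stays near 0.5 regardless of purity. The function names `expected_vaf` and `estimate_purity` are hypothetical.

```python
from statistics import median


def expected_vaf(purity, is_germline):
    """Expected allele frequency of a heterozygous variant.

    Assumption (not from the paper): diploid, copy-neutral region and a
    clonal somatic variant. Germline variants are present in both tumor
    and normal cells, so their frequency does not depend on purity.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    return 0.5 if is_germline else purity / 2.0


def estimate_purity(somatic_vafs):
    """Invert the relationship above: purity is about 2 * typical somatic VAF.

    The median is used so that a few subclonal or noisy sites do not
    dominate the estimate; the result is capped at 1.0.
    """
    if not somatic_vafs:
        raise ValueError("need at least one somatic allele frequency")
    return min(1.0, 2.0 * median(somatic_vafs))


if __name__ == "__main__":
    # Hypothetical observed allele frequencies at predicted somatic sites.
    observed = [0.28, 0.30, 0.32]
    print(estimate_purity(observed))
```

Under these assumptions, a lower purity pulls somatic allele frequencies away from 0.5 while germline ones stay put, which is the kind of separation a single-sample method can exploit. Copy-number changes and subclonality break this simple relationship and would need a richer model.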