SVSR: A Program to Simulate Structural Variations and Generate Sequencing Reads for Multiple Platforms.
- Source :
- IEEE/ACM transactions on computational biology and bioinformatics [IEEE/ACM Trans Comput Biol Bioinform] 2020 May-Jun; Vol. 17 (3), pp. 1082-1091. Date of Electronic Publication: 2018 Oct 17.
- Publication Year :
- 2020
Abstract
- Structural variation accounts for a major fraction of mutations in the human genome and confers susceptibility to complex diseases. Next-generation sequencing, along with the rapid development of computational methods, provides a cost-effective procedure to detect such variations. Simulating structural variations and sequencing reads with realistic characteristics is essential for benchmarking these computational methods. Here, we develop a new program, SVSR, to simulate five types of structural variations (indels, tandem duplications, CNVs, inversions, and translocations) and SNPs for the human genome and to generate sequencing reads with features from popular platforms (Illumina, SOLiD, 454, and Ion Torrent). We adopt a selection model trained on real data to predict copy number states, starting from the first site of a particular genome and proceeding to the end. Furthermore, we use microbial reference genomes to produce insertion fragments and design probabilistic models to imitate inversions and translocations. Moreover, we create platform-specific errors and base quality profiles to generate normal, tumor, or normal-tumor mixture reads. Experimental results show that SVSR captures more realistic features and generates datasets with satisfactory quality scores. SVSR can be used to evaluate the performance of structural variation detection methods and to guide the development of new computational methods.
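The ideas in the abstract can be illustrated with a toy sketch. The code below is not SVSR's implementation: it applies a single inversion at fixed coordinates (SVSR uses probabilistic models), draws reads from a normal-tumor mixture, and adds uniform substitution errors in place of platform-specific error profiles. All function names and parameters (`apply_inversion`, `simulate_mixture_reads`, `purity`, `error_rate`) are hypothetical and chosen only for illustration.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def apply_inversion(genome, start, end):
    """Replace genome[start:end] with its reverse complement (a toy inversion)."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


def add_substitution_errors(read, error_rate, rng):
    """Substitute each base with a different base with probability error_rate.

    Assumption: errors are uniform along the read; real platforms have
    position- and context-dependent error profiles.
    """
    out = []
    for base in read:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_mixture_reads(normal, tumor, n_reads, read_len,
                           purity, error_rate, seed=0):
    """Sample single-end reads from a normal-tumor mixture.

    purity is the (hypothetical) probability that a read comes from the
    tumor genome; the remainder come from the normal genome.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        source = tumor if rng.random() < purity else normal
        pos = rng.randrange(len(source) - read_len + 1)
        fragment = source[pos:pos + read_len]
        reads.append(add_substitution_errors(fragment, error_rate, rng))
    return reads
```

For example, `apply_inversion("AAACGTTT", 2, 5)` inverts the segment `ACG`, and passing the result as `tumor` alongside the original string as `normal` yields a mixed read set whose tumor fraction is controlled by `purity`.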
Details
- Language :
- English
- ISSN :
- 1557-9964
- Volume :
- 17
- Issue :
- 3
- Database :
- MEDLINE
- Journal :
- IEEE/ACM transactions on computational biology and bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 30334804
- Full Text :
- https://doi.org/10.1109/TCBB.2018.2876527