SVSR: A Program to Simulate Structural Variations and Generate Sequencing Reads for Multiple Platforms.

Authors :
Yuan X
Gao M
Bai J
Duan J
Source :
IEEE/ACM transactions on computational biology and bioinformatics [IEEE/ACM Trans Comput Biol Bioinform] 2020 May-Jun; Vol. 17 (3), pp. 1082-1091. Date of Electronic Publication: 2018 Oct 17.
Publication Year :
2020

Abstract

Structural variation accounts for a major fraction of mutations in the human genome and confers susceptibility to complex diseases. Next-generation sequencing, together with the rapid development of computational methods, provides a cost-effective way to detect such variations. Simulating structural variations and sequencing reads with realistic characteristics is essential for benchmarking these methods. Here, we develop a new program, SVSR, to simulate five types of structural variations (indels, tandem duplications, CNVs, inversions, and translocations) and SNPs for the human genome, and to generate sequencing reads with features from popular platforms (Illumina, SOLiD, 454, and Ion Torrent). We adopt a selection model trained on real data to predict copy number states, proceeding from the first site of a given genome to the last. Furthermore, we use microbial reference genomes to produce insertion fragments and design probabilistic models to imitate inversions and translocations. We also create platform-specific error and base quality profiles to generate normal, tumor, or normal-tumor mixture reads. Experimental results show that SVSR captures more realistic features and generates datasets with satisfactory quality scores. SVSR can be used to evaluate the performance of structural variation detection methods and to guide the development of new computational methods.
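
To make the workflow in the abstract concrete, the following is a minimal Python sketch of two of the ideas it names: applying structural variations (here an inversion and a tandem duplication) to a reference sequence, and drawing normal-tumor mixture reads with sequencing errors. It is not SVSR's implementation. The function names, the uniform substitution error model, and the `purity` parameter are all assumptions made for illustration; the abstract describes platform-specific error and quality profiles and trained probabilistic models, none of which are reproduced here.

```python
import random

# Hypothetical helpers for illustration only; not SVSR's API or models.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def apply_inversion(seq, start, end):
    """Replace seq[start:end] with its reverse complement."""
    inverted = "".join(COMPLEMENT[b] for b in reversed(seq[start:end]))
    return seq[:start] + inverted + seq[end:]


def apply_tandem_duplication(seq, start, end, copies=2):
    """Make seq[start:end] appear `copies` times back to back."""
    return seq[:start] + seq[start:end] * copies + seq[end:]


def sample_read(seq, read_len, error_rate, rng):
    """Draw one read at a uniform position and add substitution errors.

    Assumption: errors are independent substitutions at a flat rate,
    a stand-in for the platform-specific profiles the abstract mentions.
    """
    pos = rng.randrange(len(seq) - read_len + 1)
    bases = []
    for b in seq[pos:pos + read_len]:
        if rng.random() < error_rate:
            b = rng.choice([x for x in "ACGT" if x != b])
        bases.append(b)
    return "".join(bases)


def simulate_mixture(normal, tumor, n_reads, purity,
                     read_len=50, error_rate=0.01, seed=0):
    """Sample reads from the tumor genome with probability `purity`,
    otherwise from the normal genome."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        source = tumor if rng.random() < purity else normal
        reads.append(sample_read(source, read_len, error_rate, rng))
    return reads


if __name__ == "__main__":
    rng = random.Random(1)
    normal = "".join(rng.choice("ACGT") for _ in range(300))
    # Build a toy tumor genome carrying one inversion and one duplication.
    tumor = apply_inversion(normal, 50, 100)
    tumor = apply_tandem_duplication(tumor, 150, 200)
    for read in simulate_mixture(normal, tumor, n_reads=5, purity=0.6):
        print(read)
```

Setting `purity` to 0 or 1 in this sketch yields pure normal or pure tumor reads, which corresponds loosely to the three output modes the abstract lists. A benchmarking setup would additionally record the coordinates of each simulated variant as ground truth.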

Details

Language :
English
ISSN :
1557-9964
Volume :
17
Issue :
3
Database :
MEDLINE
Journal :
IEEE/ACM transactions on computational biology and bioinformatics
Publication Type :
Academic Journal
Accession number :
30334804
Full Text :
https://doi.org/10.1109/TCBB.2018.2876527