Performance evaluation of six popular short-read simulators

Authors :
Mark Milhaven
Susanne P. Pfeifer
Source :
Heredity. 130:55-63
Publication Year :
2022
Publisher :
Springer Science and Business Media LLC, 2022.

Abstract

High-throughput sequencing data enables the comprehensive study of genomes and the variation therein. Essential for the interpretation of this genomic data is a thorough understanding of the computational methods used for processing and analysis. Whereas “gold-standard” empirical datasets exist for this purpose in humans, synthetic (i.e., simulated) sequencing data can offer important insights into the capabilities and limitations of computational pipelines for any arbitrary species and/or study design—yet, the ability of read simulator software to emulate genomic characteristics of empirical datasets remains poorly understood. We here compare the performance of six popular short-read simulators—ART, DWGSIM, InSilicoSeq, Mason, NEAT, and wgsim—and discuss important considerations for selecting suitable models for benchmarking.

Subjects

Subjects :
Genetics
Genetics (clinical)

Details

ISSN :
1365-2540 and 0018-067X
Volume :
130
Database :
OpenAIRE
Journal :
Heredity
Accession number :
edsair.doi...........5cc0ec67904ee8b5c9647866f51be273