Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations.

Authors :
Lauterbur ME
Cavassim MIA
Gladstein AL
Gower G
Pope NS
Tsambos G
Adrion J
Belsare S
Biddanda A
Caudill V
Cury J
Echevarria I
Haller BC
Hasan AR
Huang X
Iasi LNM
Noskova E
Obsteter J
Pavinato VAC
Pearson A
Peede D
Perez MF
Rodrigues MF
Smith CCR
Spence JP
Teterina A
Tittes S
Unneberg P
Vazquez JM
Waples RK
Wohns AW
Wong Y
Baumdicker F
Cartwright RA
Gorjanc G
Gutenkunst RN
Kelleher J
Kern AD
Ragsdale AP
Ralph PL
Schrider DR
Gronau I
Source :
ELife [Elife] 2023 Jun 21; Vol. 12. Date of Electronic Publication: 2023 Jun 21.
Publication Year :
2023

Abstract

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations.
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.

Competing Interests: ML, MC, GG, NP, GT, SB, VC, JC, IE, BH, AH, XH, LI, EN, JO, VP, AP, DP, MP, MR, CS, JS, AT, ST, PU, JV, RW, AW, YW, FB, RC, GG, RG, JK, AK, AR, PR, DS, IG: No competing interests declared. AG is an employee of Embark Veterinary, Inc. The author declares that no other competing interests exist. JA is an employee of Ancestry DNA. The author declares that no other competing interests exist. AB is an employee of 54Gene, Inc. The author declares that no other competing interests exist.

(© 2023, Lauterbur et al.)

Details

Language :
English
ISSN :
2050-084X
Volume :
12
Database :
MEDLINE
Journal :
ELife
Publication Type :
Academic Journal
Accession number :
37342968
Full Text :
https://doi.org/10.7554/eLife.84874