148,033 results on '"Models, Genetic"'
Search Results
2. Probabilistic graphical models for genetics, genomics, and postgenomics.
- Author
- Mourad, Raphaël and Sinoquet, Christine
- Subjects
- Genetics -- Statistical methods, Genomics -- Statistical methods, Graphical modeling (Statistics), Genetics -- Mathematical models, Computational Biology -- methods, Bayes Theorem, Computer Simulation, Genomics -- methods, Models, Genetic, Models, Statistical
- Abstract
Summary: At the crossroads between statistics and machine learning, probabilistic graphical models (PGMs) provide a powerful formal framework to model complex data. An expanding volume of biological data of various types, the so-called 'omics', is in need of accurate and efficient methods for modelling and PGMs are expected to have a prominent role to play. This book provides an overview of the applications of PGMs to genetics, genomics and postgenomics to meet this increased interest.
- Published
- 2014
3. Peripheral Neuropathy Phenotyping in Rat Models of Type 2 Diabetes Mellitus: Evaluating Uptake of the Neurodiab Guidelines and Identifying Future Directions
- Author
- Md Jakir Hossain, Michael D. Kendig, Meg E. Letton, Margaret J. Morris, and Ria Arnold
- Subjects
- diabetes mellitus, type 2, diabetic neuropathies, diet, high-fat, models, animal, models, genetic, peripheral nerves, rats, streptozotocin, Diseases of the endocrine glands. Clinical endocrinology, RC648-665
- Abstract
Diabetic peripheral neuropathy (DPN) affects over half of type 2 diabetes mellitus (T2DM) patients, with an urgent need for effective pharmacotherapies. While many rat and mouse models of T2DM exist, the phenotyping of DPN has been challenging with inconsistencies across laboratories. To better characterize DPN in rodents, a consensus guideline was published in 2014 to accelerate the translation of preclinical findings. Here we review DPN phenotyping in rat models of T2DM against the ‘Neurodiab’ criteria to identify uptake of the guidelines and discuss how DPN phenotypes differ between models and according to diabetes duration and sex. A search of PubMed, Scopus and Web of Science databases identified 125 studies, categorised as either diet and/or chemically induced models or transgenic/spontaneous models of T2DM. The use of diet and chemically induced T2DM models has exceeded that of transgenic models in recent years, and the introduction of the Neurodiab guidelines has not appreciably increased the number of studies assessing all key DPN endpoints. Combined high-fat diet and low dose streptozotocin rat models are the most frequently used and well characterised. Overall, we recommend adherence to Neurodiab guidelines for creating better animal models of DPN to accelerate translation and drug development.
- Published
- 2022
- Full Text
- View/download PDF
4. Evaluating the Contribution of Model Complexity in Predicting Robustness in Synthetic Genetic Circuits.
- Author
- Buecherl L, Myers CJ, and Fontanarrosa P
- Subjects
- Computer Simulation, Synthetic Biology methods, Gene Regulatory Networks, Models, Genetic
- Abstract
The design-build-test-learn workflow is pivotal in synthetic biology as it seeks to broaden access to diverse levels of expertise and enhance circuit complexity through recent advancements in automation. The design of complex circuits depends on developing precise models and parameter values for predicting the circuit performance and noise resilience. However, obtaining characterized parameters under diverse experimental conditions is a significant challenge, often requiring substantial time, funding, and expertise. This work compares five computational models of three different genetic circuit implementations of the same logic function to evaluate their relative predictive capabilities. The primary focus is on determining whether simpler models can yield conclusions similar to those of more complex ones and whether certain models offer greater analytical benefits. These models explore the influence of noise, parametrization, and model complexity on predictions of synthetic circuit performance through simulation. The findings suggest that when developing a new circuit without characterized parts or an existing design, any model can effectively predict the optimal implementation by facilitating qualitative comparison of designs' failure probabilities (e.g., higher or lower). However, when characterized parts are available and accurate quantitative differences in failure probabilities are desired, employing a more precise model with characterized parts becomes necessary, albeit requiring additional effort.
- Published
- 2024
- Full Text
- View/download PDF
5. Multiple distinct evolutionary mechanisms govern the dynamics of selfish mitochondrial genomes in Caenorhabditis elegans.
- Author
- Gitschlag BL, Pereira CV, Held JP, McCandlish DM, and Patel MR
- Subjects
- Animals, Selection, Genetic, Genetic Drift, Models, Genetic, Mitochondria genetics, Mitochondria metabolism, Genotype, Caenorhabditis elegans genetics, Genome, Mitochondrial, DNA, Mitochondrial genetics, Mutation, Evolution, Molecular
- Abstract
Cells possess multiple mitochondrial DNA (mtDNA) copies, which undergo semi-autonomous replication and stochastic inheritance. This enables mutant mtDNA variants to arise and selfishly compete with cooperative (wildtype) mtDNA. Selfish mitochondrial genomes are subject to selection at different levels: they compete against wildtype mtDNA directly within hosts and indirectly through organism-level selection. However, determining the relative contributions of selection at different levels has proven challenging. We overcome this challenge by combining mathematical modeling with experiments designed to isolate the levels of selection. Applying this approach to many selfish mitochondrial genotypes in Caenorhabditis elegans reveals an unexpected diversity of evolutionary mechanisms. Some mutant genomes persist at high frequency for many generations, despite a host fitness cost, by aggressively outcompeting cooperative genomes within hosts. Conversely, some mutant genomes persist by evading inter-organismal selection. Strikingly, the mutant genomes vary dramatically in their susceptibility to genetic drift. Although different mechanisms can cause high frequency of selfish mtDNA, we show how they give rise to characteristically different distributions of mutant frequency among individuals. Given that heteroplasmic frequency represents a key determinant of phenotypic severity, this work outlines an evolutionary theoretic framework for predicting the distribution of phenotypic consequences among individuals carrying a selfish mitochondrial genome. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
6. Genotyping both live and dead animals to improve post-weaning survival of pigs in breeding programs.
- Author
- Sharif-Islam M, van der Werf JHJ, Henryon M, Chu TT, Wood BJ, and Hermesch S
- Subjects
- Animals, Swine genetics, Pedigree, Breeding methods, Litter Size genetics, Inbreeding methods, Female, Models, Genetic, Male, Phenotype, Genotyping Techniques methods, Selection, Genetic, Weaning, Genotype
- Abstract
Background: In this study, we tested whether genotyping both live and dead animals (GSD) realises more genetic gain for post-weaning survival (PWS) in pigs compared to genotyping only live animals (GOS). Methods: Stochastic simulation was used to estimate the rate of genetic gain realised by GSD and GOS at a 0.01 rate of pedigree-based inbreeding in three breeding schemes, which differed in PWS (95%, 90% and 50%) and litter size (6 and 10). Pedigree-based selection was conducted as a point of reference. Variance components were estimated and then estimated breeding values (EBV) were obtained in each breeding scheme using a linear or a threshold model. Selection was for a single trait, i.e. PWS with a heritability of 0.02 on the observed scale. The trait was simulated on the underlying scale and was recorded as binary (0/1). Selection candidates were genotyped and phenotyped before selection, with only live candidates eligible for selection. Genotyping strategies differed in the proportion of live and dead animals genotyped, but the phenotypes of all animals were used for predicting EBV of the selection candidates. Results: Based on a 0.01 rate of pedigree-based inbreeding, GSD realised 14 to 33% more genetic gain than GOS for all breeding schemes depending on PWS and litter size. GSD increased the prediction accuracy of EBV for PWS by at least 14% compared to GOS. The use of a linear versus a threshold model did not have an impact on genetic gain for PWS regardless of the genotyping strategy and the bias of the EBV did not differ significantly among genotyping strategies. Conclusions: Genotyping both dead and live animals was more informative than genotyping only live animals to predict the EBV for PWS of selection candidates, but with marginal increases in genetic gain when the proportion of dead animals genotyped was 60% or greater. Therefore, it would be worthwhile to use genomic information on both live and more than 20% dead animals to compute EBV for the genetic improvement of PWS under the assumption that dead animals reflect increased liability on the underlying scale. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
7. Improving on polygenic scores across complex traits using select and shrink with summary statistics (S4) and LDpred2.
- Author
- Tyrer JP, Peng PC, DeVries AA, Gayther SA, Jones MR, and Pharoah PD
- Subjects
- Humans, Phenotype, Polymorphism, Single Nucleotide, Models, Genetic, Female, Multifactorial Inheritance, Genome-Wide Association Study methods
- Abstract
Background: As precision medicine advances, polygenic scores (PGS) have become increasingly important for clinical risk assessment. Many methods have been developed to create polygenic models with increased accuracy for risk prediction. Our select and shrink with summary statistics (S4) PGS method has previously been shown to accurately predict the polygenic risk of epithelial ovarian cancer. Here, we applied S4 PGS to 12 phenotypes for UK Biobank participants, and compared it with the LDpred2 and a combined S4 + LDpred2 method. Results: The S4 + LDpred2 method provided overall improved PGS accuracy across a variety of phenotypes for UK Biobank participants. Additionally, the S4 + LDpred2 method had the best estimated PGS accuracy in Finnish and Japanese populations. We also addressed the challenge of limited genotype level data by developing the PGS models using only GWAS summary statistics. Conclusions: Taken together, the S4 + LDpred2 method represents an improvement in overall PGS accuracy across multiple phenotypes and populations. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
8. Validation of cross-progeny variance genomic prediction using simulations and experimental data in winter elite bread wheat.
- Author
- Oget-Ebrad C, Heumez E, Duchalais L, Goudemand-Dugué E, Oury FX, Elsen JM, and Bouchet S
- Subjects
- Crosses, Genetic, Genome, Plant, Genomics methods, Genotype, Genetic Markers, Triticum genetics, Quantitative Trait Loci, Models, Genetic, Plant Breeding, Phenotype, Computer Simulation
- Abstract
Key Message: From simulations and experimental data, the quality of cross progeny variance genomic predictions may be high, but depends on trait architecture and necessitates sufficient number of progenies. Genomic predictions are used to select genitors and crosses in plant breeding. The usefulness criterion (UC) is a cross-selection criterion that necessitates the estimation of parental mean (PM) and progeny standard deviation (SD). This study evaluates the parameters that affect the predictive ability of UC and its two components using simulations. Predictive ability increased with heritability and progeny size and decreased with QTL number, most notably for SD. Comparing scenarios where marker effects were known or estimated using prediction models, SD was strongly impacted by the quality of marker effect estimates. We proposed a new algebraic formula for SD estimation that takes into account the uncertainty of the estimation of marker effects. It improved predictions when the number of QTL was superior to 300, especially when heritability was low. We also compared estimated and observed UC using experimental data for heading date, plant height, grain protein content and yield. PM and UC estimates were significantly correlated for all traits (PM: 0.38, 0.63, 0.51 and 0.91; UC: 0.45, 0.52, 0.54 and 0.74; for yield, grain protein content, plant height and heading date, respectively), while SD was correlated only for heading date and plant height (0.64 and 0.49, respectively). According to simulations, SD estimations in the field would necessitate large progenies. This pioneering study experimentally validates genomic prediction of UC but the predictive ability depends on trait architecture and precision of marker effect estimates. We advise the breeders to adjust progeny size to realize the SD potential of a cross. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
9. Maternal effects and its importance in the genetic evaluations of preweaning live weight traits of beef cattle. A review.
- Author
- Javier ER, Gabriel MJ, Candelario SJ, and Manuel PG
- Subjects
- Animals, Cattle genetics, Female, Body Weight, Phenotype, Models, Genetic, Breeding, Maternal Inheritance, Weaning
- Abstract
Maternal effects in cattle genetics are defined as the causal influence of the phenotype or maternal genotype on the offspring's phenotype by effects occurring when the genetic and environmental characteristics of the mother influence the phenotype of the offspring beyond the direct inheritance of genes. Its relevance has been strongly described in genetic models focused on the genetic improvement of preweaning traits in cow-calf beef cattle production systems. Here, basic concepts and the importance of maternal effects when using linear and animal model procedures for genetic evaluations of growth and live-weight traits in beef cattle are reviewed and discussed. A brief history of estimation methods from classical studies to recent studies used for the development of animal models for studying maternal effects is also provided. Some important biometric concepts for maternal effect estimation are described, and the antagonism between direct genetic effects and maternal effects, its biological basis, and sources of error in the estimation of direct genetic and maternal covariance are discussed. Finally, some genomic perspectives are presented. (© 2024. The Author(s), under exclusive licence to Springer Nature B.V.)
- Published
- 2024
- Full Text
- View/download PDF
10. Causal interpretations of family GWAS in the presence of heterogeneous effects.
- Author
- Veller C, Przeworski M, and Coop G
- Subjects
- Humans, Multifactorial Inheritance genetics, Models, Genetic, Heterozygote, Alleles, Homozygote, Family, Gene-Environment Interaction, Genome-Wide Association Study methods, Polymorphism, Single Nucleotide, Linkage Disequilibrium
- Abstract
Family-based genome-wide association studies (GWASs) are often claimed to provide an unbiased estimate of the average causal effects (or average treatment effects; ATEs) of alleles, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. We show that this claim does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. This feature will matter if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in linkage disequilibrium patterns. At a single locus, family-based GWAS can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores (PGSs), however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate of the LATE for any subset or weighted average of families. In practice, the potential biases of a family-based GWAS are likely smaller than those that can arise from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, their causal interpretation is less straightforward than has been widely appreciated. Competing Interests: The authors declare no competing interest.
- Published
- 2024
- Full Text
- View/download PDF
11. Gene sequence analysis model construction based on k-mer statistics.
- Author
- Gao D
- Subjects
- Algorithms, Sequence Analysis, DNA methods, Models, Genetic, Models, Statistical, Computational Biology methods, Sequence Alignment methods
- Abstract
With the rapid development of biotechnology, gene sequencing methods are gradually improved. The structure of gene sequences is also more complex. However, the traditional sequence alignment method is difficult to deal with the complex gene sequence alignment work. In order to improve the efficiency of gene sequence analysis, D2 series method of k-mer statistics is selected to build the model of gene sequence alignment analysis. According to the structure of the foreground sequence, the sequence to be aligned can be cut by different lengths and divided into multiple subsequences. Finally, according to the selected subsequences, the maximum dissimilarity in the alignment results is determined as the statistical result. At the same time, the research also designed an application system for the sequence alignment analysis of the model. The experimental results showed that the statistical power of the sequence alignment analysis model was directly proportional to the sequence coverage and cutting length, and inversely proportional to the K value and module length. At the same time, the model was applied to the system designed in this paper. The maximum storage capacity of the system was 71 GB, the maximum disk capacity was 135 GB, and the running time was less than 2.0 s. Therefore, the k-mer statistic sequence alignment model and system proposed in this study have considerable application value in gene alignment analysis. Competing Interests: The authors have declared that no competing interests exist. (Copyright: © 2024 Dongjie Gao. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
- Published
- 2024
- Full Text
- View/download PDF
12. The effect of family structure on the still-missing heritability and genomic prediction accuracy of type 2 diabetes.
- Author
- Amiri Roudbar M, Vahedi SM, Jin J, Jahangiri M, Lanjanian H, Habibi D, Masjoudi S, Riahi P, Fateh ST, Neshati F, Zahedi AS, Moazzam-Jazi M, Najd-Hassan-Bonab L, Mousavi SF, Asgarian S, Zarkesh M, Moghaddas MR, Tenesa A, Kazemnejad A, Vahidnezhad H, Hakonarson H, Azizi F, Hedayati M, Daneshpour MS, and Akbarzadeh M
- Subjects
- Humans, Female, Male, Genomics methods, Iran, Models, Genetic, Cohort Studies, Genome-Wide Association Study, Genotype, Case-Control Studies, Middle Aged, Family, Family Structure, Diabetes Mellitus, Type 2 genetics, Pedigree, Genetic Predisposition to Disease, Polymorphism, Single Nucleotide genetics
- Abstract
This study aims to assess the effect of familial structures on the still-missing heritability estimate and prediction accuracy of Type 2 Diabetes (T2D) using pedigree estimated risk values (ERV) and genomic ERV. We used 11,818 individuals (T2D cases: 2,210) with genotype (649,932 SNPs) and pedigree information from the ongoing periodic cohort study of the Iranian population project. We considered three different familial structure scenarios, including (i) all families, (ii) all families with ≥ 1 generation, and (iii) families with ≥ 1 generation in which both case and control individuals are presented. Comprehensive simulation strategies were implemented to quantify the difference between estimates of [Formula: see text] and [Formula: see text]. A proportion of still-missing heritability in T2D could be explained by overestimation of pedigree-based heritability due to the presence of families with individuals having only one of the two disease statuses. Our research findings underscore the significance of including families with only case/control individuals in cohort studies. The presence of such family structures (as observed in scenarios i and ii) contributes to a more accurate estimation of disease heritability, addressing the underestimation that was previously overlooked in prior research. However, when predicting disease risk, the absence of these families (as seen in scenario iii) can yield the highest prediction accuracy and the strongest correlation with Polygenic Risk Scores. Our findings represent the first evidence of the important contribution of familial structure for heritability estimations and genomic prediction studies in T2D. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
13. The benefits of permutation-based genome-wide association studies.
- Author
- John M, Korte A, and Grimm DG
- Subjects
- Models, Genetic, Linear Models, Computer Simulation, Genome-Wide Association Study, Arabidopsis genetics, Phenotype
- Abstract
Linear mixed models (LMMs) are a commonly used method for genome-wide association studies (GWAS) that aim to detect associations between genetic markers and phenotypic measurements in a population of individuals while accounting for population structure and cryptic relatedness. In a standard GWAS, hundreds of thousands to millions of statistical tests are performed, requiring control for multiple hypothesis testing. Typically, static corrections that penalize the number of tests performed are used to control for the family-wise error rate, which is the probability of making at least one false positive. However, it has been shown that in practice this threshold is too conservative for normally distributed phenotypes and not stringent enough for non-normally distributed phenotypes. Therefore, permutation-based LMM approaches have recently been proposed to provide a more realistic threshold that takes phenotypic distributions into account. In this work, we discuss the advantages of permutation-based GWAS approaches, including new simulations and results from a re-analysis of all publicly available Arabidopsis phenotypes from the AraPheno database. (© The Author(s) 2024. Published by Oxford University Press on behalf of the Society for Experimental Biology.)
- Published
- 2024
- Full Text
- View/download PDF
14. A new perspective on microRNA-guided gene regulation specificity, and its potential generalization to transcription factors and RNA-binding proteins.
- Author
- Seitz H
- Subjects
- Binding Sites, Animals, Humans, Models, Genetic, MicroRNAs metabolism, MicroRNAs genetics, RNA-Binding Proteins metabolism, RNA-Binding Proteins genetics, Transcription Factors metabolism, Transcription Factors genetics, Gene Expression Regulation
- Abstract
Our conception of gene regulation specificity has undergone profound changes over the last 20 years. Previously, regulators were considered to control few genes, recognized with exquisite specificity by a 'lock and key' mechanism. However, recently genome-wide exploration of regulator binding site occupancy (whether on DNA or RNA targets) revealed extensive lists of molecular targets for every studied regulator. Such poor biochemical specificity suggested that each regulator controls many genes, collectively contributing to biological phenotypes. Here, I propose a third model, whereby regulators' biological specificity is only partially due to 'lock and key' biochemistry. Rather, regulators affect many genes at the microscopic scale, but biological consequences for most interactions are attenuated at the mesoscopic scale: only a few regulatory events propagate from microscopic to macroscopic scale; others are made inconsequential by homeostatic mechanisms. This model is well supported by the microRNA literature, and data suggest that it extends to other regulators. It reconciles contradicting observations from biochemistry and comparative genomics on one hand and in vivo genetics on the other hand, but this conceptual unification is obscured by common misconceptions and counter-intuitive modes of graphical display. Profound understanding of gene regulation requires conceptual clarification, and better suited statistical analyses and graphical representation. (© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2024
- Full Text
- View/download PDF
15. Mutant fate in spatially structured populations on graphs: Connecting models to experiments.
- Author
- Abbara A, Pagani L, García-Pareja C, and Bitbol AF
- Subjects
- Biological Evolution, Models, Genetic, Selection, Genetic genetics, Computer Simulation, Genetics, Population, Mutation, Computational Biology methods
- Abstract
In nature, most microbial populations have complex spatial structures that can affect their evolution. Evolutionary graph theory predicts that some spatial structures modelled by placing individuals on the nodes of a graph affect the probability that a mutant will fix. Evolution experiments are beginning to explicitly address the impact of graph structures on mutant fixation. However, the assumptions of evolutionary graph theory differ from the conditions of modern evolution experiments, making the comparison between theory and experiment challenging. Here, we aim to bridge this gap by using our new model of spatially structured populations. This model considers connected subpopulations that lie on the nodes of a graph, and allows asymmetric migrations. It can handle large populations, and explicitly models serial passage events with migrations, thus closely mimicking experimental conditions. We analyze recent experiments in light of this model. We suggest useful parameter regimes for future experiments, and we make quantitative predictions for these experiments. In particular, we propose experiments to directly test our recent prediction that the star graph with asymmetric migrations suppresses natural selection and can accelerate mutant fixation or extinction, compared to a well-mixed population. Competing Interests: The authors have declared that no competing interests exist. (Copyright: © 2024 Abbara et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
- Published
- 2024
- Full Text
- View/download PDF
16. Reversions mask the contribution of adaptive evolution in microbiomes.
- Author
- Torrillo PA and Lieberman TD
- Subjects
- Selection, Genetic, Genome, Bacterial, Microbiota genetics, Gastrointestinal Microbiome genetics, Bacteroides genetics, Adaptation, Physiological genetics, Models, Genetic, Bacteria genetics, Bacteria classification, Mutation, Evolution, Molecular
- Abstract
When examining bacterial genomes for evidence of past selection, the results depend heavily on the mutational distance between chosen genomes. Even within a bacterial species, genomes separated by larger mutational distances exhibit stronger evidence of purifying selection as assessed by dN/dS, the normalized ratio of nonsynonymous to synonymous mutations. Here, we show that the classical interpretation of this scale dependence, weak purifying selection, leads to problematic mutation accumulation when applied to available gut microbiome data. We propose an alternative, adaptive reversion model with opposite implications for dynamical intuition and applications of dN/dS. Reversions that occur and sweep within-host populations are nearly guaranteed in microbiomes due to large population sizes, short generation times, and variable environments. Using analytical and simulation approaches, we show that adaptive reversion can explain the dN/dS decay given only dozens of locally fluctuating selective pressures, which is realistic in the context of Bacteroides genomes. The success of the adaptive reversion model argues for interpreting low values of dN/dS obtained from long timescales with caution as they may emerge even when adaptive sweeps are frequent. Our work thus inverts the interpretation of an old observation in bacterial evolution, illustrates the potential of mutational reversions to shape genomic landscapes over time, and highlights the importance of studying bacterial genomic evolution on short timescales. Competing Interests: PT, TL: No competing interests declared. (© 2024, Torrillo and Lieberman.)
- Published
- 2024
- Full Text
- View/download PDF
17. Random-Effects Substitution Models for Phylogenetics via Scalable Gradient Approximations.
- Author
- Magee AF, Holbrook AJ, Pekar JE, Caviedes-Solis IW, Matsen IV FA, Baele G, Wertheim JO, Ji X, Lemey P, and Suchard MA
- Subjects
- SARS-CoV-2 genetics, SARS-CoV-2 classification, Influenza A Virus, H3N2 Subtype genetics, Influenza A Virus, H3N2 Subtype classification, Models, Genetic, Markov Chains, Bayes Theorem, Phylogeny, Classification methods
- Abstract
Phylogenetic and discrete-trait evolutionary inference depend heavily on an appropriate characterization of the underlying character substitution process. In this paper, we present random-effects substitution models that extend common continuous-time Markov chain models into a richer class of processes capable of capturing a wider variety of substitution dynamics. As these random-effects substitution models often require many more parameters than their usual counterparts, inference can be both statistically and computationally challenging. Thus, we also propose an efficient approach to compute an approximation to the gradient of the data likelihood with respect to all unknown substitution model parameters. We demonstrate that this approximate gradient enables scaling of sampling-based inference, namely Bayesian inference via Hamiltonian Monte Carlo, under random-effects substitution models across large trees and state-spaces. Applied to a dataset of 583 SARS-CoV-2 sequences, an HKY model with random-effects shows strong signals of nonreversibility in the substitution process, and posterior predictive model checks clearly show that it is a more adequate model than a reversible model. When analyzing the pattern of phylogeographic spread of 1441 influenza A virus (H3N2) sequences between 14 regions, a random-effects phylogeographic substitution model infers that air travel volume adequately predicts almost all dispersal rates. A random-effects state-dependent substitution model reveals no evidence for an effect of arboreality on the swimming mode in the tree frog subfamily Hylinae. Simulations reveal that random-effects substitution models can accommodate both negligible and radical departures from the underlying base substitution model. We show that our gradient-based inference approach is over an order of magnitude more time efficient than conventional approaches. (© The Author(s) 2024. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.)
- Published
- 2024
18. A general and efficient representation of ancestral recombination graphs.
- Author
-
Wong Y, Ignatieva A, Koskela J, Gorjanc G, Wohns AW, and Kelleher J
- Subjects
- Evolution, Molecular, Software, Genome, Humans, Recombination, Genetic, Models, Genetic
- Abstract
As a result of recombination, adjacent nucleotides can have different paths of genetic inheritance and therefore the genealogical trees for a sample of DNA sequences vary along the genome. The structure capturing the details of these intricately interwoven paths of inheritance is referred to as an ancestral recombination graph (ARG). Classical formalisms have focused on mapping coalescence and recombination events to the nodes in an ARG. However, this approach is out of step with some modern developments, which do not represent genetic inheritance in terms of these events or explicitly infer them. We present a simple formalism that defines an ARG in terms of specific genomes and their intervals of genetic inheritance, and show how it generalizes these classical treatments and encompasses the outputs of recent methods. We discuss nuances arising from this more general structure, and argue that it forms an appropriate basis for a software standard in this rapidly growing field., Competing Interests: Conflicts of interest: The author(s) declare no conflicts of interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
19. On the evolutionary origin of discrete phenotypic plasticity.
- Author
-
Sakamoto T and Innan H
- Subjects
- Biological Evolution, Evolution, Molecular, Genotype, Adaptation, Physiological genetics, Phenotype, Gene Regulatory Networks, Models, Genetic
- Abstract
Phenotypic plasticity provides an attractive strategy for adapting to various environments, but the evolutionary mechanism of the underlying genetic system is poorly understood. We use a simple gene regulatory network model to explore how a species acquires phenotypic plasticity, particularly focusing on discrete phenotypic plasticity, which has been difficult to explain by quantitative genetic models. Our approach employs a population genetic framework that integrates the developmental process, where each individual undergoes growth to develop its phenotype, which subsequently becomes subject to selection pressures. Our model considers two alternative types of environments, with the gene regulatory network including a sensor gene that turns on and off depending on the type of environment. With this assumption, we demonstrate that the system gradually adapts by acquiring the ability to produce two distinct optimum phenotypes under two types of environments without changing genotype, resulting in phenotypic plasticity. We find that the resulting plasticity is often discrete after a lengthy period of evolution. Our results suggest that gene regulatory networks have a notable capacity to flexibly produce various phenotypes in response to environmental changes. This study also shows that the evolutionary dynamics of phenotype may differ significantly between mechanistic-based developmental models and quantitative genetics models, suggesting the utility of incorporating gene regulatory networks into evolutionary models., Competing Interests: Conflicts of interest The authors declare no competing interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
20. Selection with two alleles of X-linkage and its application to the fitness component analysis of OdsH in Drosophila.
- Author
-
Sun S, Ting CT, and Wu CI
- Subjects
- Animals, Female, Male, Alleles, Gene Frequency, Genes, X-Linked, Genetic Fitness, Genetic Linkage, Models, Genetic, Selection, Genetic, X Chromosome genetics, Drosophila melanogaster genetics, Drosophila Proteins genetics, Homeodomain Proteins genetics
- Abstract
In organisms with the XY sex-determination system, there is an imbalance in the inheritance and transmission of the X chromosome between males and females. Unlike an autosomal allele, an X-linked recessive allele in a female will have phenotypic effects on its male counterpart. Thus, genes located on the X chromosome are of particular interest to researchers in molecular evolution and genetics. Here we present a model for selection with two alleles of X-linkage to understand fitness components associated with genes on the X chromosome. We apply this model to the fitness analysis of an X-linked gene, OdsH (16D), in the fruit fly Drosophila melanogaster. The function of OdsH is involved in sperm production and the gene is rapidly evolving under positive selection. Using site-directed gene targeting, we generated functional and defective OdsH variants tagged with the eye-color marker gene white. We compare the allele frequency changes of the two OdsH variants, each directly competing against a wild-type OdsH allele in concurrent but separate experimental populations. After 20 generations, the two genetically modified OdsH variants displayed a 40% difference in allele frequencies, with the functional OdsH variant demonstrating an advantage over the defective variant. Using maximum likelihood estimation, we determined the fitness components associated with the OdsH alleles in males and females. Our analysis revealed functional aspects of the fitness determinants associated with OdsH, and that sex-specific fertility and viability consequences both contribute to selection on an X-linked gene., Competing Interests: Conflicts of interest The authors declare no conflicts of interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
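The abstract of entry 20 describes selection on an X-linked locus with two alleles. As a rough illustration only, the sketch below iterates the standard textbook recursion for viability selection on an X-linked locus. It is not the authors' model, which also separates fertility and viability components and is fitted by maximum likelihood. The function names and the fitness values are assumptions made for this example.

```python
def x_linked_step(p_f, p_m, w_females, w_males):
    """One generation of viability selection on an X-linked locus (alleles A and a).

    p_f, p_m  : frequency of A among eggs and among X-bearing sperm
    w_females : (w_AA, w_Aa, w_aa) relative fitnesses of female genotypes
    w_males   : (w_A, w_a) relative fitnesses of hemizygous males
    Returns the frequency of A in eggs and in X-bearing sperm of the next generation.
    """
    w_AA, w_Aa, w_aa = w_females
    w_A, w_a = w_males
    q_f, q_m = 1.0 - p_f, 1.0 - p_m

    # Daughters receive one X from each parent.
    f_AA = p_f * p_m
    f_Aa = p_f * q_m + q_f * p_m
    f_aa = q_f * q_m
    mean_w_f = w_AA * f_AA + w_Aa * f_Aa + w_aa * f_aa
    next_p_f = (w_AA * f_AA + 0.5 * w_Aa * f_Aa) / mean_w_f

    # Sons receive their single X from the mother.
    mean_w_m = w_A * p_f + w_a * q_f
    next_p_m = w_A * p_f / mean_w_m
    return next_p_f, next_p_m


def simulate(p_f, p_m, w_females, w_males, generations):
    """Iterate the recursion; returns a list of (p_f, p_m) pairs, one per generation."""
    trajectory = [(p_f, p_m)]
    for _ in range(generations):
        p_f, p_m = x_linked_step(p_f, p_m, w_females, w_males)
        trajectory.append((p_f, p_m))
    return trajectory


if __name__ == "__main__":
    # Hypothetical fitness values, chosen only for illustration.
    trajectory = simulate(0.5, 0.5, (1.0, 0.95, 0.9), (1.0, 0.8), generations=20)
    print(trajectory[-1])
```

Tracking eggs and sperm separately matters here because the allele frequency in sons depends only on the mothers, which is the inheritance imbalance the abstract refers to.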
21. Impacts of pleiotropy and migration on repeated genetic adaptation.
- Author
-
Battlay P, Yeaman S, and Hodgins KA
- Subjects
- Mutation, Selection, Genetic, Adaptation, Physiological genetics, Animals, Evolution, Molecular, Animal Migration, Genetic Fitness, Genetic Pleiotropy, Models, Genetic
- Abstract
Observations of genetically repeated evolution (repeatability) in complex organisms are incongruent with the Fisher-Orr model, which implies that repeated use of the same gene should be rare when mutations are pleiotropic (i.e. affect multiple traits). When spatially divergent selection occurs in the presence of migration, mutations of large effect are more strongly favored, and hence, repeatability is more likely, but it is unclear whether this observation is limited by pleiotropy. Here, we explore this question using individual-based simulations of a two-patch model incorporating multiple quantitative traits governed by mutations with pleiotropic effects. We explore the relationship between fitness trade-offs and repeatability by varying the alignment between mutation effect and spatial variation in trait optima. While repeatability decreases with increasing trait dimensionality, trade-offs in mutation effects on traits do not strongly limit the contribution of a locus of large effect to repeated adaptation, particularly under increased migration. These results suggest that repeatability will be more pronounced for local rather than global adaptation. Whereas pleiotropy limits repeatability in a single-population model, when there is local adaptation with gene flow, repeatability can occur if some loci are able to produce alleles of large effect, even when there are pleiotropic trade-offs., Competing Interests: Conflicts of interest: The authors declare no conflict of interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
22. Genetic constitution and variability in synthetic populations of intermediate wheatgrass, an outcrossing perennial grain crop.
- Author
-
Bajgain P, Jungers JM, and Anderson JA
- Subjects
- Edible Grain genetics, Genome, Plant, Genetics, Population, Genotype, Phenotype, Crops, Agricultural genetics, Models, Genetic, Genetic Variation, Linkage Disequilibrium
- Abstract
Intermediate wheatgrass (IWG) is a perennial grass that produces nutritious grain while offering substantial ecosystem services. Commercial varieties of this crop are mostly synthetic panmictic populations that are developed by intermating a few selected individuals. As development and generation advancement of these synthetic populations is a multiyear process, earlier synthetic generations are tested by the breeders and subsequent generations are released to the growers. A comparison of generations within IWG synthetic cultivars is currently lacking. In this study, we used simulation models and genomic prediction to analyze population differences and trends of genetic variance in 4 synthetic generations of MN-Clearwater, a commercial cultivar released by the University of Minnesota. Little to no differences were observed among the 4 generations for population genetic, genetic kinship, and genome-wide marker relationships measured via linkage disequilibrium. A reduction in genetic variance was observed when 7 parents were used to generate synthetic populations while using 20 led to the best possible outcome in determining population variance. Genomic prediction of plant height, free threshing ability, seed mass, and grain yield among the 4 synthetic generations showed a few significant differences among the generations, yet the differences in values were negligible. Based on these observations, we make 2 major conclusions: (1) the earlier and latter synthetic generations of IWG are mostly similar to each other with minimal differences and (2) using 20 genotypes to create synthetic populations is recommended to sustain ample genetic variance and trait expression among all synthetic generations., Competing Interests: Conflicts of interest The authors declare no conflicts of interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
23. Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications.
- Author
-
Redelings BD, Holmes I, Lunter G, Pupko T, and Anisimova M
- Subjects
- Humans, Models, Genetic, Computational Biology methods, Animals, Genomics methods, INDEL Mutation, Evolution, Molecular
- Abstract
Insertions and deletions constitute the second most important source of natural genomic variation. Insertions and deletions make up to 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic rearrangements, adaptation, and speciation. Recent advances in long-read sequencing technologies allow detailed inference of insertions and deletion variation in species and populations. Yet, despite their importance, evolutionary studies have traditionally ignored or mishandled insertions and deletions due to a lack of comprehensive methodologies and statistical models of insertions and deletion dynamics. Here, we discuss methods for describing insertions and deletion variation and modeling insertions and deletions over evolutionary time. We provide practical advice for tackling insertions and deletions in genomic sequences and illustrate our discussion with examples of insertions and deletion-induced effects in human and other natural populations and their contribution to evolutionary processes. We outline promising directions for future developments in statistical methodologies that would allow researchers to analyze insertions and deletion variation and their effects in large genomic data sets and to incorporate insertions and deletions in evolutionary inference., (© The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.)
- Published
- 2024
24. Testing times: disentangling admixture histories in recent and complex demographies using ancient DNA.
- Author
-
Williams MP, Flegontov P, Maier R, and Huber CD
- Subjects
- Humans, Genetics, Population methods, Gene Flow, Polymorphism, Single Nucleotide, Genome, Human, Evolution, Molecular, DNA, Ancient analysis, Models, Genetic
- Abstract
Our knowledge of human evolutionary history has been greatly advanced by paleogenomics. Since the 2020s, the study of ancient DNA has increasingly focused on reconstructing the recent past. However, the accuracy of paleogenomic methods in resolving questions of historical and archaeological importance amidst the increased demographic complexity and decreased genetic differentiation remains an open question. We evaluated the performance and behavior of two commonly used methods, qpAdm and the f3-statistic, on admixture inference under a diversity of demographic models and data conditions. We used two complementary simulation approaches: first, we explored a wide demographic parameter space under four simple demographic models of varying complexities and configurations using branch-length data from two chromosomes; second, we analyzed a model of Eurasian history composed of 59 populations using whole-genome data modified with ancient DNA conditions such as SNP ascertainment, data missingness, and pseudohaploidization. We observe that population differentiation is the primary factor driving qpAdm performance. Notably, while complex gene flow histories influence which models are classified as plausible, they do not reduce overall performance. Under conditions reflective of the historical period, qpAdm most frequently identifies the true model as plausible among a small candidate set of closely related populations. To increase the utility for resolving fine-scaled hypotheses, we provide a heuristic for further distinguishing between candidate models that incorporates qpAdm model P-values and f3-statistics. Finally, we demonstrate a significant performance increase for qpAdm using whole-genome branch-length f2-statistics, highlighting the potential for improved demographic inference that could be achieved with future advancements in f-statistic estimations., Competing Interests: Conflicts of interest: The authors declare no conflicts of interest., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
25. Modeling the evolution of Schizosaccharomyces pombe populations with multiple killer meiotic drivers.
- Author
-
López Hernández JF, Rubinstein BY, Unckless RL, and Zanders SE
- Subjects
- Genome, Fungal, Schizosaccharomyces genetics, Meiosis genetics, Evolution, Molecular, Models, Genetic
- Abstract
Meiotic drivers are selfish genetic loci that can be transmitted to more than half of the viable gametes produced by a heterozygote. This biased transmission gives meiotic drivers an evolutionary advantage that can allow them to spread over generations until all members of a population carry the driver. This evolutionary power can also be exploited to modify natural populations using synthetic drivers known as "gene drives." Recently, it has become clear that natural drivers can spread within genomes to birth multicopy gene families. To understand intragenomic spread of drivers, we model the evolution of 2 or more distinct meiotic drivers in a population. We employ the wtf killer meiotic drivers from Schizosaccharomyces pombe, which are multicopy in all sequenced isolates, as models. We find that a duplicate wtf driver identical to the parent gene can spread in a population unless, or until, the original driver is fixed. When the duplicate driver diverges to be distinct from the parent gene, we find that both drivers spread to fixation under most conditions, but both drivers can be lost under some conditions. Finally, we show that stronger drivers make weaker drivers go extinct in most, but not all, polymorphic populations with absolutely linked drivers. These results reveal the strong potential for natural meiotic drive loci to duplicate and diverge within genomes. Our findings also highlight duplication potential as a factor to consider in the design of synthetic gene drives., Competing Interests: Conflicts of interest S.E.Z. is an inventor on patent application 834 serial 62/491,107 based on wtf killers., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.)
- Published
- 2024
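The opening sentences of the abstract of entry 25 describe how a killer meiotic driver gains a transmission advantage in heterozygotes. The sketch below is a minimal single-driver toy model of that idea, assuming random mating and a hypothetical parameter `kill_rate`, the fraction of driver-free spores destroyed in heterozygotes. It is not the multi-driver model analysed in the paper.

```python
def heterozygote_transmission(kill_rate):
    """Share of viable spores from a D/+ heterozygote that carry the driver D."""
    return 1.0 / (2.0 - kill_rate)


def driver_frequency_next(p, kill_rate):
    """Driver frequency after one round of random mating and meiosis.

    p         : current frequency of the driver allele D
    kill_rate : fraction of + spores killed in D/+ heterozygotes (0 = no drive)
    """
    if not 0.0 <= p <= 1.0 or not 0.0 <= kill_rate <= 1.0:
        raise ValueError("p and kill_rate must lie in [0, 1]")
    # Zygote frequencies are p^2 (D/D), 2p(1-p) (D/+) and (1-p)^2 (+/+).
    # Each zygote contributes half of its spores from each allele.
    surviving_driver = p * p + p * (1.0 - p)
    surviving_wild = (1.0 - p) ** 2 + p * (1.0 - p) * (1.0 - kill_rate)
    return surviving_driver / (surviving_driver + surviving_wild)


if __name__ == "__main__":
    p = 0.05
    for _ in range(50):
        p = driver_frequency_next(p, kill_rate=1.0)
    print(p)
```

With `kill_rate` above zero the heterozygote transmits the driver to more than half of its viable spores, so in this toy model the driver frequency rises every generation until it approaches fixation.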
26. GTRpmix: A Linked General Time-Reversible Model for Profile Mixture Models.
- Author
-
Banos H, Wong TKF, Daneau J, Susko E, Minh BQ, Lanfear R, Brown MW, Eme L, and Roger AJ
- Subjects
- Archaea genetics, Likelihood Functions, Amino Acid Substitution, Evolution, Molecular, Eukaryota genetics, Phylogeny, Models, Genetic
- Abstract
Profile mixture models capture distinct biochemical constraints on the amino acid substitution process at different sites in proteins. These models feature a mixture of time-reversible models with a common matrix of exchangeabilities and distinct sets of equilibrium amino acid frequencies known as profiles. Combining the exchangeability matrix with each profile generates the matrix of instantaneous rates of amino acid exchange for that profile. Currently, empirically estimated exchangeability matrices (e.g. the LG matrix) are widely used for phylogenetic inference under profile mixture models. However, these were estimated using a single profile and are unlikely optimal for profile mixture models. Here, we describe the GTRpmix model that allows maximum likelihood estimation of a common exchangeability matrix under any profile mixture model. We show that exchangeability matrices estimated under profile mixture models differ from the LG matrix, dramatically improving model fit and topological estimation accuracy for empirical test cases. Because the GTRpmix model is computationally expensive, we provide two exchangeability matrices estimated from large concatenated phylogenomic-supermatrices to be used for phylogenetic analyses. One, called Eukaryotic Linked Mixture (ELM), is designed for phylogenetic analysis of proteins encoded by nuclear genomes of eukaryotes, and the other, Eukaryotic and Archaeal Linked mixture (EAL), for reconstructing relationships between eukaryotes and Archaea. These matrices, combined with profile mixture models, fit data better and have improved topology estimation relative to the LG matrix combined with the same mixture models. Starting with version 2.3.1, IQ-TREE2 allows users to estimate linked exchangeabilities (i.e. amino acid exchange rates) under profile mixture models., (© The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.)
- Published
- 2024
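The abstract of entry 26 states that combining a shared exchangeability matrix with each profile yields that profile's matrix of instantaneous rates. A minimal sketch of that construction follows, using the usual time-reversible form in which the off-diagonal rate from state i to state j is the exchangeability s_ij times the equilibrium frequency of j. The three-state alphabet and all numbers are assumptions for illustration (amino acid models use 20 states), and the usual rescaling to one expected substitution per unit time is omitted.

```python
def rate_matrix(exchangeabilities, profile):
    """Build a reversible rate matrix Q from symmetric exchangeabilities and a profile.

    exchangeabilities : symmetric n x n list of lists (diagonal entries are ignored)
    profile           : list of n equilibrium frequencies summing to 1
    """
    n = len(profile)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exchangeabilities[i][j] * profile[j]
        # The diagonal is still 0.0 here, so this sets it to minus the off-diagonal sum.
        q[i][i] = -sum(q[i])
    return q


if __name__ == "__main__":
    # Hypothetical shared exchangeabilities and two hypothetical profiles.
    shared = [[0.0, 1.0, 2.0],
              [1.0, 0.0, 0.5],
              [2.0, 0.5, 0.0]]
    profiles = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
    for profile in profiles:
        for row in rate_matrix(shared, profile):
            print(row)
```

Because the exchangeabilities are shared, each mixture component differs only through its profile, which is what "linked" refers to in the abstract.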
27. A New Model and Dating for the Evolution of Complex Plastids of Red Alga Origin.
- Author
-
Pietluch F, Mackiewicz P, Ludwig K, and Gagat P
- Subjects
- Models, Genetic, Rhodophyta genetics, Rhodophyta classification, Plastids genetics, Phylogeny, Symbiosis, Evolution, Molecular
- Abstract
Complex plastids, characterized by more than two bounding membranes, still present an evolutionary puzzle for the traditional endosymbiotic theory. Unlike primary plastids that directly evolved from cyanobacteria, complex plastids originated from green or red algae. The Chromalveolata hypothesis proposes a single red alga endosymbiosis that involved the ancestor of all the Chromalveolata lineages: cryptophytes, haptophytes, stramenopiles, and alveolates. As extensive phylogenetic analyses contradict the monophyly of Chromalveolata, serial plastid endosymbiosis models were proposed, suggesting a single secondary red alga endosymbiosis within Cryptophyta, followed by subsequent plastid transfers to other chromalveolates. Our findings based on 97 plastid-encoded markers, 112 species, and robust phylogenetic methods challenge all the existing models. They reveal two independent secondary endosymbioses, one within Cryptophyta and one within stramenopiles, precisely the phylum Ochrophyta, with two different groups of red algae. Consequently, we propose a new model for the emergence of red alga plastid-containing lineages and, through molecular clock analyses, estimate their ages., (© The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.)
- Published
- 2024
28. Segregation GWAS to linearize a non-additive locus with incomplete penetrance: an example of horn status in sheep.
- Author
-
Duijvesteijn N, van der Werf JHJ, and Kinghorn BP
- Subjects
- Animals, Female, Male, Sheep genetics, Genotype, Models, Genetic, Pedigree, Alleles, Genome-Wide Association Study methods, Genome-Wide Association Study veterinary, Penetrance, Polymorphism, Single Nucleotide, Phenotype, Horns
- Abstract
Background: The objective of this study was to introduce a genome-wide association study (GWAS) in conjunction with segregation analysis on monogenic categorical traits. Genotype probabilities, calculated from phenotypes, mode of inheritance and pedigree information, are expressed as the expected allele count (EAC) (range 0 to 2), and are inherited additively, by definition, unlike the original phenotypes, which are non-additive and could be of incomplete penetrance. The EAC are regressed on the single nucleotide polymorphism (SNP) genotypes, similar to an additive GWAS. In this study, horn status in Merino sheep, a trait believed to be monogenic and affected by dominance, sex-dependent expression and probably incomplete penetrance, is used to illustrate the advantages of the segregation GWAS. We also used simulation to investigate whether incomplete penetrance can cause prediction errors in Merino sheep for horn status., Results: Estimated penetrance values differed between the sexes, where males showed almost complete penetrance, especially for horned and polled phenotypes, while females had low penetrance values for the horned status. This suggests that females homozygous for the 'horned allele' have a horned phenotype in only 22% of the cases while 78% will be knobbed or have scurs. The GWAS using EAC on 4001 animals and 510,174 SNP genotypes from the Illumina Ovine high-density (600k) chip gave a stronger association compared to using actual phenotypes. The correlation between the EAC and the allele count of the SNP with the highest -log10(p-value) was 0.73 in males and 0.67 in females. Simulations using penetrance values found by the segregation analyses resulted in higher correlations between the EAC and the causative mutation (0.95 for males and 0.89 for females, respectively), suggesting that the most predictive SNP is not in full LD with the causative mutation., Conclusions: Our results show clear differences in penetrance values between male and female Merino sheep for horn status. Segregation analysis for a trait with mutually exclusive phenotypes, non-additive inheritance, and/or incomplete penetrance can lead to considerably more power in a GWAS because the linearized genotype probabilities are additive and can accommodate incomplete penetrance. This method can be extended to any monogenic controlled categorical trait of which the phenotypes are mutually exclusive., (© 2024. The Author(s).)
- Published
- 2024
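The abstract of entry 28 describes converting genotype probabilities into an expected allele count (EAC, range 0 to 2) and regressing the EAC on SNP genotypes. The sketch below illustrates only those two steps with ordinary least squares. The genotype probabilities are assumed to come from a segregation analysis that is not implemented here, and the function names and data are hypothetical.

```python
def expected_allele_count(genotype_probs):
    """EAC from (P(0 copies), P(1 copy), P(2 copies)) of the allele of interest."""
    p0, p1, p2 = genotype_probs
    if abs(p0 + p1 + p2 - 1.0) > 1e-9:
        raise ValueError("genotype probabilities must sum to 1")
    return p1 + 2.0 * p2


def regress(y, x):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical genotype probabilities for five animals (not real data).
    probs = [(0.9, 0.1, 0.0), (0.2, 0.6, 0.2), (0.0, 0.3, 0.7),
             (0.5, 0.5, 0.0), (0.1, 0.4, 0.5)]
    eac = [expected_allele_count(p) for p in probs]
    snp = [0, 1, 2, 0, 2]  # hypothetical allele counts at one SNP
    print(regress(eac, snp))
```

In a genome-wide scan the same regression would be repeated for every SNP, with a test statistic computed per marker; that part is left out of this sketch.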
29. Overcoming CRISPR-Cas9 off-target prediction hurdles: A novel approach with ESB rebalancing strategy and CRISPR-MCA model.
- Author
-
Yang Y, Zheng Y, Zou Q, Li J, and Feng H
- Subjects
- Humans, Deep Learning, Neural Networks, Computer, Models, Genetic, CRISPR-Cas Systems genetics, Computational Biology methods, Gene Editing methods
- Abstract
The off-target activities within the CRISPR-Cas9 system remain a formidable barrier to its broader application and development. Recent advancements have highlighted the potential of deep learning models in predicting these off-target effects, yet they encounter significant hurdles including imbalances within datasets and the intricacies associated with encoding schemes and model architectures. To surmount these challenges, our study innovatively introduces an Efficiency and Specificity-Based (ESB) class rebalancing strategy, specifically devised for datasets featuring mismatches-only off-target instances, marking a pioneering approach in this realm. Furthermore, through a meticulous evaluation of various One-hot encoding schemes alongside numerous hybrid neural network models, we discern that encoding and models of moderate complexity ideally balance performance and efficiency. On this foundation, we advance a novel hybrid model, the CRISPR-MCA, which capitalizes on multi-feature extraction to enhance predictive accuracy. The empirical results affirm that the ESB class rebalancing strategy surpasses five conventional methods in addressing extreme dataset imbalances, demonstrating superior efficacy and broader applicability across diverse models. Notably, the CRISPR-MCA model excels in off-target effect prediction across four distinct mismatches-only datasets and significantly outperforms contemporary state-of-the-art models in datasets comprising both mismatches and indels. In summation, the CRISPR-MCA model, coupled with the ESB rebalancing strategy, offers profound insights and a robust framework for future explorations in this field., Competing Interests: The authors have declared that no competing interests exist., (Copyright: © 2024 Yang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
- Published
- 2024
30. Evolution of a bistable genetic system in fluctuating and nonfluctuating environments.
- Author
-
Fernández-Fernández R, Olivenza DR, Weyer E, Singh A, Casadesús J, and Antonia Sánchez-Romero M
- Subjects
- Epigenesis, Genetic, Operon genetics, Environment, Phenotype, Evolution, Molecular, Biological Evolution, Bacteriophages genetics, Models, Genetic, Mutation, Gene Expression Regulation, Bacterial, Salmonella enterica genetics
- Abstract
Epigenetic mechanisms can generate bacterial lineages capable of spontaneously switching between distinct phenotypes. Currently, mathematical models and simulations propose epigenetic switches as a mechanism of adaptation to deal with fluctuating environments. However, bacterial evolution experiments for testing these predictions are lacking. Here, we exploit an epigenetic switch in Salmonella enterica, the opvAB operon, to show clear evidence that OpvAB bistability persists in changing environments but not in stable conditions. Epigenetic control of transcription in the opvAB operon produces OpvAB^OFF (phage-sensitive) and OpvAB^ON (phage-resistant) cells in a reversible manner and may be interpreted as an example of bet-hedging to preadapt Salmonella populations to the encounter with phages. Our experimental observations and computational simulations illustrate the adaptive value of epigenetic variation as an evolutionary strategy for mutation avoidance in fluctuating environments. In addition, our study provides experimental support to game theory models predicting that phenotypic heterogeneity is advantageous in changing and unpredictable environments., Competing Interests: Competing interests statement: The authors declare no competing interest.
- Published
- 2024
31. Fission as a source of variation for group selection.
- Author
-
Simon B, Ispolatov Y, and Doebeli M
- Subjects
- Biological Evolution, Models, Genetic, Animals, Stochastic Processes, Selection, Genetic, Reproduction
- Abstract
Without heritable variation natural selection cannot effect evolutionary change. In the case of group selection, there must be variation in the population of groups. Where does this variation come from? One source of variation is from the stochastic birth-death processes that occur within groups. This is where variation between groups comes from in most mathematical models of group selection. Here, we argue that another important source of variation between groups is fission, the (generally random) group-level reproduction where parent groups split into two or more offspring groups. We construct a simple model of the fissioning process with a parameter that controls how much variation is produced among the offspring groups. We then illustrate the effect of that parameter with some examples. In most models of group selection in the literature, no variation is produced during group reproduction events; that is, groups "clone" themselves when they reproduce. Fission is often a more biologically realistic method of group reproduction, and it can significantly increase the efficacy of group selection., (© The Author(s) 2024. Published by Oxford University Press on behalf of The Society for the Study of Evolution (SSE). All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site; for further information please contact journals.permissions@oup.com.)
- Published
- 2024
32. Spectral neural approximations for models of transcriptional dynamics.
- Author
-
Gorin G, Carilli M, Chari T, and Pachter L
- Subjects
- Neural Networks, Computer, Models, Genetic, Transcription, Genetic
- Abstract
The advent of high-throughput transcriptomics provides an opportunity to advance mechanistic understanding of transcriptional processes and their connections to cellular function at an unprecedented, genome-wide scale. These transcriptional systems, which involve discrete stochastic events, are naturally modeled using chemical master equations (CMEs), which can be solved for probability distributions to fit biophysical rates that govern system dynamics. While CME models have been used as standards in fluorescence transcriptomics for decades to analyze single-species RNA distributions, there are often no closed-form solutions to CMEs that model multiple species, such as nascent and mature RNA transcript counts. This has prevented the application of standard likelihood-based statistical methods for analyzing high-throughput, multi-species transcriptomic datasets using biophysical models. Inspired by recent work in machine learning to learn solutions to complex dynamical systems, we leverage neural networks and statistical understanding of system distributions to produce accurate approximations to a steady-state bivariate distribution for a model of the RNA life cycle that includes nascent and mature molecules. The steady-state distribution of this simple model has no closed-form solution and requires intensive numerical solving techniques; our approach reduces likelihood evaluation time by several orders of magnitude. We demonstrate two approaches, whereby solutions are approximated by 1) learning the weights of kernel distributions with constrained parameters or 2) learning both weights and scaling factors for parameters of kernel distributions. We show that our strategies, denoted by kernel weight regression and parameter-scaled kernel weight regression, respectively, enable broad exploration of parameter space and can be used in existing likelihood frameworks to infer transcriptional burst sizes, RNA splicing rates, and mRNA degradation rates from experimental transcriptomic data., Competing Interests: Declaration of interests The authors declare no competing interests., (Copyright © 2024. Published by Elsevier Inc.)
- Published
- 2024
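The abstract above refers to a stochastic model of nascent and mature RNA whose steady state has no closed form. As a rough illustration only, the sketch below runs a Gillespie simulation of one common bursty two-species model. The reaction scheme (bursts at rate `k` with geometric size of mean `b`, splicing at rate `beta`, degradation at rate `gamma`) and all names are assumptions chosen for illustration, not the authors' code or their neural-network approximation.

```python
import math
import random


def simulate_bursty_rna(k, b, beta, gamma, t_end, seed=0):
    """Gillespie simulation of an assumed bursty nascent/mature RNA model.

    Returns the time-averaged (nascent, mature) copy numbers over [0, t_end].
    """
    rng = random.Random(seed)
    p = 1.0 / (1.0 + b)  # geometric parameter giving mean burst size b
    t, nascent, mature = 0.0, 0, 0
    area_n = area_m = 0.0
    while t < t_end:
        # Propensities: burst, splicing (nascent -> mature), degradation.
        rates = (k, beta * nascent, gamma * mature)
        total = sum(rates)
        dt = min(rng.expovariate(total), t_end - t)
        area_n += nascent * dt
        area_m += mature * dt
        t += dt
        if t >= t_end:
            break
        u = rng.random() * total
        if u < rates[0]:
            # Geometric burst size on {0, 1, 2, ...} by inverse transform.
            nascent += int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
        elif u < rates[0] + rates[1]:
            nascent -= 1
            mature += 1
        else:
            mature -= 1
    return area_n / t_end, area_m / t_end


mean_nascent, mean_mature = simulate_bursty_rna(1.0, 5.0, 1.0, 0.5, 5000.0, seed=1)
print(mean_nascent, mean_mature)
```

Under these assumptions the steady-state means are `k * b / beta` for nascent and `k * b / gamma` for mature RNA, which gives a simple sanity check on the simulation; the full joint distribution is the quantity that requires numerical or learned approximations.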
33. On the importance of scale in evolutionary quantitative genetics.
- Author
-
Hansen TF, Holstad A, Houle D, and Pélabon C
- Subjects
- Genetic Variation, Quantitative Trait, Heritable, Biological Evolution, Models, Genetic
- Abstract
The informed use of scales and units in evolutionary quantitative genetics is often neglected, and naïve standardizations can cause misinterpretations of empirical results. A potentially influential example of such neglect can be found in the recent book by Arnold (2023. Evolutionary quantitative genetics. Oxford University Press). There, Arnold championed the use of heritability over mean-scaled genetic variance as a measure of evolutionary potential, arguing that mean-scaled genetic variances are correlated with trait means while heritabilities are not. Here, we show that Arnold's empirical result is an artifact of ignoring the units in which traits are measured. More importantly, Arnold's argument mistakenly assumes that the goal of mean scaling is to remove the relationship between mean and variance. In our view, the purpose of mean scaling is to put traits with different units on a common scale that makes evolutionary changes, or their potential, readily interpretable and comparable in terms of proportions of the mean. (© The Author(s) 2024. Published by Oxford University Press on behalf of The Society for the Study of Evolution (SSE). All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site; for further information please contact journals.permissions@oup.com.)
- Published
- 2024
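The two standardizations contrasted in the abstract above can be written down in a few lines. The sketch below assumes the usual textbook definitions, heritability as additive variance over phenotypic variance and mean-scaled variance as additive variance over the squared trait mean. The function names and the numbers are hypothetical and are not taken from the paper.

```python
import math


def heritability(additive_variance, phenotypic_variance):
    """Variance-standardized measure: V_A / V_P."""
    return additive_variance / phenotypic_variance


def mean_scaled_variance(additive_variance, trait_mean):
    """Mean-standardized measure: V_A / mean^2 (trait must be on a ratio scale)."""
    return additive_variance / trait_mean ** 2


# Hypothetical trait measured in cm, then the same trait expressed in mm.
v_a_cm, v_p_cm, mean_cm = 0.25, 1.0, 10.0
v_a_mm, v_p_mm, mean_mm = v_a_cm * 100, v_p_cm * 100, mean_cm * 10

print(heritability(v_a_cm, v_p_cm), heritability(v_a_mm, v_p_mm))
print(mean_scaled_variance(v_a_cm, mean_cm), mean_scaled_variance(v_a_mm, mean_mm))
```

The raw additive variance changes 100-fold with the change of unit, whereas both standardized quantities stay the same. The mean-scaled value reads directly as a proportion of the squared mean, which is the interpretability argument the abstract makes.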
34. Recombination fraction in pre-recombinant inbred lines (PRERIL) - revisiting a century old problem in genetics.
- Author
-
Xu S and Osorio Y Fortéa J
- Subjects
- Chromosome Mapping, Homozygote, Models, Genetic, Genotype, Phenotype, Quantitative Trait Loci, Recombination, Genetic, Inbreeding
- Abstract
Background: Traditional recombinant inbred lines (RILs) are generated from repeated self-fertilization or brother-sister mating from the F1 hybrid of two inbred parents. Compared with the F2 population, RILs cumulate more crossovers between loci and thus increase the number of recombinants, resulting in an increased resolution of genetic mapping. Since they are inbred to the isogenic stage, another consequence of the heterozygosity reduction is the increased genetic variance and thus the increased power of QTL detection. Self-fertilization is the primary form of developing RILs in plants. Brother-sister mating is another way to develop RILs but in small laboratory animals. To ensure that the RILs have at least 98% of homozygosity, we need about seven generations of self-fertilization or 20 generations of brother-sister mating. Prior to homozygosity, these lines are called pre-recombinant inbred lines (PRERIL). Phenotypic values of traits in PRERILs are often collected but not used in QTL mapping. To perform QTL mapping in PRERILs, we need the recombination fraction between two markers at generation t for t < 7 (selfing) or t < 20 (brother-sister mating) so that the genotypes of QTL flanked by the markers can be inferred. Results: In this study, we developed formulas to calculate the recombination fractions of PRERILs at generation t in self-fertilization, brother-sister mating, and random mating. In contrast to existing works in this topic, we used computer code to construct the transition matrix to form the Markov chain of genotype array between consecutive generations, the so-called recurrent equations. Conclusions: We provide R functions to calculate the recombination fraction using the newly developed recurrent equations of ordered genotype array. With the recurrent equations and the R code, users can perform QTL mapping in PRERILs. Substantial time and effort can be saved compared with QTL mapping in RILs. (© 2024. The Author(s).)
- Published
- 2024
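Two classical quantities behind the abstract above are easy to compute: the expected homozygosity after repeated selfing, and the limiting recombinant fraction in fully inbred lines (the Haldane and Waddington results). The sketch below is not the authors' generation-by-generation recurrence or their R code. It only covers the single-locus halving of heterozygosity and the fully inbred limit, and it assumes generations are counted as rounds of selfing starting from a fully heterozygous F1.

```python
def homozygosity_after_selfing(generations):
    """Expected single-locus homozygosity after t rounds of selfing from the F1.

    Heterozygosity halves with every round of self-fertilization.
    """
    return 1.0 - 0.5 ** generations


def ril_fraction_selfing(r):
    """Limiting recombinant fraction in RILs by selfing: R = 2r / (1 + 2r)."""
    return 2.0 * r / (1.0 + 2.0 * r)


def ril_fraction_sib_mating(r):
    """Limiting recombinant fraction in RILs by sib mating: R = 4r / (1 + 6r)."""
    return 4.0 * r / (1.0 + 6.0 * r)


for t in range(1, 8):
    print(t, round(homozygosity_after_selfing(t), 4))
print(ril_fraction_selfing(0.1), ril_fraction_sib_mating(0.1))
```

With this counting convention the 98% threshold is crossed at the sixth round of selfing, that is, in the F7, which is consistent with the "about seven generations" quoted in the abstract. Values at intermediate generations are what the paper's recurrent equations provide.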
35. Polygenic risk score portability for common diseases across genetically diverse populations.
- Author
-
Moreno-Grau S, Vernekar M, Lopez-Pineda A, Mas-Montserrat D, Barrabés M, Quinto-Cortés CD, Moatamed B, Lee MTM, Yu Z, Numakura K, Matsuda Y, Wall JD, Ioannidis AG, Katsanis N, Takano T, and Bustamante CD
- Subjects
- Female, Humans, Asian People genetics, Genome-Wide Association Study, Models, Genetic, Polymorphism, Single Nucleotide, White People genetics, Black People genetics, Genetic Predisposition to Disease, Genetic Risk Score
- Abstract
Background: Polygenic risk scores (PRS) derived from European individuals have reduced portability across global populations, limiting their clinical implementation at worldwide scale. Here, we investigate the performance of a wide range of PRS models across four ancestry groups (Africans, Europeans, East Asians, and South Asians) for 14 conditions of high medical interest. Methods: To select the best-performing model per trait, we first compared PRS performances for publicly available scores, and constructed new models using different methods (LDpred2, PRS-CSx and SNPnet). We used 285 K European individuals from the UK Biobank (UKBB) for training and 18 K, including diverse ancestries, for testing. We then evaluated PRS portability for the best models in Europeans and compared their accuracies with respect to the best PRS per ancestry. Finally, we validated the selected PRS models using an independent set of 8,417 individuals from Biobank of the Americas-Genomelink (BbofA-GL), and performed a PRS-Phewas. Results: We confirmed a decay in PRS performances relative to Europeans when the evaluation was conducted using the best-PRS model for Europeans (51.3% for South Asians, 46.6% for East Asians and 39.4% for Africans). We observed an improvement in the PRS performances when specifically selecting ancestry-specific PRS models (phenotype variance increase: 1.62 for Africans, 1.40 for South Asians and 0.96 for East Asians). Additionally, when we selected the optimal model conditional on ancestry for CAD, HDL-C and LDL-C, hypertension, hypothyroidism and T2D, PRS performance for studied populations was more comparable to what was observed in Europeans. Finally, we were able to independently validate tested models for Europeans, and conducted a PRS-Phewas, identifying cross-trait interplay between cardiometabolic conditions, and between immune-mediated components. Conclusion: Our work comprehensively evaluated PRS accuracy across a wide range of phenotypes, reducing the uncertainty with respect to which PRS model to choose and in which ancestry group. This evaluation has let us identify specific conditions where implementing risk-prioritization strategies could have practical utility across diverse ancestral groups, contributing to democratizing the implementation of PRS. (© 2024. The Author(s).)
- Published
- 2024
36. Error-induced extinction in a multi-type critical birth-death process.
- Author
-
Guasch MB, Krapivsky PL, and Antal T
- Subjects
- Humans, Mathematical Concepts, Mutation Rate, Models, Biological, Neoplasms genetics, Neoplasms mortality, Neoplasms pathology, Models, Genetic, Animals, Cell Death, Extinction, Biological, Mutation
- Abstract
Extreme mutation rates in microbes and cancer cells can result in error-induced extinction (EEX), where every descendant cell eventually acquires a lethal mutation. In this work, we investigate critical birth-death processes with n distinct types as a birth-death model of EEX in a growing population. Each type-i cell divides independently, $(i) \to (i) + (i)$, or mutates, $(i) \to (i+1)$, at the same rate. The total number of cells grows exponentially as a Yule process until a cell of type-n appears, which cell type can only divide or die at rate one. This makes the whole process critical and hence after the exponentially growing phase eventually all cells die with probability one. We present large-time asymptotic results for the general n-type critical birth-death process. We find that the mass function of the number of cells of type-k has algebraic and stationary tail $(\text{size})^{-1-\chi_k}$, with $\chi_k = 2^{1-k}$, for $k = 2, \dots, n$, in sharp contrast to the exponential tail of the first type. The same exponents describe the tail of the asymptotic survival probability $(\text{time})^{-\xi_k}$. We present applications of the results for studying extinction due to intolerable mutation rates in biological populations. (© 2024. The Author(s).)
- Published
- 2024
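The tail exponents quoted in the abstract above halve with each successive type. Reading the formula as chi_k = 2^(1-k) for k = 2, ..., n (an interpretation of the flattened expression in the record), the short sketch below tabulates chi_k and the corresponding size-distribution exponent 1 + chi_k as exact fractions. The function names are hypothetical.

```python
from fractions import Fraction


def chi(k):
    """Exponent chi_k = 2**(1 - k), stated in the abstract for types k >= 2."""
    if k < 2:
        raise ValueError("the algebraic tail is stated for k >= 2 only")
    return Fraction(1, 2 ** (k - 1))


def size_tail_exponent(k):
    """Exponent of the algebraic tail (size)**-(1 + chi_k) for type-k cell counts."""
    return 1 + chi(k)


for k in range(2, 6):
    print(k, chi(k), size_tail_exponent(k))
```

For k = 2 the size tail decays with exponent 3/2, and the exponent approaches 1 as k grows, so later types have progressively heavier tails.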
37. Tautology explains evolution without variation and selection. A Comment on: 'An evolutionary process without variation and selection' (2021), by Gabora et al.
- Author
-
Zachar I, Máté J, and Számadó S
- Subjects
- Models, Biological, Models, Genetic, Selection, Genetic, Biological Evolution
- Abstract
Gabora and Steel (Gabora L, Steel M. 2021 An evolutionary process without variation and selection. J. R. Soc. Interface 18, 20210334. [doi:10.1098/rsif.2021.0334]) claim that cumulative adaptive evolution is possible without natural selection, that is, without variation and competition. To support this claim, the authors modelled a theoretical process called self-other reorganization (SOR) that envisages a population of reflexively autocatalytic sets that can accumulate beneficial changes without any form of birth, death or selection, that is without population dynamics. The authors claim that despite being non-Darwinian, adaptive evolution happens in SOR, deeming it relevant to the origin of life and to cultural evolution. We analysed SOR and the claim that it implements evolution without variation and selection. We found that the authors, by design, ignore the growth and/or degradation of autocatalytic sets or their components, assuming all effects are beneficial and all entities in SOR are identical and immutable. We prove that due to these assumptions, SOR is a trivial model of horizontal percolation of beneficial effects over a static population. We implemented an extended model of SOR including more realistic assumptions to prove that accounting for any of the ignored processes inevitably leads to conventional Darwinian dynamics. Our analysis directly challenges the authors' claims, revealing evidence of an overly fragile foundation. While the best-case scenario the authors incorrectly generalize from may be mathematically valid, stripping away their unrealistic assumptions reveals that SOR does not represent real entities (e.g. protocells) but rather models the triviality that fast horizontal diffusion of effects can effectively equalize a population. Adaptation in SOR is solely because the authors only consider beneficial effects. 
The omission of death/growth dynamics and maladaptive effects renders SOR unrealistic and its relevance to evolution, cultural or biological, questionable.
- Published
- 2024
38. (Epi)mutation Rates and the Evolution of Composite Trait Architectures.
- Author
-
Polizzi B, Calvez V, Charlat S, and Rajon E
- Subjects
- Phenotype, Selection, Genetic, Epistasis, Genetic, Mutation, Models, Genetic, Mutation Rate, Biological Evolution
- Abstract
Mutation rates vary widely along genomes and across inheritance systems. This suggests that complex traits, resulting from the contributions of multiple determinants, might be composite in terms of the underlying mutation rates. Here we investigate through mathematical modeling whether such a heterogeneity may drive changes in a trait's architecture, especially in fluctuating environments, where phenotypic instability can be beneficial. We first identify a convexity principle related to the shape of the trait's fitness function, setting conditions under which composite architectures should be adaptive or, conversely and more commonly, should be selected against. Simulations reveal, however, that applying this principle to realistic evolving populations requires taking into account pervasive epistatic interactions that take place in the system. Indeed, the fate of a mutation affecting the architecture depends on the (epi)genetic background, which itself depends on the current architecture in the population. We tackle this problem by borrowing the adaptive dynamics framework from evolutionary ecology, where it is routinely used to deal with such resident/mutant dependencies, and find that the principle excluding composite architectures generally prevails. Yet the predicted evolutionary trajectories will typically depend on the initial architecture, possibly resulting in historical contingencies. Finally, by relaxing the large population size assumption, we unexpectedly find that not only the strength of selection on a trait's architecture but also its direction depend on population size, revealing a new occurrence of the recently identified phenomenon coined "sign inversion."
- Published
- 2024
39. Unveiling Commonalities and Differences in Genetic Regulations via Two-Way Fusion.
- Author
-
Mei B, Jiang Y, and Sun Y
- Subjects
- Humans, Computational Biology methods, DNA Copy Number Variations, Lung Neoplasms genetics, Gene Expression Regulation, Gene Regulatory Networks, Models, Genetic, DNA Methylation genetics, Algorithms
- Abstract
Understanding the genetic regulation, for example, gene expressions (GEs) by copy number variations and methylations, is crucial to uncover the development and progression of complex diseases. Advancing from early studies that are mostly focused on homogeneous groups of patients, some recent studies have shifted their focus toward different patient groups, explored their commonalities and differences, and led to insightful findings. However, the analysis can be very challenging with one GE possibly regulated by multiple regulators and one regulator potentially regulating the expressions of multiple genes, leading to two distinct types of commonalities/differences in the patterns of genetic regulation. In addition, the high dimensionality of both sides of regulation poses challenges to computation. In this study, we develop a two-way fusion integrative analysis approach, which innovatively applies two fusion penalties to simultaneously identify commonalities/differences in the regulated pattern of GEs and regulating pattern of regulators, and adopt a Huber loss function to accommodate the possible data contamination. Moreover, a simple yet efficient iterative optimization algorithm is developed, which does not need to introduce any auxiliary variables and extra tuning parameters and is guaranteed to converge to a globally optimal solution. The advantages of the proposed approach are demonstrated in extensive simulations. The analysis of The Cancer Genome Atlas data on melanoma and lung cancer leads to interesting findings and satisfactory prediction performance.
- Published
- 2024
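The abstract above mentions a Huber loss used to accommodate possible data contamination. The sketch below shows the standard Huber loss only, quadratic for small residuals and linear beyond a threshold. The threshold name `delta` and its default are assumptions, and the two fusion penalties of the proposed method are not reproduced here.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    size = abs(residual)
    if size <= delta:
        return 0.5 * size * size
    return delta * (size - 0.5 * delta)


# A contaminated residual is penalized far less than under squared error.
for r in (0.5, 3.0, 30.0):
    print(r, huber_loss(r), 0.5 * r * r)
```

Because the loss grows only linearly for large residuals, a handful of outlying observations has a bounded influence on the fitted coefficients, which is the usual motivation for choosing it over squared error.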
40. Integrating environmental gradients into breeding: application of genomic reactions norms in a perennial species.
- Author
-
Papin V, Bosc A, Sanchez L, and Bouffier L
- Subjects
- Environment, Genotype, Plant Breeding methods, Forests, Genomics methods, Models, Genetic, Phenotype, Gene-Environment Interaction, Trees genetics, Trees growth & development
- Abstract
Global warming threatens the productivity of forest plantations. We propose here the integration of environmental information into a genomic evaluation scheme using individual reaction norms, to enable the quantification of resilience in forest tree improvement and conservation strategies in the coming decades. Random regression models were used to fit wood ring series, reflecting the longitudinal phenotypic plasticity of tree growth, according to various environmental gradients. The predictive ability of the models was considered to select the most relevant environmental gradient, namely a gradient derived from an ecophysiological model and combining trunk water potential and temperature. Even if the individual ranking was preserved over most of the environmental gradient, strong genotype × environment interactions were detected in the extreme unfavorable part of the gradient, which includes environmental conditions that are very likely to be more frequent in the future. Combining genomic information and longitudinal data made it possible to predict the growth of individuals in environments where they have not been observed. Phenotyping of 50% of the individuals in all the environments studied made it possible to predict the growth of the remaining 50% of individuals in all these environments with a predictive ability of 0.25. Without changing the total number of observations, adding observations in a reduced number of environments for the individuals to be predicted, while decreasing the number of individuals phenotyped in all environments, increased the predictive ability to 0.59, highlighting the importance of phenotypic data allocation. We found that genomic reaction norms are useful for the characterization and prediction of the function of genetic parameters and facilitate breeding in a climate change context. (© 2024. The Author(s), under exclusive licence to The Genetics Society.)
- Published
- 2024
41. Partitioning the Genomic Components of Behavioral Disinhibition and Substance Use (Disorder) Using Genomic Structural Equation Modeling.
- Author
-
Horwitz TB, Zorina-Lichtenwalter K, Gustavson DE, Grotzinger AD, and Stallings MC
- Subjects
- Humans, Models, Genetic, Male, Phenotype, Female, Risk-Taking, Genomics methods, Genetic Predisposition to Disease genetics, Substance-Related Disorders genetics, Genome-Wide Association Study methods, Latent Class Analysis, Impulsive Behavior
- Abstract
Externalizing behaviors encompass manifestations of risk-taking, self-regulation, aggression, sensation-/reward-seeking, and impulsivity. Externalizing research often includes substance use (SUB), substance use disorder (SUD), and other (non-SUB/SUD) "behavioral disinhibition" (BD) traits. Genome-wide and twin research have pointed to overlapping genetic architecture within and across SUB, SUD, and BD. We created single-factor measurement models, each describing SUB, SUD, or BD traits, based on mutually exclusive sets of European ancestry genome-wide association study (GWAS) statistics exploring externalizing variables. We then assessed the partitioning of genetic covariance among the three facets using correlated factors models and Cholesky decomposition. Even when the residuals for indicators relating to the same substance were correlated across the SUB and SUD factors, the two factors yielded a large correlation (rg = 0.803). BD correlated strongly with the SUD (rg = 0.774) and SUB (rg = 0.778) factors. In our initial decompositions, 33% of total BD variance remained after partialing out SUD and SUB. The majority of covariance between BD and SUB and between BD and SUD was shared across all factors, and, within these models, only a small fraction of the total variation in BD operated via an independent pathway with SUD or SUB outside of the other factor. When only nicotine/tobacco, cannabis, and alcohol were included for the SUB/SUD factors, their correlation increased to rg = 0.861; in corresponding decompositions, BD-specific variance decreased to 27%. Further research can better elucidate the properties of BD-specific variation by exploring its genetic/molecular correlates. (© 2024. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2024
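The Cholesky step described in the abstract above can be checked approximately from the three reported genetic correlations alone. The sketch below treats the three factors as standardized, orders them SUD, SUB, BD, and reads the BD-specific variance off the last diagonal element of the Cholesky factor. This is a back-of-the-envelope reconstruction under those assumptions, not the authors' genomic structural equation model.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


# Genetic correlations reported in the abstract; factor order: SUD, SUB, BD.
correlations = [
    [1.000, 0.803, 0.774],
    [0.803, 1.000, 0.778],
    [0.774, 0.778, 1.000],
]
factor = cholesky(correlations)
bd_specific_variance = factor[2][2] ** 2
print(round(bd_specific_variance, 3))
```

Under these assumptions the residual comes out at roughly one third of the BD variance, in line with the 33% quoted in the abstract.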
42. Natural Selection Across Three Generations of Americans.
- Author
-
Hugh-Jones D and Edwards T
- Subjects
- Humans, United States, Male, Female, Middle Aged, Aged, Educational Status, Fertility genetics, Models, Genetic, Selection, Genetic genetics, Multifactorial Inheritance genetics
- Abstract
We investigate natural selection on polygenic scores in the contemporary US, using the Health and Retirement Study. Across three generations, scores which correlate negatively (positively) with education are selected for (against). However, results only partially support the economic theory of fertility as an explanation for natural selection. The theory predicts that selection coefficients should be stronger among low-income, less educated, unmarried and younger parents, but these predictions are only half borne out: coefficients are larger only among low-income parents and unmarried parents. We also estimate effect sizes corrected for noise in the polygenic scores. Selection for some health traits is similar in magnitude to that for cognitive traits. (© 2024. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2024
43. Calculating Within-Pair Difference Scores in the Co-twin Control Design. Effects of Alternative Strategies.
- Author
-
Madrid-Valero JJ, Verhulst B, López-López JA, and Ordoñana JR
- Subjects
- Humans, Twins genetics, Twins, Monozygotic genetics, Computer Simulation, Research Design, Twins, Dizygotic genetics, Models, Statistical, Twin Studies as Topic methods, Models, Genetic
- Abstract
Co-twin studies are an elegant and powerful design that allows controlling for the effect of confounding variables, including genetic and a range of environmental factors. There are several approaches to carry out this design. One of the methods commonly used, when contrasting continuous variables, is to calculate difference scores between members of a twin pair on two associated variables, in order to analyse the covariation of such differences. However, information regarding whether and how the different ways of estimating within-pair difference scores may impact the results is scant. This study aimed to compare the results obtained by different methods of data transformation when performing a co-twin study and test how the magnitude of the association changes using each of those approaches. Data were simulated using a direction of causation model and by fixing the effect size of the causal path to low, medium, and high values. Within-pair difference scores were calculated as relative scores for diverse within-pair ordering conditions or absolute scores. Pearson's correlations using relative difference scores vary across the established scenarios (how twins were ordered within pairs) and these discrepancies become larger as the within-twin correlation increases. Absolute difference scores tended to produce the lowest correlation in every condition. Our results show that either using absolute difference scores or ordering twins within pairs may produce an artificial decrease in the magnitude of the studied association, obscuring the ability to detect patterns compatible with causation, which could lead to discrepancies across studies and erroneous conclusions. (© 2024. The Author(s).)
- Published
- 2024
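The effect described in the abstract above can be reproduced with a toy simulation. The sketch below generates twin pairs in which an exposure `x` has a linear effect on an outcome `y`, then correlates within-pair differences computed three ways: arbitrary twin order, twins ordered by exposure, and absolute differences. The data-generating model (shared pair-level components plus unit-variance individual noise) and all names are assumptions, far simpler than the direction-of-causation model used in the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_pairs(n_pairs, effect, seed=0):
    """Simulate (x, y) for both twins of each pair under an assumed linear model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        shared_x, shared_y = rng.gauss(0, 1), rng.gauss(0, 1)
        twins = []
        for _ in range(2):
            x = shared_x + rng.gauss(0, 1)
            y = effect * x + shared_y + rng.gauss(0, 1)
            twins.append((x, y))
        pairs.append(twins)
    return pairs


def difference_correlations(pairs):
    """Correlate within-pair differences under three scoring strategies."""
    dx = [a[0] - b[0] for a, b in pairs]
    dy = [a[1] - b[1] for a, b in pairs]
    r_arbitrary = pearson(dx, dy)
    # Put the twin with the higher exposure first in every pair.
    dx_ordered = [abs(d) for d in dx]
    dy_ordered = [e if d >= 0 else -e for d, e in zip(dx, dy)]
    r_ordered = pearson(dx_ordered, dy_ordered)
    r_absolute = pearson([abs(d) for d in dx], [abs(e) for e in dy])
    return r_arbitrary, r_ordered, r_absolute


print(difference_correlations(simulate_pairs(5000, 0.5, seed=1)))
```

In this toy setting the correlation shrinks when twins are ordered by exposure and shrinks further with absolute differences, which mirrors the direction of the attenuation reported in the abstract.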
44. A Rigorous Framework to Classify the Postduplication Fate of Paralogous Genes.
- Author
-
Kalhor R, Beslon G, Lafond M, and Scornavacca C
- Subjects
- Computer Simulation, Humans, Gene Duplication, Models, Genetic, Evolution, Molecular
- Abstract
Gene duplication has a central role in evolution; still, little is known on the fates of the duplicated copies, their relative frequency, and on how environmental conditions affect them. Moreover, the lack of rigorous definitions concerning the fate of duplicated genes hinders the development of a global vision of this process. In this paper, we present a new framework aiming at characterizing and formally differentiating the fate of duplicated genes. Our framework has been tested via simulations, where the evolution of populations has been simulated using aevol, an in silico experimental evolution platform. Our results show several patterns that confirm some of the conclusions from previous studies, while also exhibiting new tendencies; this may open up new avenues to better understand the role of duplications as a driver of evolution.
- Published
- 2024
45. Pleiotropy, epistasis and the genetic architecture of quantitative traits.
- Author
-
Mackay TFC and Anholt RRH
- Subjects
- Humans, Animals, Phenotype, Models, Genetic, Polymorphism, Genetic, Quantitative Trait, Heritable, Genetic Pleiotropy, Epistasis, Genetic, Quantitative Trait Loci
- Abstract
Pleiotropy (whereby one genetic polymorphism affects multiple traits) and epistasis (whereby non-linear interactions between genetic polymorphisms affect the same trait) are fundamental aspects of the genetic architecture of quantitative traits. Recent advances in the ability to characterize the effects of polymorphic variants on molecular and organismal phenotypes in human and model organism populations have revealed the prevalence of pleiotropy and unexpected shared molecular genetic bases among quantitative traits, including diseases. By contrast, epistasis is common between polymorphic loci associated with quantitative traits in model organisms, such that alleles at one locus have different effects in different genetic backgrounds, but is rarely observed for human quantitative traits and common diseases. Here, we review the concepts and recent inferences about pleiotropy and epistasis, and discuss factors that contribute to similarities and differences between the genetic architecture of quantitative traits in model organisms and humans. (© 2024. Springer Nature Limited.)
- Published
- 2024
46. MTML: An Efficient Multitrait Multilocus GWAS Method Based on the Cauchy Combination Test.
- Author
-
Guo H, Li T, Shi Y, and Wang X
- Subjects
- Biometry methods, Quantitative Trait Loci, Models, Genetic, Monte Carlo Method, Genome-Wide Association Study methods, Arabidopsis genetics
- Abstract
Genome-wide association study (GWAS) by measuring the joint effect of multiple loci on multiple traits has recently attracted interest, due to the decreased costs of high-throughput genotyping and phenotyping technologies. Previous studies mainly focused on either multilocus models that identify associations with a single trait or multitrait models that scan a single marker at a time. Since these types of models cannot fully utilize the association information, the powers of the tests are usually low. To potentially address this problem, we present here a multitrait multilocus (MTML) modeling framework that is implemented in three steps: (1) simplify the complex calculation; (2) reduce the model dimension; (3) integrate the joint contribution of single markers to multiple traits by Cauchy combination. The performance of MTML is evaluated and compared with three other published methods by Monte Carlo simulations. Simulation results show that MTML is more powerful for quantitative trait nucleotide detection and robust for various numbers of traits. Meanwhile, MTML can effectively control the type I error rate at a reasonable level. Real data analysis of Arabidopsis thaliana shows that MTML identifies more pleiotropic genetic associations. Therefore, we conclude that MTML is an efficient GWAS method for joint analysis of multiple quantitative traits. The R package MTML, which facilitates the implementation of the proposed method, is publicly available on GitHub https://github.com/Guohongping/MTML. (© 2024 Wiley‐VCH GmbH.)
- Published
- 2024
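The third step named in the abstract above combines per-trait evidence with a Cauchy combination. The sketch below implements the generic Cauchy combination test for a list of p-values, with optional non-negative weights. It is written independently of the MTML R package, whose internals are not described in this record, and the function name and signature are hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Each p-value is mapped to a standard Cauchy variate; the weighted mean of
    these variates is again approximately standard Cauchy under the null.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    total_weight = sum(weights)
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    return 0.5 - math.atan(statistic) / math.pi


# One marker tested against three traits (hypothetical p-values).
print(cauchy_combination([0.001, 0.20, 0.80]))
```

A practical attraction of this combination is that the result stays approximately valid under dependence between the individual tests, so correlated traits do not require an explicit correlation estimate.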
47. Artificial intelligence enables unified analysis of historical and landscape influences on genetic diversity.
- Author
-
Fonseca EM and Carstens BC
- Subjects
- Animals, Brazil, Genetics, Population, Models, Genetic, Polymorphism, Single Nucleotide, Anura genetics, Anura classification, Genetic Variation, Artificial Intelligence, Phylogeography
- Abstract
While genetic variation in any species is potentially shaped by a range of processes, phylogeography and landscape genetics are largely concerned with inferring how environmental conditions and landscape features impact neutral intraspecific diversity. However, even as both disciplines have come to utilize SNP data over the last decades, analytical approaches have remained for the most part focused on either broad-scale inferences of historical processes (phylogeography) or on more localized inferences about environmental and/or landscape features (landscape genetics). Here we demonstrate that an artificial intelligence model-based analytical framework can consider both deeper historical factors and landscape-level processes in an integrated analysis. We implement this framework using data collected from two Brazilian anurans, the Brazilian sibilator frog (Leptodactylus troglodytes) and granular toad (Rhinella granulosa). Our results indicate that historical demographic processes shape most of the genetic variation in the sibilator frog, while landscape processes primarily influence variation in the granular toad. The machine learning framework used here allows both historical and landscape processes to be considered equally, rather than requiring researchers to make an a priori decision about which factors are important. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2024 Elsevier Inc. All rights reserved.)
- Published
- 2024
48. Circular cut codes in genetic information.
- Author
-
Fimmel E, Michel CJ, and Strüngmann L
- Subjects
- Codon genetics, Evolution, Molecular, Codon Usage genetics, Archaea genetics, Nucleotides genetics, Bacteria genetics, Bacteria classification, Models, Genetic, Eukaryota genetics, Genetic Code genetics
- Abstract
In this work we present an analysis of the dinucleotide occurrences in the three codon sites 1-2, 2-3 and 1-3, based on a computation of the codon usage of three large sets of bacterial, archaeal and eukaryotic genes using the same method that identified a maximal $C^3$ self-complementary trinucleotide circular code X in genes of bacteria and eukaryotes in 1996 (Arquès and Michel, 1996). Surprisingly, two dinucleotide circular codes are identified in the codon sites 1-2 and 2-3. Furthermore, these two codes are shifted versions of each other. Moreover, the dinucleotide code in the codon site 1-3 is circular, self-complementary and contained in the projection of X onto the 1st and 3rd bases, i.e. by cutting the middle base in each codon of X. We prove several results showing that the circularity and the self-complementarity of trinucleotide codes is induced by the circularity and the self-complementarity of its dinucleotide cut codes. Finally, we present several evolutionary approaches for an emergence of trinucleotide codes from dinucleotide codes. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2024 Elsevier B.V. All rights reserved.)
- Published
- 2024
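The notions of self-complementarity and of a "cut" code used in the abstract above are simple set operations. The sketch below checks self-complementarity (closure under reverse complement) and projects a trinucleotide code onto chosen codon sites. The 20-codon set is the code X as it is commonly listed in the circular-code literature and should be treated as an assumption here, since the record itself does not enumerate it. Circularity is not tested.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Code X as commonly listed in the literature (assumption, not from this record).
X = {
    "AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
    "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC",
}


def reverse_complement(word):
    """Reverse complement of a DNA word."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def is_self_complementary(code):
    """True if the code contains the reverse complement of each of its words."""
    return all(reverse_complement(word) in code for word in code)


def cut_code(code, sites):
    """Project every word of the code onto the given 0-based sites."""
    return {"".join(word[i] for i in sites) for word in code}


projection_13 = cut_code(X, (0, 2))  # drop the middle base of each codon
print(is_self_complementary(X), sorted(projection_13))
print(is_self_complementary(projection_13))
```

The projection onto the first and third bases is the set in which, according to the abstract, the dinucleotide code observed at site 1-3 is contained.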
49. Estimation of genetic parameters for reproductive indices in sheep.
- Author
-
Senes BB, da Cruz VAR, Azevedo HC, Costa RB, and de Camargo GMF
- Subjects
- Animals, Female, Sheep genetics, Sheep physiology, Birth Weight genetics, Male, Bayes Theorem, Phenotype, Weaning, Litter Size genetics, Breeding, Body Weight genetics, Pedigree, Models, Genetic, Reproduction genetics
- Abstract
This study aimed to estimate two reproductive efficiency indices in sheep based on the ratio between litter weight (at birth and weaning) and dam weight, as well as their genetic parameters. Phenotypic and pedigree data comprising the period from 1990 to 2018 were obtained from the Santa Inês sheep database of Embrapa Tabuleiros Costeiros. For estimation of the genetic parameters of the indices, a repeatability model was applied in single- and two-trait analyses by a Bayesian approach. The mean reproductive efficiency index was 0.069 ± 0.0163 and 0.43 ± 0.0955 at birth and weaning, respectively. These values indicate that, on average, ewes give birth to 69 g of lamb per kg body weight and wean 430 g of lamb per kg body weight. Described here for the first time, the heritability estimate obtained in single- and two-trait analyses was 0.24 for the index based on birth weights and ranged from 0.13 to 0.15 for the index based on weaning weights. The estimates indicate the possibility of genetic gain by selection and are similar to those reported for reproductive traits in sheep, representing an option for selection criterion. The genetic correlation between indices was positive and moderate (0.26). The repeatability estimates were high (0.49 for the birth weight index and 0.71 for the weaning weight index). These values indicate good prediction of future performance with few observations. The weaning weight index might be a good culling criterion of females. (© 2024 Wiley‐VCH GmbH. Published by John Wiley & Sons Ltd.)
- Published
- 2024
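The arithmetic behind the indices in the abstract above (litter weight divided by dam weight, then read as grams of lamb per kg of ewe) can be sketched as follows. The function names and the example weights are hypothetical, chosen only so that the results match the reported means of 0.069 and 0.43; they are not data from the study.

```python
def reproductive_efficiency_index(litter_weight_kg, dam_weight_kg):
    """Ratio of total litter weight to dam weight (both in kg)."""
    if dam_weight_kg <= 0:
        raise ValueError("dam weight must be positive")
    return litter_weight_kg / dam_weight_kg


def grams_of_lamb_per_kg_dam(index):
    """Express the dimensionless index as g of lamb per kg of dam."""
    return index * 1000.0


# Hypothetical ewe of 50 kg with a 3.45 kg litter at birth
# and a 21.5 kg litter at weaning (illustrative values only).
birth_index = reproductive_efficiency_index(3.45, 50.0)
weaning_index = reproductive_efficiency_index(21.5, 50.0)

print(round(grams_of_lamb_per_kg_dam(birth_index)))    # 69
print(round(grams_of_lamb_per_kg_dam(weaning_index)))  # 430
```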
50. A novel application of data-consistent inversion to overcome spurious inference in genome-wide association studies.
- Author
-
Janani N, Young KA, Kinney G, Strand M, Hokanson JE, Liu Y, Butler T, and Austin E
- Subjects
- Humans, Computer Simulation, Polymorphism, Single Nucleotide, Phenotype, Models, Statistical, Genotype, Pulmonary Disease, Chronic Obstructive genetics, Genetic Variation, Genome-Wide Association Study methods, Genome-Wide Association Study statistics & numerical data, Models, Genetic
- Abstract
Genome-wide association studies (GWAS) typically use linear or logistic regression models to identify associations between phenotypes (traits) and genotypes (genetic variants) of interest. However, the use of regression with the additive assumption has potential limitations. First, the assumption of normally distributed residuals rarely holds in practice, and deviation from normality increases the Type-I error rate. Second, building a model based on such an assumption ignores genetic structures such as dominant, recessive, and protective-risk cases. Ignoring genetic variants may result in spurious conclusions about the associations between a variant and a trait. We propose an assumption-free model built upon data-consistent inversion (DCI), which is a recently developed measure-theoretic framework utilized for uncertainty quantification. This proposed DCI-derived model builds a nonparametric distribution on model inputs that propagates to the distribution of observed data without the required normality assumption of residuals in the regression model. This characteristic enables the proposed DCI-derived model to cover all genetic variants without relying on the additivity of the classic-GWAS model. Simulations and a replication GWAS with data from the COPDGene study demonstrate the ability of this model to control the Type-I error rate at least as well as the classic-GWAS (additive linear model) approach while having similar or greater power to discover variants in different genetic modes of transmission. (© 2024 Wiley Periodicals LLC.)
- Published
- 2024
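To make the contrast between the additive assumption and the dominant and recessive modes mentioned in the abstract above concrete, here is a minimal sketch of the genotype codings conventionally used in regression-based GWAS. It is not the DCI-derived model from the paper; the function name and the choice of counting minor alleles are assumptions made for illustration, and the protective-risk case is left out because the abstract does not define it.

```python
# Conventional codings of a biallelic genotype, given as the number of
# minor alleles carried (0, 1 or 2), under three modes of inheritance.
CODINGS = {
    "additive": {0: 0, 1: 1, 2: 2},   # effect scales with allele count
    "dominant": {0: 0, 1: 1, 2: 1},   # one copy is enough for the effect
    "recessive": {0: 0, 1: 0, 2: 1},  # two copies are needed for the effect
}


def encode_genotypes(minor_allele_counts, mode):
    """Map allele counts to the regression covariate for the given mode."""
    if mode not in CODINGS:
        raise ValueError(f"unknown mode: {mode!r}")
    table = CODINGS[mode]
    return [table[count] for count in minor_allele_counts]


genotypes = [0, 1, 2, 1, 0]  # hypothetical individuals at one variant
for mode in ("additive", "dominant", "recessive"):
    print(mode, encode_genotypes(genotypes, mode))
```

A regression fitted only on the additive column treats the heterozygote as exactly halfway between the two homozygotes, which is the restriction the abstract argues against.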