74 results on '"Bielawski JP"'
Search Results
2. Why are whales big? Genes behind ocean giants.
- Author
- Magpali L and Bielawski JP
- Subjects
- Animals, Oceans and Seas, Whales genetics, Neoplasms genetics
- Abstract
Gigantism is prevalent in animals, but it has never reached more extreme levels than in aquatic mammals such as whales, dolphins, and porpoises. A new study by Silva et al. has uncovered five genes underlying this gigantism, a phenotype with important connections to aging and cancer suppression in long-lived animals., Competing Interests: None are declared., (Copyright © 2023 Elsevier Ltd. All rights reserved.)
- Published
- 2023
3. The role of the ecological scaffold in the origin and maintenance of whole-group trait altruism in microbial populations.
- Author
- Jones CT, Meynell L, Neto C, Susko E, and Bielawski JP
- Subjects
- Models, Theoretical, Computer Simulation, Cooperative Behavior, Biological Evolution, Altruism
- Abstract
Background: Kin and multilevel selection provide explanations for the existence of altruism based on traits or processes that enhance the inclusive fitness of an altruist individual. Kin selection is often based on individual-level traits, such as the ability to recognize other altruists, whereas multilevel selection requires a metapopulation structure and dispersal process. These theories are unified by the general principle that altruism can be fixed by positive selection provided the benefit of altruism is preferentially conferred to other altruists. Here we take a different explanatory approach based on the recently proposed concept of an "ecological scaffold". We demonstrate that ecological conditions consisting of a patchy nutrient supply that generates a metapopulation structure, episodic mixing of groups, and severe nutrient limitation, can support or "scaffold" the evolution of altruism in a population of microbes by amplifying drift. This contrasts with recent papers in which the ecological scaffold was shown to support selective processes and demonstrates the power of scaffolding even in the absence of selection., Results: Using computer simulations motivated by a simple theoretical model, we show that, although an altruistic mutant can be fixed within a single population of non-altruists by drift when nutrients are severely limited, the resulting altruistic population remains vulnerable to non-altruistic mutants. We then show how the imposition of the "ecological scaffold" onto a population of non-altruists alters the balance between selection and drift in a way that supports the fixation and subsequent persistence of altruism despite the possibility of invasion by non-altruists., Conclusions: The fixation of an altruistic mutant by drift is possible when supported by ecological conditions that impose a metapopulation structure, episodic mixing of groups, and severe nutrient limitation. 
This is significant because it offers an alternative explanation for the evolution of altruism based on drift rather than selection. Given the ubiquity of low-nutrient "oligotrophic" environments in which microbes exist (e.g., the open ocean, deep subsurface soils, or under the polar ice caps) our results suggest that altruistic and cooperative behaviors may be highly prevalent among microbial populations., (© 2023. The Author(s).)
- Published
- 2023
4. The gastrointestinal antibiotic resistome in pediatric leukemia and lymphoma patients.
- Author
- MacDonald T, Dunn KA, MacDonald J, Langille MGI, Van Limbergen JE, Bielawski JP, and Kulkarni K
- Subjects
- Humans, Child, Anti-Bacterial Agents, Vancomycin, Genes, Bacterial, Gastrointestinal Tract microbiology, beta-Lactams, Leukemia genetics, Lymphoma genetics
- Abstract
Introduction: Most children with leukemia and lymphoma experience febrile neutropenia. These are treated with empiric antibiotics that include β-lactams and/or vancomycin. These are often administered for extended periods, and the effect on the resistome is unknown., Methods: We examined the impact of repeated courses and duration of antibiotic use on the resistome of 39 pediatric leukemia and lymphoma patients. Shotgun metagenome sequences from 127 stool samples of pediatric oncology patients were examined for abundance of antibiotic resistance genes (ARGs) in each sample. Abundances were grouped by repeated courses (no antibiotics, 1-2 courses, 3+ courses) and duration (no use, short duration, long and/or mixed duration) of β-lactams, vancomycin and "any antibiotic" use. We assessed changes in both taxonomic composition and prevalence of ARGs among these groups., Results: We found that Bacteroidetes taxa and β-lactam resistance genes decreased, while opportunistic Firmicutes and Proteobacteria taxa, along with multidrug resistance genes, increased with repeated courses and/or duration of antibiotics. Efflux pump related genes predominated (92%) among the increased multidrug genes. While we found β-lactam ARGs present in the resistome, the taxa that appear to contain them were kept in check by antibiotic treatment. Multidrug ARGs, mostly efflux pumps or regulators of efflux pump genes, were associated with opportunistic pathogens, and both increased in the resistome with repeated antibiotic use and/or increased duration., Conclusions: Given the strong association between opportunistic pathogens and multidrug-related efflux pumps, we suggest that drug efflux capacity might allow the opportunistic pathogens to persist or increase despite repeated courses and/or duration of antibiotics.
While drug efflux is the most direct explanation, other mechanisms that enhance the ability of opportunistic pathogens to handle environmental stress, or other aspects of the treatment environment, could also contribute to their ability to flourish within the gut during treatment. Persistence of opportunistic pathogens in an already dysbiotic and weakened gastrointestinal tract could increase the likelihood of life-threatening blood-borne infections. Of the 39 patients, 59% experienced at least one gastrointestinal or blood infection and 60% of bacteremias involved bacteria found in stool samples. Antimicrobial stewardship and appropriate use and duration of antibiotics could help reduce morbidity and mortality in this vulnerable population., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2023 MacDonald, Dunn, MacDonald, Langille, Van Limbergen, Bielawski and Kulkarni.)
- Published
- 2023
5. Successful Dietary Therapy in Paediatric Crohn's Disease is Associated with Shifts in Bacterial Dysbiosis and Inflammatory Metabotype Towards Healthy Controls.
- Author
- Verburgt CM, Dunn KA, Ghiboub M, Lewis JD, Wine E, Sigall Boneh R, Gerasimidis K, Shamir R, Penny S, Pinto DM, Cohen A, Bjorndahl P, Svolos V, Bielawski JP, Benninga MA, de Jonge WJ, and Van Limbergen JE
- Subjects
- Child, Humans, Bacteria genetics, Dysbiosis therapy, Escherichia coli, Firmicutes, Proteobacteria, Remission Induction, Case-Control Studies, Crohn Disease drug therapy
- Abstract
Background and Aims: Nutritional therapy with the Crohn's Disease Exclusion Diet + Partial Enteral Nutrition [CDED+PEN] or Exclusive Enteral Nutrition [EEN] induces remission and reduces inflammation in mild-to-moderate paediatric Crohn's disease [CD]. We aimed to assess if reaching remission with nutritional therapy is mediated by correcting compositional or functional dysbiosis., Methods: We assessed metagenome sequences, short chain fatty acids [SCFA] and bile acids [BA] in 54 paediatric CD patients reaching remission after nutritional therapy [with CDED+PEN or EEN] [NCT01728870], compared to 26 paediatric healthy controls., Results: Successful dietary therapy decreased the relative abundance of Proteobacteria and increased Firmicutes towards healthy controls. CD patients possessed a mixture of two metabotypes [M1 and M2], whereas all healthy controls had metabotype M1. M1 was characterised by high Bacteroidetes and Firmicutes, low Proteobacteria, and higher SCFA synthesis pathways, and M2 was associated with high Proteobacteria and genes involved in SCFA degradation. M1 contribution increased during diet: 48%, 63%, up to 74% [Weeks 0, 6, 12, respectively]. By Week 12, genera from Proteobacteria reached relative abundance levels of healthy controls with the exception of E. coli. Despite an increase in SCFA synthesis pathways, remission was not associated with increased SCFAs. Primary BA decreased with EEN but not with CDED+PEN, and secondary BA did not change during diet., Conclusion: Successful dietary therapy induced correction of both compositional and functional dysbiosis. However, 12 weeks of diet was not enough to achieve complete correction of dysbiosis. Our data suggests that composition and metabotype are important and change quickly during the early clinical response to dietary intervention. Correction of dysbiosis may therefore be an important future treatment goal for CD., (© The Author(s) 2022. Published by Oxford University Press on behalf of European Crohn’s and Colitis Organisation.)
- Published
- 2023
6. Evolution of the connectivity and indispensability of a transferable gene: the simplicity hypothesis.
- Author
- Jones CT, Susko E, and Bielawski JP
- Subjects
- Referral and Consultation, Gene Transfer, Horizontal, RNA
- Abstract
Background: The number of interactions between a transferable gene or its protein product and genes or gene products native to its microbial host is referred to as connectivity. Such interactions impact the tendency of the gene to be retained by evolution following horizontal gene transfer (HGT) into a microbial population. The complexity hypothesis posits that the protein product of a transferable gene with lower connectivity is more likely to function in a way that is beneficial to a new microbial host compared to the protein product of a transferable gene with higher connectivity. A gene with lower connectivity is consequently more likely to be fixed in any microbial population it enters by HGT. The more recently proposed simplicity hypothesis posits that the connectivity of a transferable gene might increase over time within any single microbial population due to gene-host coevolution, but that differential rates of colonization of microbial populations by HGT in accordance with differences in connectivity might act to counter this and even reduce connectivity over time, comprising an evolutionary trade-off., Results: We present a theoretical model that can be used to predict the conditions under which gene-host coevolution might increase or decrease the connectivity of a transferable gene over time. We show that the opportunity to enter new microbial populations by HGT can cause the connectivity of a transferable gene to evolve toward lower values, particularly in an environment that is unstable with respect to the function of the gene's protein product. 
We also show that a lack of such opportunity in a stable environment can cause the connectivity of a transferable gene to evolve toward higher values., Conclusion: Our theoretical model suggests that the connectivity of a transferable gene can change over time toward higher values corresponding to a more sessile state of lower transferability or lower values corresponding to a more itinerant state of higher transferability, depending on the ecological milieu in which the gene exists. We note, however, that a better understanding of gene-host coevolutionary dynamics in natural microbial systems is required before any further conclusions about the veracity of the simplicity hypothesis can be drawn., (© 2022. The Author(s).)
- Published
- 2022
7. Community-level evolutionary processes: Linking community genetics with replicator-interactor theory.
- Author
- Lean CH, Doolittle WF, and Bielawski JP
- Subjects
- Phenotype, Ecosystem, Biological Evolution
- Abstract
Understanding community-level selection using Lewontin's criteria requires both community-level inheritance and community-level heritability, and in the discipline of community and ecosystem genetics, these are often conflated. While there are existing studies that show the possibility of both, these studies impose community-level inheritance as a product of the experimental design. For this reason, these experiments provide only weak support for the existence of community-level selection in nature. By contrast, treating communities as interactors (in line with Hull's replicator-interactor framework or Dawkins's idea of the "extended phenotype") provides a more plausible and empirically supportable model for the role of ecological communities in the evolutionary process.
- Published
- 2022
8. Antibiotic and antifungal use in pediatric leukemia and lymphoma patients are associated with increasing opportunistic pathogens and decreasing bacteria responsible for activities that enhance colonic defense.
- Author
- Dunn KA, MacDonald T, Rodrigues GJ, Forbrigger Z, Bielawski JP, Langille MGI, Van Limbergen J, and Kulkarni K
- Subjects
- Animals, Anti-Bacterial Agents pharmacology, Anti-Bacterial Agents therapeutic use, Antifungal Agents pharmacology, Antifungal Agents therapeutic use, Bacteria, Butyrates, Child, Child, Preschool, Humans, Mice, RNA, Ribosomal, 16S genetics, Leukemia complications, Leukemia drug therapy, Lymphoma drug therapy
- Abstract
Due to decreased immunity, both antibiotics and antifungals are regularly used in pediatric hematologic-cancer patients as a means to prevent severe infections and febrile neutropenia. The general effect of antibiotics on the human gut microbiome is profound, yielding decreased diversity and changes in community structure. However, the specific effect on pediatric oncology patients is not well studied. The effect of antifungal use is even less understood, having been studied only in mouse models. Because the composition of the gut microbiome is associated with regulation of hematopoiesis, immune function and gastrointestinal integrity, changes within the patient gut can have implications for the clinical management of hematologic malignancies. The pediatric population is particularly challenging because the composition of the microbiome is age dependent, with some of the most pronounced changes occurring in the first three years of life. We investigated how antibiotic and antifungal use shapes the taxonomic composition of the stool microbiome in pediatric patients with leukemia and lymphoma, as inferred from both 16S rRNA and metagenome data. Associations with age, antibiotic use and antifungal use were investigated using multiple analysis methods. In addition, multivariable differential abundance was used to identify and assess specific taxa that were associated with multiple variables. Both antibiotics and antifungals were linked to a general decline in diversity in stool samples, which included a decrease in relative abundance in butyrate producers that play a critical role in host gut physiology (e.g., Faecalibacterium, Anaerostipes, Dorea, Blautia). Furthermore, antifungal use was associated with a significant increase in relative abundance of opportunistic pathogens. Collectively, these findings have important implications for the treatment of leukemia and lymphoma patients.
Butyrate is important for gastrointestinal integrity; it inhibits inflammation, reinforces colonic defense and mucosal immunity, and decreases oxidative stress. The routine use of broad-spectrum anti-infectives in pediatric oncology patients could simultaneously contribute to a decline in gastrointestinal integrity and colonic defense while promoting increases in opportunistic pathogens within the patient gut. Because the gut microbiome has been linked to both short-term clinical outcomes and longer-lasting health effects, systematic characterization of the gut microbiome in pediatric patients during, and beyond, treatment is warranted., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2022 Dunn, MacDonald, Rodrigues, Forbrigger, Bielawski, Langille, Van Limbergen and Kulkarni.)
- Published
- 2022
9. Novel Application of Survival Models for Predicting Microbial Community Transitions with Variable Selection for Environmental DNA.
- Author
- Bjorndahl P, Bielawski JP, Liu L, Zhou W, and Gu H
- Subjects
- Harmful Algal Bloom, Seasons, Cyanobacteria genetics, DNA, Environmental, Microbiota
- Abstract
Survival analysis is a prolific statistical tool in medicine for inferring risk and time to disease-related events. However, it is underutilized in microbiome research to predict microbial community-mediated events, partly due to the sparsity and high-dimensional nature of the data. We advance the application of Cox proportional hazards (Cox PH) survival models to environmental DNA (eDNA) data with feature selection suitable for filtering irrelevant and redundant taxonomic variables. Selection methods are compared in terms of false positives, sensitivity, and survival estimation accuracy in simulation and in a real data setting to forecast harmful cyanobacterial blooms. A novel extension of a method for selecting microbial biomarkers with survival data (SuRFCox) reliably outperforms other methods. We determine that Cox PH models with SuRFCox-selected predictors are more robust to varied signal, noise, and data correlation structure. SuRFCox also yields the most accurate and consistent prediction of blooms according to cross-validated testing by year over eight different bloom seasons. Identification of common biomarkers among validated survival forecasts over changing conditions has clear biological significance. Survival models with such biomarkers inform risk assessment and provide insight into the causes of critical community transitions., IMPORTANCE: In this paper, we report on a novel approach of selecting microorganisms for model-based prediction of the time to critical microbially modulated events (e.g., harmful algal blooms, clinical outcomes, community shifts, etc.). Our novel method for identifying biomarkers from large, dynamic communities of microbes has broad utility to environmental and ecological impact risk assessment and public health. Results will also promote theoretical and practical advancements relevant to the biology of specific organisms.
To address the unique challenge posed by diverse environmental conditions and sparse microbes, we developed a novel method of selecting predictors for modeling time-to-event data. Competing methods for selecting predictors are rigorously compared to determine which is the most accurate and generalizable. Model forecasts are applied to show suitable predictors can precisely quantify the risk over time of biological events like harmful cyanobacterial blooms.
- Published
- 2022
10. Evolution of Amino Acid Propensities under Stability-Mediated Epistasis.
- Author
- Youssef N, Susko E, Roger AJ, and Bielawski JP
- Subjects
- Amino Acid Substitution, Epistasis, Genetic, Proteins genetics, Thermodynamics, Amino Acids chemistry, Amino Acids genetics, Evolution, Molecular
- Abstract
Site-specific amino acid preferences are influenced by the genetic background of the protein. The preferences for resident amino acids are expected to, on average, increase over time because of replacements at other sites, a nonadaptive phenomenon referred to as the "evolutionary Stokes shift." Alternatively, decreases in resident amino acid propensity have recently been viewed as evidence of adaptations to external environmental changes. Using population genetics theory and thermodynamic stability constraints, we show that nonadaptive evolution can lead to both positive and negative shifts in propensities following the fixation of an amino acid, emphasizing that the detection of negative shifts is not conclusive evidence of adaptation. By examining propensity shifts from when an amino acid is first accepted at a site until it is subsequently replaced, we find that ≈50% of sites show a decrease in the propensity for the newly resident amino acid while the remaining sites show an increase. Furthermore, the distributions of the magnitudes of positive and negative shifts were comparable. Preferences were often conserved via a significant negative autocorrelation in propensity changes, with increases in propensities often followed by decreases, and vice versa. Lastly, we explore the underlying mechanisms that lead propensities to fluctuate. We observe that stabilizing replacements increase the mutational tolerance at a site and in doing so decrease the propensity for the resident amino acid. In contrast, destabilizing substitutions result in more rugged fitness landscapes that tend to favor the resident amino acid. In summary, our results characterize propensity trajectories under nonadaptive stability-constrained evolution against which evidence of adaptations should be calibrated., (© The Author(s) 2022. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2022
11. Gut bacterial gene changes following pegaspargase treatment in pediatric patients with acute lymphoblastic leukemia.
- Author
- Dunn KA, Forbrigger Z, Connors J, Rahman M, Cohen A, Van Limbergen J, Langille MGI, Stadnyk AW, Bielawski JP, Penny SL, MacDonald T, and Kulkarni K
- Subjects
- Asparaginase adverse effects, Asparagine, Aspartic Acid, Child, Genes, Bacterial, Glutamic Acid therapeutic use, Glutamine therapeutic use, Humans, Polyethylene Glycols adverse effects, Antineoplastic Agents therapeutic use, Precursor Cell Lymphoblastic Leukemia-Lymphoma drug therapy, Precursor Cell Lymphoblastic Leukemia-Lymphoma genetics
- Abstract
Treatment of pediatric acute lymphoblastic leukemia (ALL) with pegaspargase exploits ALL cells' dependency on asparagine. Pegaspargase depletes asparagine, consequentially affecting aspartate, glutamine and glutamate. The gut as a confounding source of these amino acids (AAs) and the role of gut microbiome metabolism of AAs have not been examined. We examined asparagine, aspartate, glutamine and glutamate in stool samples from patients over pegaspargase treatment. Microbial gene-products that interact with these AAs were identified. Stool asparagine declined significantly, and 31 microbial genes changed over treatment. Changes were complex, and included genes involved in AA metabolism, nutrient sensing, and pathways increased in cancers. While we identified changes in a gene (iaaA) with limited asparaginase activity, it lacked significance after correction, leaving open other mechanisms for asparagine decline, possibly including loss from gut to blood. Understanding pathways that change AA availability, including by microbes in the gut, could be useful in optimizing pegaspargase therapy.
- Published
- 2021
12. Shifts in amino acid preferences as proteins evolve: A synthesis of experimental and theoretical work.
- Author
- Youssef N, Susko E, Roger AJ, and Bielawski JP
- Subjects
- Amino Acids chemistry, Amino Acids genetics, Amino Acids metabolism, Evolution, Molecular, Models, Genetic, Phylogeny, Proteins chemistry, Proteins genetics, Proteins metabolism
- Abstract
Amino acid preferences vary across sites and time. While variation across sites is widely accepted, the extent and frequency of temporal shifts are contentious. Our understanding of the drivers of amino acid preference change is incomplete: To what extent are temporal shifts driven by adaptive versus nonadaptive evolutionary processes? We review phenomena that cause preferences to vary (e.g., evolutionary Stokes shift, contingency, and entrenchment) and clarify how they differ. To determine the extent and prevalence of shifted preferences, we review experimental and theoretical studies. Analyses of natural sequence alignments often detect decreases in homoplasy (convergence and reversions) rates, and variation in replacement rates with time, signals that are consistent with temporally changing preferences. While approaches inferring shifts in preferences from patterns in natural alignments are valuable, they are indirect since multiple mechanisms (both adaptive and nonadaptive) could lead to the observed signal. Alternatively, site-directed mutagenesis experiments allow for a more direct assessment of shifted preferences. They corroborate evidence from multiple sequence alignments, revealing that the preference for an amino acid at a site varies depending on the background sequence. However, shifts in preferences are usually minor in magnitude and sites with significantly shifted preferences are low in frequency. The small yet consistent perturbations in preferences could, nevertheless, jeopardize the accuracy of inference procedures, which assume constant preferences. We conclude by discussing if and how such shifts in preferences might influence widely used time-homogeneous inference procedures and potential ways to mitigate such effects., (© 2021 The Protein Society.)
- Published
- 2021
13. The role of purifying selection in the origin and maintenance of complex function.
- Author
- Brunet TDP, Doolittle WF, and Bielawski JP
- Subjects
- Adaptation, Physiological, Evolution, Molecular, Humans, Genome, Human, Genomics
- Abstract
Fitness contribution alone should not be the criterion of 'function' in molecular biology and genomics. Disagreement over the use of 'function' in molecular biology and genomics is still with us, almost eight years after publicity surrounding the Encyclopedia of DNA Elements project claimed that 80.4% of the human genome comprises "functional elements". Recent approaches attempt to resolve or reformulate this debate by redefining genomic 'function' in terms of current fitness contribution. In its favour, this redefinition for the genomic context is in apparent conformity with predominant experimental practices, especially in biomedical research, and with ascription of function by selective maintenance. We argue against approaches of this kind, however, on the grounds that they could be seen as non-Darwinian, and fail to properly account for the diversity of non-adaptive processes involved in the origin and maintenance of genomic complexity. We examine cases of molecular and organismal complexity that arise neutrally, showing how purifying selection maintains non-adaptive genomic complexity. Rather than lumping different sorts of genomic complexity together by defining 'function' as fitness contribution, we argue that it is best to separate the heterogeneous contributions of preaptation, exaptation and adaptation to the historical processes of origin and maintenance for complex features., (Copyright © 2021 Elsevier Ltd. All rights reserved.)
- Published
- 2021
14. Investigating the gut microbial community and genes in children with differing levels of change in serum asparaginase activity during pegaspargase treatment for acute lymphoblastic leukemia.
- Author
- Dunn KA, Connors J, Bielawski JP, Nearing JT, Langille MGI, Van Limbergen J, Fernandez CV, MacDonald T, and Kulkarni K
- Subjects
- Asparaginase therapeutic use, Child, Humans, Polyethylene Glycols, Antineoplastic Agents therapeutic use, Gastrointestinal Microbiome, Microbiota, Precursor Cell Lymphoblastic Leukemia-Lymphoma drug therapy, Precursor Cell Lymphoblastic Leukemia-Lymphoma genetics
- Abstract
Asparaginase (ASNase) is an effective treatment of pediatric acute lymphoblastic leukemia (ALL). Changes in ASNase activity may lead to suboptimal treatment and poorer outcomes. The gut microbiome produces metabolites that could impact ASNase therapy; however, this remains uninvestigated. We examined gut-microbial community and microbial-ASNase and asparagine synthetase (ASNS) genes using 16S rRNA and metagenomic sequence data from stool samples of pediatric ALL patients. Comparing ASNase activity between consecutive ASNase-doses, we found microbial communities differed between decreased- and increased-activity samples. Escherichia predominated in the decreased-activity community while Bacteroides and Streptococcus predominated in the increased-activity community. In addition, microbial ASNS was significantly (p = .004) negatively correlated with change in serum ASNase activity. These preliminary findings suggest microbial communities prior to treatment could affect serum ASNase levels, although the mechanism is unknown. Replication in an independent cohort is needed, and future research on manipulation of these communities and genes could prove useful in optimizing ASNase therapy.
- Published
- 2021
15. Consequences of Stability-Induced Epistasis for Substitution Rates.
- Author
- Youssef N, Susko E, and Bielawski JP
- Subjects
- Mutation, Selection, Genetic, Epistasis, Genetic, Evolution, Molecular, Models, Genetic
- Abstract
Do interactions between residues in a protein (i.e., epistasis) significantly alter evolutionary dynamics? If so, what consequences might they have on inference from traditional codon substitution models which assume site-independence for the sake of computational tractability? To investigate the effects of epistasis on substitution rates, we employed a mechanistic mutation-selection model in conjunction with a fitness framework derived from protein stability. We refer to this as the stability-informed site-dependent (S-SD) model and developed a new stability-informed site-independent (S-SI) model that captures the average effect of stability constraints on individual sites of a protein. Comparison of S-SI and S-SD offers a novel and direct method for investigating the consequences of stability-induced epistasis on protein evolution. We developed S-SI and S-SD models for three natural proteins and showed that they generate sequences consistent with real alignments. Our analyses revealed that epistasis tends to increase substitution rates compared with the rates under site-independent evolution. We then assessed the epistatic sensitivity of individual sites and discovered a counterintuitive effect: Highly connected sites were less influenced by epistasis relative to exposed sites. Lastly, we show that, despite the unrealistic assumptions, traditional models perform comparably well in the presence and absence of epistasis and provide reasonable summaries of average selection intensities. We conclude that epistatic models are critical to understanding protein evolutionary dynamics, but epistasis might not be required for reasonable inference of selection pressure when averaging over time and sites., (© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2020
16. A Phenotype-Genotype Codon Model for Detecting Adaptive Evolution.
- Author
- Jones CT, Youssef N, Susko E, and Bielawski JP
- Subjects
- Adaptation, Physiological genetics, Computer Simulation, Classification methods, Codon genetics, Genotype, Phenotype, Phylogeny
- Abstract
A central objective in biology is to link adaptive evolution in a gene to structural and/or functional phenotypic novelties. Yet most analytic methods make inferences mainly from either phenotypic data or genetic data alone. A small number of models have been developed to infer correlations between the rate of molecular evolution and changes in a discrete or continuous life history trait. But such correlations are not necessarily evidence of adaptation. Here, we present a novel approach called the phenotype-genotype branch-site model (PG-BSM) designed to detect evidence of adaptive codon evolution associated with discrete-state phenotype evolution. An episode of adaptation is inferred under standard codon substitution models when there is evidence of positive selection in the form of an elevation in the nonsynonymous-to-synonymous rate ratio $\omega$ to a value $\omega > 1$. As it is becoming increasingly clear that $\omega > 1$ can occur without adaptation, the PG-BSM was formulated to infer an instance of adaptive evolution without appealing to evidence of positive selection. The null model makes use of a covarion-like component to account for general heterotachy (i.e., random changes in the evolutionary rate at a site over time). The alternative model employs samples of the phenotypic evolutionary history to test for phenomenological patterns of heterotachy consistent with specific mechanisms of molecular adaptation. These include 1) a persistent increase/decrease in $\omega$ at a site following a change in phenotype (the pattern) consistent with an increase/decrease in the functional importance of the site (the mechanism); and 2) a transient increase in $\omega$ at a site along a branch over which the phenotype changed (the pattern) consistent with a change in the site's optimal amino acid (the mechanism). 
Rejection of the null is followed by post hoc analyses to identify sites with strongest evidence for adaptation in association with changes in the phenotype as well as the most likely evolutionary history of the phenotype. Simulation studies based on a novel method for generating mechanistically realistic signatures of molecular adaptation show that the PG-BSM has good statistical properties. Analyses of real alignments show that site patterns identified post hoc are consistent with the specific mechanisms of adaptation included in the alternate model. Further simulation studies show that the covarion-like component of the PG-BSM plays a crucial role in mitigating recently discovered statistical pathologies associated with confounding by accounting for heterotachy-by-any-cause. [Adaptive evolution; branch-site model; confounding; mutation-selection; phenotype-genotype.]., (© The Author(s) 2019. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.)
- Published
- 2020
- Full Text
- View/download PDF
17. Bacterial Taxa and Functions Are Predictive of Sustained Remission Following Exclusive Enteral Nutrition in Pediatric Crohn's Disease.
- Author
-
Jones CMA, Connors J, Dunn KA, Bielawski JP, Comeau AM, Langille MGI, and Van Limbergen J
- Subjects
- Adolescent, Bacteria genetics, Bacterial Typing Techniques methods, Child, Crohn Disease therapy, Feces chemistry, Feces microbiology, Female, Follow-Up Studies, Humans, Leukocyte L1 Antigen Complex analysis, Machine Learning, Male, Metagenome, Predictive Value of Tests, Prospective Studies, RNA, Ribosomal, 16S, Recurrence, Remission Induction, Severity of Illness Index, Bacteria classification, Bacterial Typing Techniques statistics & numerical data, Crohn Disease microbiology, Enteral Nutrition, Gastrointestinal Microbiome genetics
- Abstract
Background: The gut microbiome is extensively involved in induction of remission in pediatric Crohn's disease (CD) patients by exclusive enteral nutrition (EEN). In this follow-up study of pediatric CD patients undergoing treatment with EEN, we employ machine learning models trained on baseline gut microbiome data to distinguish patients who achieved and sustained remission (SR) from those who did not achieve remission nor relapse (non-SR) by 24 weeks., Methods: A total of 139 fecal samples were obtained from 22 patients (8-15 years of age) for up to 96 weeks. Gut microbiome taxonomy was assessed by 16S rRNA gene sequencing, and functional capacity was assessed by metagenomic sequencing. We used standard metrics of diversity and taxonomy to quantify differences between SR and non-SR patients and to associate gut microbial shifts with fecal calprotectin (FCP), and disease severity as defined by weighted Pediatric Crohn's Disease Activity Index. We used microbial data sets in addition to clinical metadata in random forests (RFs) models to classify treatment response and predict FCP levels., Results: Microbial diversity did not change after EEN, but species richness was lower in low-FCP samples (<250 µg/g). An RF model using microbial abundances, species richness, and Paris disease classification was the best at classifying treatment response (area under the curve [AUC] = 0.9). KEGG Pathways also significantly classified treatment response with the addition of the same clinical data (AUC = 0.8). Top features of the RF model are consistent with previously identified IBD taxa, such as Ruminococcaceae and Ruminococcus gnavus., Conclusions: Our machine learning approach is able to distinguish SR and non-SR samples using baseline microbiome and clinical data., (© 2020 Crohn’s & Colitis Foundation. Published by Oxford University Press on behalf of Crohn’s & Colitis Foundation.)
- Published
- 2020
- Full Text
- View/download PDF
18. Re-evaluating the relationship between missing heritability and the microbiome.
- Author
-
Douglas GM, Bielawski JP, and Langille MGI
- Subjects
- Genetic Variation, Genome, Human, Humans, Phenotype, Genome-Wide Association Study, Microbiota genetics
- Abstract
Human genome-wide association studies (GWASs) have recurrently estimated lower heritability estimates than familial studies. Many explanations have been suggested to explain these lower estimates, including that a substantial proportion of genetic variation and gene-by-environment interactions are unmeasured in typical GWASs. The human microbiome is potentially related to both of these explanations, but it has been more commonly considered as a source of unmeasured genetic variation. In particular, it has recently been argued that the genetic variation within the human microbiome should be included when estimating trait heritability. We outline issues with this argument, which in its strictest form depends on the holobiont model of human-microbiome interactions. Instead, we argue that the microbiome could be leveraged to help control for environmental variation across a population, although that remains to be determined. We discuss potential approaches that could be explored to determine whether integrating microbiome sequencing data into GWASs is useful.
- Published
- 2020
- Full Text
- View/download PDF
19. The relationship between fecal bile acids and microbiome community structure in pediatric Crohn's disease.
- Author
-
Connors J, Dunn KA, Allott J, Bandsma R, Rashid M, Otley AR, Bielawski JP, and Van Limbergen J
- Subjects
- Adolescent, Bacteria classification, Bacteria genetics, Bacteria isolation & purification, Bile Acids and Salts metabolism, Child, Crohn Disease metabolism, Feces microbiology, Female, Humans, Intestines microbiology, Liver metabolism, Male, Crohn Disease microbiology, Gastrointestinal Microbiome
- Abstract
Gut microbiome community structure is associated with Crohn's disease (CD) development and response to therapy. Bile acids (BAs) play a central role in modulating intestinal immune responses, and changes in gut bacterial communities can profoundly alter the intestinal BA pool. The liver synthesizes and conjugates primary bile acids (priBAs) that are then deconjugated, epimerized, and dehydroxylated by gut bacteria to produce secondary bile acids (secBAs). We investigated the relationship between the gut microbiome and the fecal BA pool in stool samples obtained from a well-characterized cohort of pediatric CD patients undergoing nutritional therapy to induce disease remission. We found that fecal BA composition was altered in a sub-group of CD patients who did not sustain remission. The microbial community structures associated with priBA and secBA-dominant profiles were distinct. In addition, the fecal BA concentrations were correlated with the abundance of distinct bacterial taxonomic groups. Finally, priBA dominant samples were associated with community-level decreases in enzymes for dehydroxylation but not deconjugation.
- Published
- 2020
- Full Text
- View/download PDF
20. Crohn's Disease Exclusion Diet Plus Partial Enteral Nutrition Induces Sustained Remission in a Randomized Controlled Trial.
- Author
-
Levine A, Wine E, Assa A, Sigall Boneh R, Shaoul R, Kori M, Cohen S, Peleg S, Shamaly H, On A, Millman P, Abramas L, Ziv-Baran T, Grant S, Abitbol G, Dunn KA, Bielawski JP, and Van Limbergen J
- Subjects
- Adolescent, Child, Combined Modality Therapy methods, Crohn Disease diagnosis, Female, Humans, Male, Prospective Studies, Remission Induction methods, Severity of Illness Index, Treatment Outcome, Crohn Disease therapy, Diet Therapy methods, Enteral Nutrition methods
- Abstract
Background & Aims: Exclusive enteral nutrition (EEN) is recommended for children with mild to moderate Crohn's disease (CD), but implementation is challenging. We compared EEN with the CD exclusion diet (CDED), a whole-food diet coupled with partial enteral nutrition (PEN), designed to reduce exposure to dietary components that have adverse effects on the microbiome and intestinal barrier., Methods: We performed a 12-week prospective trial of children with mild to moderate CD. The children were randomly assigned to a group that received CDED plus 50% of calories from formula (Modulen, Nestlé) for 6 weeks (stage 1) followed by CDED with 25% PEN from weeks 7 to 12 (stage 2) (n = 40, group 1) or a group that received EEN for 6 weeks followed by a free diet with 25% PEN from weeks 7 to 12 (n = 38, group 2). Patients were evaluated at baseline and weeks 3, 6, and 12 and laboratory tests were performed; 16S ribosomal RNA gene (V4V5) sequencing was performed on stool samples. The primary endpoint was dietary tolerance. Secondary endpoints were intention to treat (ITT) remission at week 6 (pediatric CD activity index score below 10) and corticosteroid-free ITT sustained remission at week 12., Results: Four patients withdrew from the study because of intolerance by 48 hours, 74 patients (mean age 14.2 ± 2.7 years) were included for remission analysis. The combination of CDED and PEN was tolerated in 39 children (97.5%), whereas EEN was tolerated by 28 children (73.6%) (P = .002; odds ratio for tolerance of CDED and PEN, 13.92; 95% confidence interval [CI] 1.68-115.14). At week 6, 30 (75%) of 40 children given CDED plus PEN were in corticosteroid-free remission vs 20 (59%) of 34 children given EEN (P = .38). At week 12, 28 (75.6%) of 37 children given CDED plus PEN were in corticosteroid-free remission compared with 14 (45.1%) of 31 children given EEN and then PEN (P = .01; odds ratio for remission in children given CDED and PEN, 3.77; CI 1.34-10.59). 
In children given CDED plus PEN, corticosteroid-free remission was associated with sustained reductions in inflammation (based on serum level of C-reactive protein and fecal level of calprotectin) and fecal Proteobacteria., Conclusion: CDED plus PEN was better tolerated than EEN in children with mild to moderate CD. Both diets were effective in inducing remission by week 6. The combination CDED plus PEN induced sustained remission in a significantly higher proportion of patients than EEN, and produced changes in the fecal microbiome associated with remission. These data support use of CDED plus PEN to induce remission in children with CD. Clinicaltrials.gov no: NCT01728870., (Copyright © 2019 AGA Institute. Published by Elsevier Inc. All rights reserved.)
- Published
- 2019
- Full Text
- View/download PDF
21. ModL: exploring and restoring regularity when testing for positive selection.
- Author
-
Mingrone J, Susko E, and Bielawski JP
- Subjects
- Biometry, Chi-Square Distribution, Evolution, Molecular, Likelihood Functions, Software
- Abstract
Motivation: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. Although it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. We show that commonly used thresholds need not yield conservative tests, but instead give larger than expected Type I error rates. Statistical regularity can be restored by using a modified likelihood ratio test., Results: We give theoretical results to prove that, if the number of sites is not too small, the modified likelihood ratio test gives approximately correct Type I error probabilities regardless of the parameter settings of the underlying null hypothesis. Simulations show that modification gives Type I error rates closer to those stated without a loss of power. The simulations also show that parameter estimation for mixture models of codon evolution can be challenging in certain data-generation settings with very different mixing distributions giving nearly identical site pattern distributions unless the number of taxa and tree length are large. Because mixture models are widely used for a variety of problems in molecular evolution, the challenges and general approaches to solving them presented here are applicable in a broader context., Availability and Implementation: https://github.com/jehops/codeml_modl., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2019
- Full Text
- View/download PDF
22. Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates.
- Author
-
Dunn KA, Kenney T, Gu H, and Bielawski JP
- Subjects
- Computer Simulation, Evolution, Molecular, Streptococcus genetics, Codon genetics, Models, Genetic, Mutation genetics, Mutation Rate, Nucleotides genetics, Selection, Genetic
- Abstract
Background: An excess of nonsynonymous substitutions, over neutrality, is considered evidence of positive Darwinian selection. Inference for proteins often relies on estimation of the nonsynonymous to synonymous ratio (ω = dN/dS) within a codon model. However, to ease computational difficulties, ω is typically estimated assuming an idealized substitution process where (i) all nonsynonymous substitutions have the same rate (regardless of impact on organism fitness) and (ii) instantaneous double and triple (DT) nucleotide mutations have zero probability (despite evidence that they can occur). It follows that estimates of ω represent an imperfect summary of the intensity of selection, and that tests based on the ω > 1 threshold could be negatively impacted., Results: We developed a general-purpose parametric (GPP) modelling framework for codons. This novel approach allows specification of all possible instantaneous codon substitutions, including multiple nonsynonymous rates (MNRs) and instantaneous DT nucleotide changes. Existing codon models are specified as special cases of the GPP model. We use GPP models to implement likelihood ratio tests for ω > 1 that accommodate MNRs and DT mutations. Through both simulation and real data analysis, we find that failure to model MNRs and DT mutations reduces power in some cases and inflates false positives in others. False positives under traditional M2a and M8 models were very sensitive to DT changes. This was exacerbated by the choice of frequency parameterization (GY vs. MG), with rates sometimes > 90% under MG. By including MNRs and DT mutations, accuracy and power was greatly improved under the GPP framework. However, we also find that over-parameterized models can perform less well, and this can contribute to degraded performance of LRTs., Conclusions: We suggest GPP models should be used alongside traditional codon models. Further, all codon models should be deployed within an experimental design that includes (i) assessing robustness to model assumptions, and (ii) investigation of non-standard behaviour of MLEs. 
As the goal of every analysis is to avoid false conclusions, more work is needed on model selection methods that consider both the increase in fit engendered by a model parameter and the degree to which that parameter is affected by un-modelled evolutionary processes.
- Published
- 2019
- Full Text
- View/download PDF
23. Looking for Darwin in Genomic Sequences: Validity and Success Depends on the Relationship Between Model and Data.
- Author
-
Jones CT, Susko E, and Bielawski JP
- Subjects
- Algorithms, Codon, Computational Biology methods, Genetic Variation, Genetics, Population, Humans, Reproducibility of Results, Selection, Genetic, Evolution, Molecular, Genome, Genomics methods, Models, Genetic
- Abstract
Codon substitution models (CSMs) are commonly used to infer the history of natural selection for a set of protein-coding sequences, often with the explicit goal of detecting the signature of positive Darwinian selection. However, the validity and success of CSMs used in conjunction with the maximum likelihood (ML) framework is sometimes challenged with claims that the approach might too often support false conclusions. In this chapter, we use a case study approach to identify four legitimate statistical difficulties associated with inference of evolutionary events using CSMs. These include: (1) model misspecification, (2) low information content, (3) the confounding of processes, and (4) phenomenological load, or PL. While past criticisms of CSMs can be connected to these issues, the historical critiques were often misdirected, or overstated, because they failed to recognize that the success of any model-based approach depends on the relationship between model and data. Here, we explore this relationship and provide a candid assessment of the limitations of CSMs to extract historical information from extant sequences. To aid in this assessment, we provide a brief overview of: (1) a more realistic way of thinking about the process of codon evolution framed in terms of population genetic parameters, and (2) a novel presentation of the ML statistical framework. We then divide the development of CSMs into two broad phases of scientific activity and show that the latter phase is characterized by increases in model complexity that can sometimes negatively impact inference of evolutionary mechanisms. Such problems are not yet widely appreciated by the users of CSMs. These problems can be avoided by using a model that is appropriate for the data; but, understanding the relationship between the data and a fitted model is a difficult task. 
We argue that the only way to properly understand that relationship is to perform in silico experiments using a generating process that can mimic the data as closely as possible. The mutation-selection modeling framework (MutSel) is presented as the basis of such a generating process. We contend that if complex CSMs continue to be developed for testing explicit mechanistic hypotheses, then additional analyses such as those described here (e.g., penalized LRTs and estimation of PL) will need to be applied alongside the more traditional inferential methods.
- Published
- 2019
- Full Text
- View/download PDF
24. Introduction to Genome Biology and Diversity.
- Author
-
Youssef N, Budd A, and Bielawski JP
- Subjects
- Archaea genetics, Bacteria genetics, Computational Biology methods, Gene Expression Regulation, Genetic Structures, Inheritance Patterns, Viruses genetics, Biodiversity, Eukaryota genetics, Genome, Genomics methods
- Abstract
Organisms display astonishing levels of cell and molecular diversity, including genome size, shape, and architecture. In this chapter, we review how the genome can be viewed as both a structural and an informational unit of biological diversity and explicitly define our intended meaning of genetic information. A brief overview of the characteristic features of bacterial, archaeal, and eukaryotic cell types and viruses sets the stage for a review of the differences in organization, size, and packaging strategies of their genomes. We include a detailed review of genetic elements found outside the primary chromosomal structures, as these provide insights into how genomes are sometimes viewed as incomplete informational entities. Lastly, we reassess the definition of the genome in light of recent advancements in our understanding of the diversity of genomic structures and the mechanisms by which genetic information is expressed within the cell. Collectively, these topics comprise a good introduction to genome biology for the newcomer to the field and provide a valuable reference for those developing new statistical or computational methods in genomics. This review also prepares the reader for anticipated transformations in thinking as the field of genome biology progresses.
- Published
- 2019
- Full Text
- View/download PDF
25. Phenomenological Load on Model Parameters Can Lead to False Biological Conclusions.
- Author
-
Jones CT, Youssef N, Susko E, and Bielawski JP
- Subjects
- Animals, DNA, Mitochondrial, Evolution, Molecular, Likelihood Functions, Sequence Alignment, Mammals genetics, Models, Genetic, Mutation, Selection, Genetic, Silent Mutation
- Abstract
When a substitution model is fitted to an alignment using maximum likelihood, its parameters are adjusted to account for as much site-pattern variation as possible. A parameter might therefore absorb a substantial quantity of the total variance in an alignment (or more formally, bring about a substantial reduction in the deviance of the fitted model) even if the process it represents played no role in the generation of the data. When this occurs, we say that the parameter estimate carries phenomenological load (PL). Large PL in a parameter estimate is a concern because it not only invalidates its mechanistic interpretation (if it has one) but also increases the likelihood that it will be found to be statistically significant. The problem of PL was not identified in the past because most off-the-shelf substitution models make simplifying assumptions that preclude the generation of realistic levels of variation. In this study, we use the more realistic mutation-selection framework as the basis of a generating model formulated to produce data that mimic an alignment of mammalian mitochondrial DNA. We show that a parameter estimate can carry PL when 1) the substitution model is underspecified and 2) the parameter represents a process that is confounded with other processes represented in the data-generating model. We then provide a method that can be used to identify signal for the process that a given parameter represents despite the existence of PL.
- Published
- 2018
- Full Text
- View/download PDF
26. Multi-omics differentially classify disease state and treatment outcome in pediatric Crohn's disease.
- Author
-
Douglas GM, Hansen R, Jones CMA, Dunn KA, Comeau AM, Bielawski JP, Tayler R, El-Omar EM, Russell RK, Hold GL, Langille MGI, and Van Limbergen J
- Subjects
- Adolescent, Child, Child, Preschool, Crohn Disease microbiology, DNA, Bacterial genetics, DNA, Ribosomal genetics, Feces microbiology, Female, Genetic Predisposition to Disease, Humans, Machine Learning, Male, Crohn Disease genetics, Dysbiosis complications, Genetic Variation, Metagenomics methods, RNA, Ribosomal, 16S genetics, Sequence Analysis, DNA methods
- Abstract
Background: Crohn's disease (CD) has an unclear etiology, but there is growing evidence of a direct link with a dysbiotic microbiome. Many gut microbes have previously been associated with CD, but these have mainly been confounded with patients' ongoing treatments. Additionally, most analyses of CD patients' microbiomes have focused on microbes in stool samples, which yield different insights than profiling biopsy samples., Results: We sequenced the 16S rRNA gene (16S) and carried out shotgun metagenomics (MGS) from the intestinal biopsies of 20 treatment-naïve CD and 20 control pediatric patients. We identified the abundances of microbial taxa and inferred functional categories within each dataset. We also identified known human genetic variants from the MGS data. We then used a machine learning approach to determine the classification accuracy when these datasets, collapsed to different hierarchical groupings, were used independently to classify patients by disease state and by CD patients' response to treatment. We found that 16S-identified microbes could classify patients with higher accuracy in both cases. Based on follow-ups with these patients, we identified which microbes and functions were best for predicting disease state and response to treatment, including several previously identified markers. By combining the top features from all significant models into a single model, we could compare the relative importance of these predictive features. We found that 16S-identified microbes are the best predictors of CD state whereas MGS-identified markers perform best for classifying treatment response., Conclusions: We demonstrate for the first time that useful predictors of CD treatment response can be produced from shotgun MGS sequencing of biopsy samples despite the complications related to large proportions of host DNA. 
The top predictive features that we identified in this study could be useful for building an improved classifier for CD and treatment response based on sufferers' microbiome in the future. The BISCUIT project is funded by a Clinical Academic Fellowship from the Chief Scientist Office (Scotland)-CAF/08/01.
- Published
- 2018
- Full Text
- View/download PDF
27. Bayesian Inference of Microbial Community Structure from Metagenomic Data Using BioMiCo.
- Author
-
Dunn KA, Andrews K, Bashwih RO, and Bielawski JP
- Subjects
- Algorithms, Bayes Theorem, Computational Biology methods, Metagenomics methods, Microbiota
- Abstract
Microbial samples taken from an environment often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such data represent a serious analytical challenge, as the community structures will be present as complex mixtures, there will be very large numbers of component species, and the species abundance will often be sparse over samples. The structure and complexity of these samples will vary according to both biotic and abiotic factors, and classical methods of data analysis will have a limited value in this setting. A novel Bayesian modeling framework, called BioMiCo, was developed to meet this challenge. BioMiCo takes abundance data derived from environmental DNA, and models each sample by a two-level mixture, where environmental OTUs contribute community structures, and those structures are related to the known biotic and abiotic features of each sample. The model is constrained by Dirichlet priors, which induces compact structures, minimizes variance, and maximizes model interpretability. BioMiCo is trained on a portion of the data, and once trained a BioMiCo model can be employed to make predictions about the features of new samples. This chapter provides a set of protocols that illustrate the application of BioMiCo to real inference problems. Each protocol is designed around the analysis of a real dataset, which was carefully chosen to illustrate specific aspects of real data analysis. With these protocols, users of BioMiCo will be able to undertake basic research into the properties of complex microbial systems, as well as develop predictive models for applied microbiomics.
- Published
- 2018
- Full Text
- View/download PDF
28. Shifting Balance on a Static Mutation-Selection Landscape: A Novel Scenario of Positive Selection.
- Author
-
Jones CT, Youssef N, Susko E, and Bielawski JP
- Subjects
- Amino Acid Substitution, Amino Acids genetics, Animals, Drosophila, Evolution, Molecular, Genetic Variation, Humans, Mutation, Phylogeny, Sequence Alignment, Codon, Genetics, Population methods, Models, Genetic, Selection, Genetic
- Abstract
A version of the mechanistic mutation-selection (MutSel) model that accounts for temporal dynamics at a site is presented. This is used to show that the rate ratio dN/dS at a site can be transiently >1 even when fitness coefficients are fixed or the fitness landscape is static. This occurs whenever a site drifts away from its fitness peak and is then forced back by selection, a process reminiscent of shifting balance. Shifting balance is strongest when the substitution process is not dominated by selection or drift, but admits interplay between the two. Under this condition, site-specific changes in dN/dS were inferred in 78-100% of trials, and positive selection (i.e., dN/dS>1) in 10-40% of trials, when sequence alignments generated under MutSel were fitted to two popular phenomenological branch-site models. These results demonstrate that positive selection can occur without a change in fitness regime, and that this is detectable by branch-site models. In addition, MutSel is used to show that a site can be occupied by a sub-optimal amino acid for long periods on a fixed landscape when selection is stringent. This has implications for the interpretation of constant-but-different site patterns typically attributed to changes in fitness. Furthermore, a version of MutSel with episodic changes in fitness coefficients is used to illustrate systematic differences between parameters used to generate data under MutSel and their counterparts estimated by a simple codon model. Motivated by a discrepancy in the literature, interpretation of dN/dS in the context of MutSel is also discussed., (© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2017
- Full Text
- View/download PDF
29. Early Changes in Microbial Community Structure Are Associated with Sustained Remission After Nutritional Treatment of Pediatric Crohn's Disease.
- Author
-
Dunn KA, Moore-Connors J, MacIntyre B, Stadnyk AW, Thomas NA, Noble A, Mahdi G, Rashid M, Otley AR, Bielawski JP, and Van Limbergen J
- Subjects
- Adolescent, Bayes Theorem, Case-Control Studies, Child, Female, Follow-Up Studies, Humans, Male, Prospective Studies, RNA, Ribosomal, 16S, Recurrence, Remission Induction methods, Time Factors, Treatment Outcome, Crohn Disease microbiology, Crohn Disease therapy, Enteral Nutrition methods, Feces microbiology, Microbiota
- Abstract
Background: Clinical remission achieved by exclusive enteral nutrition (EEN) is associated with marked microbiome changes. In this prospective study of exclusive enteral nutrition, we employ a hierarchical model of microbial community structure to distinguish between pediatric Crohn's disease patients who achieved sustained remission (SR) and those who relapsed early (non-SR), after restarting a normal diet., Methods: Fecal samples were obtained from 10 patients (age 10-16) and from 5 healthy controls (age 9-14). The microbiota was assessed via 16S rRNA sequencing. In addition to standard measures of microbial biodiversity, we employed Bayesian methods to characterize the hierarchical community structure. Community structure between patients who sustained remission (wPCDAI <12.5) up to their 24-week follow-up (SR) was compared with patients that had not sustained remission (non-SR)., Results: Microbial diversity was lower in Crohn's disease patients relative to controls and lowest in patients who did not achieve SR. SR patients differed from non-SR patients in terms of the structure and prevalence of their microbial communities. The SR prevalent community contained a number of strains of Akkermansia muciniphila and Bacteroides and was limited in Proteobacteria, whereas the non-SR prevalent community had a large Proteobacteria component. Their communities were so different that a model trained to discriminate SR and non-SR had 80% classification accuracy, already at baseline sampling., Conclusions: Microbial community structure differs between healthy controls, patients who have an enduring response to exclusive enteral nutrition, and those who relapse early on introduction of normal diet. Our novel Bayesian approach to these differences is able to predict sustained remission after exclusive enteral nutrition.
- Published
- 2016
- Full Text
- View/download PDF
30. The Gut Microbiome of Pediatric Crohn's Disease Patients Differs from Healthy Controls in Genes That Can Influence the Balance Between a Healthy and Dysregulated Immune Response.
- Author
-
Dunn KA, Moore-Connors J, MacIntyre B, Stadnyk A, Thomas NA, Noble A, Mahdi G, Rashid M, Otley AR, Bielawski JP, and Van Limbergen J
- Subjects
- Adolescent, Case-Control Studies, Child, Crohn Disease immunology, Crohn Disease therapy, Female, Humans, Immunity, Innate, Male, Metagenomics, Remission Induction, Crohn Disease microbiology, Feces microbiology, Gastrointestinal Microbiome genetics, Gastrointestinal Microbiome immunology
- Abstract
Background: Exclusive enteral nutrition (EEN) is a first-line therapy in pediatric Crohn's disease (CD) thought to induce remission through changes in the gut microbiome. With microbiome assessment largely focused on microbial taxonomy and diversity, it remains unclear to what extent EEN induces functional changes that thereby contribute to its therapeutic effect., Methods: Fecal samples were collected from 15 pediatric CD patients prior to and after EEN treatment, as well as from 5 healthy controls. Metagenomic data were obtained via next-generation sequencing, and nonhuman reads were mapped to KEGG pathways, where possible. Pathway abundance was compared between CD patients and controls, and between CD patients that sustained remission (SR) and those that did not sustain remission (NSR)., Results: Of 132 KEGG pathways identified, 8 pathways differed significantly between baseline CD patients and controls. Examination of these eight pathways showed SR patients had greater similarity to controls than NSR patients in all cases. Pathways fell into one of three groups: 1) no prior connection to IBD, 2) previously reported connection to IBD, and 3) known roles in innate immunity and immunoregulation., Conclusions: The microbiota of CD patients and controls represent alternative ecological states that have broad differences in functional capabilities, including xenobiotic and environmental pollutant degradation, succinate metabolism, and bacterial HtpG, all of which can affect barrier integrity and immune regulation. Moreover, our finding that SR patients were more similar to healthy controls suggests that community microbial function, as inferred from fecal microbiomes, could serve as a valuable diagnostic tool.
- Published
- 2016
- Full Text
- View/download PDF
31. Inference of Episodic Changes in Natural Selection Acting on Protein Coding Sequences via CODEML.
- Author
-
Bielawski JP, Baker JL, and Mingrone J
- Subjects
- Codon chemistry, Evolution, Molecular, Likelihood Functions, Computational Biology methods, Selection, Genetic, Software
- Abstract
This unit provides protocols for using the CODEML program from the PAML package to make inferences about episodic natural selection in protein-coding sequences. The protocols cover inference tasks such as maximum likelihood estimation of selection intensity, testing the hypothesis of episodic positive selection, and identifying sites with a history of episodic evolution. We provide protocols for using the rich set of models implemented in CODEML to assess robustness, and for using bootstrapping to assess if the requirements for reliable statistical inference have been met. An example dataset is used to illustrate how the protocols are used with real protein-coding sequences. The workflow of this design, through automation, is readily extendable to a larger-scale evolutionary survey. © 2016 by John Wiley & Sons, Inc.
- Published
- 2016
- Full Text
- View/download PDF
32. Functional Divergence of the Nuclear Receptor NR2C1 as a Modulator of Pluripotentiality During Hominid Evolution.
- Author
-
Baker JL, Dunn KA, Mingrone J, Wood BA, Karpinski BA, Sherwood CC, Wildman DE, Maynard TM, and Bielawski JP
- Subjects
- Animals, Cell Line, Conserved Sequence, Humans, Mice, Nanog Homeobox Protein genetics, Nanog Homeobox Protein metabolism, Nuclear Receptor Subfamily 2, Group C, Member 1 chemistry, Octamer Transcription Factor-3 genetics, Octamer Transcription Factor-3 metabolism, Phosphoenolpyruvate Carboxykinase (ATP) genetics, Phosphoenolpyruvate Carboxykinase (ATP) metabolism, Pluripotent Stem Cells metabolism, Protein Domains, Cell Differentiation genetics, Evolution, Molecular, Hominidae genetics, Nuclear Receptor Subfamily 2, Group C, Member 1 genetics, Pluripotent Stem Cells cytology
- Abstract
Genes encoding nuclear receptors (NRs) are attractive as candidates for investigating the evolution of gene regulation because they (1) have a direct effect on gene expression and (2) modulate many cellular processes that underlie development. We employed a three-phase investigation linking NR molecular evolution among primates with direct experimental assessment of NR function. Phase 1 was an analysis of NR domain evolution and the results were used to guide the design of phase 2, a codon-model-based survey for alterations of natural selection within the hominids. By using a series of reliability and robustness analyses we selected a single gene, NR2C1, as the best candidate for experimental assessment. We carried out assays to determine whether changes between the ancestral and extant NR2C1s could have impacted stem cell pluripotency (phase 3). We evaluated human, chimpanzee, and ancestral NR2C1 for transcriptional modulation of Oct4 and Nanog (key regulators of pluripotency and cell lineage commitment), promoter activity for Pepck (a proxy for differentiation in numerous cell types), and average size of embryological stem cell colonies (a proxy for the self-renewal capacity of pluripotent cells). Results supported the signal for alteration of natural selection identified in phase 2. We suggest that adaptive evolution of gene regulation has impacted several aspects of pluripotentiality within primates. Our study illustrates that the combination of targeted evolutionary surveys and experimental analysis is an effective strategy for investigating the evolution of gene regulation with respect to developmental phenotypes., (Copyright © 2016 by the Genetics Society of America.)
- Published
- 2016
- Full Text
- View/download PDF
33. Novel Strategies for Applied Metagenomics.
- Author
-
Moore-Connors JM, Dunn KA, Bielawski JP, and Van Limbergen J
- Subjects
- Gene Expression Profiling, Humans, Gastrointestinal Tract metabolism, Gastrointestinal Tract microbiology, Metagenome genetics, Metagenomics methods, Sequence Analysis, DNA methods
- Abstract
Detailed analyses of the gut microbiome and its effect on human physiology and disease are emerging, thanks to advances in high-throughput DNA-sequencing technology and the burgeoning field of metagenomics. Metagenomics examines the structure and functional potential of microbial communities in their native habitats through the direct isolation and analysis of community DNA. In inflammatory bowel disease, gut microbiome studies have shown an association with perturbations in community composition and, especially, function. In this review, we discuss the application of next-generation sequencing to microbiome research and highlight the importance of modeling microbiome structure and function to the future of inflammatory bowel disease research and treatment.
- Published
- 2016
- Full Text
- View/download PDF
34. Seasonal assemblages and short-lived blooms in coastal north-west Atlantic Ocean bacterioplankton.
- Author
-
El-Swais H, Dunn KA, Bielawski JP, Li WK, and Walsh DA
- Subjects
- Alphaproteobacteria growth & development, Atlantic Ocean, Bayes Theorem, Diatoms growth & development, Flavobacteriaceae, Gammaproteobacteria growth & development, RNA, Ribosomal, 16S genetics, Seasons, Sphingobacterium growth & development, Synechococcus growth & development, Phytoplankton growth & development, Seawater microbiology
- Abstract
Temperate oceans are inhabited by diverse and temporally dynamic bacterioplankton communities. However, the role of the environment, resources and phytoplankton dynamics in shaping marine bacterioplankton communities at different time scales remains poorly constrained. Here, we combined time series observations (time scales of weeks to years) with molecular analysis of formalin-fixed samples from a coastal inlet of the north-west Atlantic Ocean to show that a combination of temperature, nitrate, small phytoplankton and Synechococcus abundances are best predictors for annual bacterioplankton community variability, explaining 38% of the variation. Using Bayesian mixed modelling, we identified assemblages of co-occurring bacteria associated with different seasonal periods, including the spring bloom (e.g. Polaribacter, Ulvibacter, Alteromonadales and ARCTIC96B-16) and the autumn bloom (e.g. OM42, OM25, OM38 and Arctic96A-1 clades of Alphaproteobacteria, and SAR86, OM60 and SAR92 clades of Gammaproteobacteria). Community variability over spring bloom development was best explained by silicate (32%)--an indication of rapid succession of bacterial taxa in response to diatom biomass--while nanophytoplankton as well as picophytoplankton abundance explained community variability (16-27%) over the transition into and out of the autumn bloom. Moreover, the seasonal structure was punctuated with short-lived blooms of rare bacteria including the KSA-1 clade of Sphingobacteria related to aromatic hydrocarbon-degrading bacteria., (© 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.)
- Published
- 2015
- Full Text
- View/download PDF
35. BioMiCo: a supervised Bayesian model for inference of microbial community structure.
- Author
-
Shafiei M, Dunn KA, Boon E, MacDonald SM, Walsh DA, Gu H, and Bielawski JP
- Abstract
Background: Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex, the number of species is huge and abundance information for many species is often sparse. Classical methods have a limited value for identifying complex features within such data., Results: Here, we describe a novel hierarchical model for Bayesian inference of microbial communities (BioMiCo). The model takes abundance data derived from environmental DNA, and models the composition of each sample by a two-level hierarchy of mixture distributions constrained by Dirichlet priors. BioMiCo is supervised, using known features for samples and appropriate prior constraints to overcome the challenges posed by many variables, sparse data, and large numbers of rare species. The model is trained on a portion of the data, where it learns how assemblages of species are mixed to form communities and how assemblages are related to the known features of each sample. Training yields a model that can predict the features of new samples. We used BioMiCo to build models for three serially sampled datasets and tested their predictive accuracy across different time points. The first model was trained to predict both body site (hand, mouth, and gut) and individual human host. It was able to reliably distinguish these features across different time points. The second was trained on vaginal microbiomes to predict both the Nugent score and individual human host. We found that women having normal and elevated Nugent scores had distinct microbiome structures that persisted over time, with additional structure within women having elevated scores. The third was trained for the purpose of assessing seasonal transitions in a coastal bacterial community. 
Application of this model to a high-resolution time series permitted us to track the rate and time of community succession and accurately predict known ecosystem-level events., Conclusion: BioMiCo provides a framework for learning the structure of microbial communities and for making predictions based on microbial assemblages. By training on carefully chosen features (abiotic or biotic), BioMiCo can be used to understand and predict transitions between complex communities composed of hundreds of microbial species.
- Published
- 2015
- Full Text
- View/download PDF
36. BiomeNet: a Bayesian model for inference of metabolic divergence among microbial communities.
- Author
-
Shafiei M, Dunn KA, Chipman H, Gu H, and Bielawski JP
- Subjects
- Algorithms, Animals, Bayes Theorem, Carnivory physiology, Herbivory physiology, Humans, Inflammatory Bowel Diseases metabolism, Inflammatory Bowel Diseases microbiology, Metagenome, Microbiota genetics, Reproducibility of Results, Computational Biology methods, Microbiota physiology, Models, Biological
- Abstract
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group. Through this framework, the model can capture nested structures within the data. BiomeNet is unique in modeling each metagenome sample as a mixture of complex metabolic systems (metabosystems). The metabosystems are composed of mixtures of tightly connected metabolic subnetworks. BiomeNet differs from other unsupervised methods by allowing researchers to discriminate groups of samples through the metabolic patterns it discovers in the data, and by providing a framework for interpreting them. We describe a collapsed Gibbs sampler for inference of the mixture weights under BiomeNet, and we use simulation to validate the inference algorithm. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among inflammatory bowel disease (IBD) patients. Based on the discriminatory subnetworks for this metabosystem, we inferred that the community is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and likely to interfere with human uptake of an antioxidant connected to IBD. 
Because this metabosystem has a greater capacity to exploit host-associated glycans, we speculate that IBD-associated communities might arise from opportunist growth of bacteria that can circumvent the host's nutrient-based mechanism for bacterial partner selection.
- Published
- 2014
- Full Text
- View/download PDF
37. Inference of functional divergence among proteins when the evolutionary process is non-stationary.
- Author
-
Bay RA and Bielawski JP
- Subjects
- Adaptation, Physiological, Amino Acid Substitution, Biological Evolution, Biostatistics methods, Computer Simulation, Ecotype, Genome, Bacterial, Light, Likelihood Functions, Proteins metabolism, Selection, Genetic, Codon, Evolution, Molecular, Prochlorococcus genetics, Proteins genetics
- Abstract
Functional shifts during protein evolution are expected to yield shifts in substitution rate, and statistical methods can test for this at both codon and amino acid levels. Although methods based on models of sequence evolution serve as powerful tools for studying evolutionary processes, violating underlying assumptions can lead to false biological conclusions. It is not unusual for functional shifts to be accompanied by changes in other aspects of the evolutionary process, such as codon or amino acid frequencies. However, models used to test for functional divergence assume these frequencies remain constant over time. We employed simulation to investigate the impact of non-stationary evolution on functional divergence inference. We investigated three likelihood ratio tests based on codon models and found varying degrees of sensitivity. Joint effects of shifts in frequencies and selection pressures can be large, leading to false signals for positive selection. Amino acid-based tests (FunDi and Bivar) were also compromised when several aspects of the substitution process were not adequately modeled. We applied the same tests to a core genome "scan" for functional divergence between light-adapted ecotypes of the cyanobacteria Prochlorococcus, and carried out gene-specific simulations for ten genes. Results of those simulations illustrated how the inference of functional divergence at the genomic level can be seriously impacted by model misspecification. Although computationally costly, simulations motivated by data in hand are warranted when several aspects of the substitution process are either misspecified or not included in the models upon which the statistical tests were built.
- Published
- 2013
- Full Text
- View/download PDF
38. Improving evolutionary models for mitochondrial protein data with site-class specific amino acid exchangeability matrices.
- Author
-
Dunn KA, Jiang W, Field C, and Bielawski JP
- Subjects
- Algorithms, Animals, Cluster Analysis, Computational Biology methods, Fishes, Humans, Likelihood Functions, Mammals, Mitochondrial Proteins chemistry, Reproducibility of Results, Amino Acid Substitution, Evolution, Molecular, Mitochondrial Proteins genetics, Models, Genetic
- Abstract
Adequate modeling of mitochondrial sequence evolution is an essential component of mitochondrial phylogenomics (comparative mitogenomics). There is wide recognition within the field that lineage-specific aspects of mitochondrial evolution should be accommodated through lineage-specific amino-acid exchangeability matrices (e.g., mtMam for mammalian data). However, such a matrix must be applied to all sites and this implies that all sites are subject to the same, or largely similar, evolutionary constraints. This assumption is unjustified. Indeed, substantial differences are expected to arise from three-dimensional structures that impose different physiochemical environments on individual amino acid residues. The objectives of this paper are (1) to investigate the extent to which amino acid evolution varies among sites of mitochondrial proteins, and (2) to assess the potential benefits of explicitly modeling such variability. To achieve this, we developed a novel method for partitioning sites based on amino acid physiochemical properties. We apply this method to two datasets derived from complete mitochondrial genomes of mammals and fish, and use maximum likelihood to estimate amino acid exchangeabilities for the different groups of sites. Using this approach we identified large groups of sites evolving under unique physiochemical constraints. Estimates of amino acid exchangeabilities differed significantly among such groups. Moreover, we found that joint estimates of amino acid exchangeabilities do not adequately represent the natural variability in evolutionary processes among sites of mitochondrial proteins. Significant improvements in likelihood are obtained when the new matrices are employed. We also find that maximum likelihood estimates of branch lengths can be strongly impacted. We provide sets of matrices suitable for groups of sites subject to similar physiochemical constraints, and discuss how they might be used to analyze real data. 
We also discuss how the general approach might be employed to improve a variety of mitogenomic-based research activities.
- Published
- 2013
- Full Text
- View/download PDF
39. Detecting the signatures of adaptive evolution in protein-coding genes.
- Author
-
Bielawski JP
- Subjects
- Animals, Base Sequence, Computer Simulation, Humans, Models, Genetic, Molecular Sequence Data, Phylogeny, Software, Codon genetics, Evolution, Molecular, Proteins genetics, Selection, Genetic
- Abstract
The field of molecular evolution, which includes genome evolution, is devoted to finding variation within and between groups of organisms and explaining the processes responsible for generating this variation. Many DNA changes are believed to have little to no functional effect, and a neutral process will best explain their evolution. Thus, a central task is to discover which changes had positive fitness consequences and were subject to Darwinian natural selection during the course of evolution. Due to the size and complexity of modern molecular datasets, the field has come to rely extensively on statistical modeling techniques to meet this analytical challenge. For DNA sequences that encode proteins, one of the most powerful approaches is to employ a statistical model of codon evolution. This unit provides a general introduction to the practice of modeling codon evolution using the statistical framework of maximum likelihood. Four real-data analysis activities are used to illustrate the principles of parameter estimation, robustness, hypothesis testing, and site classification. Each activity includes an explicit analytical protocol based on programs provided by the Phylogenetic Analysis by Maximum Likelihood (PAML) package., (© 2013 by John Wiley & Sons, Inc.)
- Published
- 2013
- Full Text
- View/download PDF
40. Recombination detection under evolutionary scenarios relevant to functional divergence.
- Author
-
Bay RA and Bielawski JP
- Subjects
- Codon, Computer Simulation, Genome, Evolution, Molecular, Models, Genetic, Prochlorococcus genetics, Recombination, Genetic genetics, Selection, Genetic
- Abstract
Recombination can negatively impact methods designed to detect divergent gene function that rely on explicit knowledge of a gene tree. However, we know little about how recombination detection methods perform under evolutionary scenarios encountered in studies of functional molecular divergence. We use simulation to evaluate false positive rates for six recombination detection methods (GENECONV, MaxChi, Chimera, RDP, GARD-SBP, GARD-MBP) under evolutionary scenarios that might increase false positives. Broadly, these scenarios address: (i) asymmetric tree topology and sequence divergence, (ii) non-stationary codon bias and selection pressure, and (iii) positive selection. We also evaluate power to detect recombination under truly recombinant history. As with previous studies, we find that power increases with sequence divergence. However, we also find that accuracy to correctly infer the number of breakpoints is extremely low. When recombination is absent, increased sequence divergence leads to increased false positives. Furthermore, one method (GARD-SBP) is sensitive to tree shape, with higher false positive rates under an asymmetric tree topology. Somewhat surprisingly, all methods are robust to the simulated heterogeneity in codon bias, shifts in selection pressure and presence of positive selection. Based on these findings, we recommend that studies of functional divergence in systems where recombination is plausible can, and should, include a pre-test for recombination. Application of all methods to the core genome of Prochlorococcus reveals a substantial lack of concordance among results. Based on analysis of both real and simulated datasets we present some guidelines for the investigation of recombination in genes that may have experienced functional divergence.
- Published
- 2011
- Full Text
- View/download PDF
41. Positive Darwinian selection in the piston that powers proton pumps in complex I of the mitochondria of Pacific salmon.
- Author
-
Garvin MR, Bielawski JP, and Gharrett AJ
- Subjects
- Adenosine Triphosphate chemistry, Amino Acid Sequence, Animals, Bayes Theorem, Biological Evolution, Evolution, Molecular, Genomics, Molecular Conformation, Molecular Sequence Data, Oxygen chemistry, Phosphorylation, Phylogeny, Salmon, Sequence Analysis, DNA, Sequence Homology, Amino Acid, Mitochondria metabolism, Proton Pumps physiology
- Abstract
The mechanism of oxidative phosphorylation is well understood, but evolution of the proteins involved is not. We combined phylogenetic, genomic, and structural biology analyses to examine the evolution of twelve mitochondrial encoded proteins of closely related, yet phenotypically diverse, Pacific salmon. Two separate analyses identified the same seven positively selected sites in ND5. A strong signal was also detected at three sites of ND2. An energetic coupling analysis revealed several structures in the ND5 protein that may have co-evolved with the selected sites. These data implicate Complex I, specifically the piston arm of ND5 where it connects the proton pumps, as important in the evolution of Pacific salmon. Lastly, the lineage to Chinook experienced rapid evolution at the piston arm.
- Published
- 2011
- Full Text
- View/download PDF
42. Reconciling ecological and genomic divergence among lineages of Listeria under an "extended mosaic genome concept".
- Author
-
Dunn KA, Bielawski JP, Ward TJ, Urquhart C, and Gu H
- Subjects
- Phylogeny, Evolution, Molecular, Genome, Bacterial genetics, Listeria genetics
- Abstract
There is growing evidence for a discontinuity between genomic and ecological divergence in several groups of bacteria. This evidence is difficult to reconcile with the traditional concept that ecologically divergent species maintain a cohesive gene pool isolated from other gene pools by barriers to homologous recombination (HR). There have been several innovative models of bacterial divergence that permit such discontinuity; we refer to these, collectively, as "mosaic genome concepts" (MGCs). These concepts remain a point of contention. Here, we undertake an investigation among ecologically divergent lineages of genus Listeria, and report our assessment of both niche-specific selection pressure and HR in their core genome. We find evidence of a mosaic Listeria core genome. Some core genes appear to have been free to recombine across ecologically divergent lineages or across named species. In contrast, other core genes have histories consistent with the expected organism relationships and have evolved under niche-specific selective pressures. The products of some of those genes can even be linked to metabolic phenotypes with ecological significance. This finding indicates a potentially strong connection between ecological divergence and core-genome evolution, even among lineages that also experience frequent recombination. Based on these findings, we propose an expanded role for natural selection in core-genome evolution under the MGC.
- Published
- 2009
- Full Text
- View/download PDF
43. Trade-offs between efficiency and robustness in bacterial metabolic networks are associated with niche breadth.
- Author
-
Morine MJ, Gu H, Myers RA, and Bielawski JP
- Subjects
- Models, Biological, Phylogeny, Regression Analysis, Bacteria metabolism, Metabolic Networks and Pathways
- Abstract
The relation between structure and function in biologic networks is a central point of systems biology research. Key functional features--notably, efficiency and robustness--are linked to the topologic structure of a network, and there appears to be a degree of trade-off between these features, i.e., simulation studies indicate that more efficient networks tend to be less robust. Here, we investigate this issue in metabolic networks from 105 lineages of bacteria having a wide range of ecologies. We take quantitative measurements on each network and integrate this network data with ecologic data using a phylogenetic comparative model. In this setting, we find that biologic conclusions obtained with classical phylogenetic comparative methods are sensitive to correlations between model covariates and phylogenetic branch length. To avoid this problem, we propose a revised statistical framework--hierarchical mixed-effect regression--to accommodate phylogenetic nonindependence. Using this approach, we show that the cartography of metabolic networks does indeed reflect a trade-off between efficiency and robustness. Furthermore, ecologic characteristics related to niche breadth are strong predictors of network shape. Given the broad variation in niche breadth seen among species, we predict that there is no universally optimal balance between efficiency and robustness in bacterial metabolic networks and, thus, no universally optimal network structure. These results highlight the biologic relevance of variation in network structure and the potential role of niche breadth in shaping metabolic strategies of efficiency and robustness.
- Published
- 2009
- Full Text
- View/download PDF
44. Multilocus genotyping assays for single nucleotide polymorphism-based subtyping of Listeria monocytogenes isolates.
- Author
-
Ward TJ, Ducey TF, Usgaard T, Dunn KA, and Bielawski JP
- Subjects
- Bacterial Proteins genetics, DNA, Bacterial chemistry, Genotype, Humans, Molecular Sequence Data, Reproducibility of Results, Virulence Factors genetics, Bacterial Typing Techniques methods, DNA, Bacterial genetics, Food Microbiology, Listeria monocytogenes genetics, Polymorphism, Single Nucleotide, Sequence Analysis, DNA methods
- Abstract
Listeria monocytogenes is responsible for serious invasive illness associated with consumption of contaminated food and places a significant burden on public health and the agricultural economy. We recently developed a multilocus genotyping (MLGT) assay for high-throughput subtype determination of L. monocytogenes lineage I isolates based on interrogation of single nucleotide polymorphisms (SNPs) via multiplexed primer extension reactions. Here we report the development and validation of two additional MLGT assays that address the need for comprehensive DNA sequence-based subtyping of L. monocytogenes. The first of these novel MLGT assays targeted variation segregating within lineage II, while the second assay combined probes for lineage III strains with probes for strains representing a recently characterized fourth evolutionary lineage (IV) of L. monocytogenes. These assays were based on nucleotide variation identified in >3.8 Mb of comparative DNA sequence and consisted of 115 total probes that differentiated 93% of the 100 haplotypes defined by the multilocus sequence data. MLGT reproducibly typed the 173 isolates used in SNP discovery, and the 10,448 genotypes derived from MLGT analysis of these isolates were consistent with DNA sequence data. Application of the MLGT assays to assess subtype prevalence among isolates from ready-to-eat foods and food-processing facilities indicated a low frequency (6.3%) of epidemic clone subtypes and a substantial population of isolates (>30%) harboring mutations in inlA associated with attenuated virulence in cell culture and animal models. These mutations were restricted to serogroup 1/2 isolates, which may explain the overrepresentation of serotype 4b isolates in human listeriosis cases.
- Published
- 2008
- Full Text
- View/download PDF
45. Portal protein diversity and phage ecology.
- Author
-
Sullivan MB, Coleman ML, Quinlivan V, Rosenkrantz JE, Defrancesco AS, Tan G, Fu R, Lee JA, Waterbury JB, Bielawski JP, and Chisholm SW
- Subjects
- Bacteriophages physiology, Cluster Analysis, DNA, Viral chemistry, DNA, Viral genetics, Ecosystem, Molecular Sequence Data, Phylogeny, Sequence Analysis, DNA, Sequence Homology, Virus Assembly, Bacteriophages classification, Bacteriophages genetics, Genetic Variation, Prochlorococcus virology, Synechococcus virology, Viral Proteins genetics
- Abstract
Oceanic phages are critical components of the global ecosystem, where they play a role in microbial mortality and evolution. Our understanding of phage diversity is greatly limited by the lack of useful genetic diversity measures. Previous studies, focusing on myophages that infect the marine cyanobacterium Synechococcus, have used the coliphage T4 portal-protein-encoding homologue, gene 20 (g20), as a diversity marker. These studies revealed 10 sequence clusters, 9 oceanic and 1 freshwater, where only 3 contained cultured representatives. We sequenced g20 from 38 marine myophages isolated using a diversity of Synechococcus and Prochlorococcus hosts to see if any would fall into the clusters that lacked cultured representatives. On the contrary, all fell into the three clusters that already contained sequences from cultured phages. Further, there was no obvious relationship between host of isolation, or host range, and g20 sequence similarity. We next expanded our analyses to all available g20 sequences (769 sequences), which include PCR amplicons from wild uncultured phages, non-PCR amplified sequences identified in the Global Ocean Survey (GOS) metagenomic database, as well as sequences from cultured phages, to evaluate the relationship between g20 sequence clusters and habitat features from which the phage sequences were isolated. Even in this meta-data set, very few sequences fell into the sequence clusters without cultured representatives, suggesting that the latter are very rare, or sequencing artefacts. In contrast, sequences most similar to the culture-containing clusters, the freshwater cluster and two novel clusters, were more highly represented, with one particular culture-containing cluster representing the dominant g20 genotype in the unamplified GOS sequence data. 
Finally, while some g20 sequences were non-randomly distributed with respect to habitat, there were always numerous exceptions to general patterns, indicating that phage portal proteins are not good predictors of a phage's host or the habitat in which a particular phage may thrive.
- Published
- 2008
46. Likelihood-based clustering (LiBaC) for codon models, a method for grouping sites according to similarities in the underlying process of evolution.
- Author
-
Bao L, Gu H, Dunn KA, and Bielawski JP
- Subjects
- Computer Simulation, Models, Statistical, Rickettsia genetics, Selection, Genetic, Cluster Analysis, Codon genetics, Evolution, Molecular, Likelihood Functions, Models, Genetic
- Abstract
Models of codon evolution are useful for investigating the strength and direction of natural selection via a parameter for the nonsynonymous/synonymous rate ratio (omega = d(N)/d(S)). Different codon models are available to account for diversity of the evolutionary patterns among sites. Codon models that specify data partitions as fixed effects allow the most evolutionary diversity among sites but require that site partitions are a priori identifiable. Models that use a parametric distribution to express the variability in the omega ratio across sites do not require a priori partitioning of sites, but they permit less among-site diversity in the evolutionary process. Simulation studies presented in this paper indicate that differences among sites in estimates of omega under an overly simplistic analytical model can reflect more than just natural selection pressure. We also find that the classic likelihood ratio tests for positive selection have a high false-positive rate in some situations. In this paper, we developed a new method for assigning codon sites into groups where each group has a different model, and the likelihood over all sites is maximized. The method, called likelihood-based clustering (LiBaC), can be viewed as a generalization of the family of model-based clustering approaches to models of codon evolution. We report the performance of several LiBaC-based methods, and selected alternative methods, over a wide variety of scenarios. We find that LiBaC, under an appropriate model, can provide reliable parameter estimates when the process of evolution is very heterogeneous among groups of sites. Certain types of proteins, such as transmembrane proteins, are expected to exhibit such heterogeneity. A survey of genes encoding transmembrane proteins suggests that overly simplistic models could be leading to false signal for positive selection among such genes.
In these cases, LiBaC-based methods offer an important addition to a "toolbox" of methods thereby helping to uncover robust evidence for the action of positive selection.
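The grouping idea in the abstract above (each site is assigned to a group with its own model so that the likelihood over all sites is maximized) can be illustrated with a minimal hard-assignment sketch. This is not the LiBaC implementation: it assumes the per-site log-likelihoods under each candidate group model have already been computed, it omits the parameter estimation that a full method would need, and the function name and numbers are hypothetical.

```python
def assign_sites(site_log_liks):
    """Assign each site to the group model under which it is most likely.

    site_log_liks: one list per site, holding that site's log-likelihood
    under each candidate group model (hypothetical, precomputed inputs).
    Returns (labels, total_log_lik) for the hard assignment.
    """
    labels = []
    total = 0.0
    for per_model in site_log_liks:
        # Index of the model with the highest log-likelihood for this site.
        best = max(range(len(per_model)), key=per_model.__getitem__)
        labels.append(best)
        total += per_model[best]
    return labels, total


# Three sites scored under two candidate models (made-up values).
example = [[-2.0, -3.5], [-4.1, -1.2], [-0.9, -0.95]]
labels, total = assign_sites(example)
print(labels, total)
```

In a complete model-based clustering procedure, an assignment step like this would typically alternate with re-estimating each group's model parameters until the total log-likelihood stops improving.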
- Published
- 2008
47. Methods for selecting fixed-effect models for heterogeneous codon evolution, with comments on their application to gene and genome data.
- Author
-
Bao L, Gu H, Dunn KA, and Bielawski JP
- Subjects
- Animals, Computer Simulation, Flagella genetics, Genes, Listeria genetics, Mucoproteins genetics, Polymorphism, Genetic, Selection, Genetic, Codon analysis, Evolution, Molecular, Genome, Models, Genetic, Sequence Analysis, DNA methods
- Abstract
Background: Models of codon evolution have proven useful for investigating the strength and direction of natural selection. In some cases, a priori biological knowledge has been used successfully to model heterogeneous evolutionary dynamics among codon sites. These are called fixed-effect models, and they require that all codon sites are assigned to one of several partitions which are permitted to have independent parameters for selection pressure, evolutionary rate, transition to transversion ratio or codon frequencies. For single gene analysis, partitions might be defined according to protein tertiary structure, and for multiple gene analysis partitions might be defined according to a gene's functional category. Given a set of related fixed-effect models, the task of selecting the model that best fits the data is not trivial. Results: In this study, we implement a set of fixed-effect codon models which allow for different levels of heterogeneity among partitions in the substitution process. We describe strategies for selecting among these models by a backward elimination procedure, Akaike information criterion (AIC) or a corrected Akaike information criterion (AICc). We evaluate the performance of these model selection methods via a simulation study, and make several recommendations for real data analysis. Our simulation study indicates that the backward elimination procedure can provide a reliable method for model selection in this setting. We also demonstrate the utility of these models by application to a single-gene dataset partitioned according to tertiary structure (abalone sperm lysin), and a multi-gene dataset partitioned according to the functional category of the gene (flagellar-related proteins of Listeria). Conclusion: Fixed-effect models have advantages and disadvantages. Fixed-effect models are desirable when data partitions are known to exhibit significant heterogeneity or when a statistical test of such heterogeneity is desired.
They have the disadvantage of requiring a priori knowledge for partitioning sites. We recommend: (i) selecting models by backward elimination rather than AIC or AICc, (ii) using a stringent cut-off, e.g., p = 0.0001, and (iii) conducting a sensitivity analysis of the results. With thoughtful application, fixed-effect codon models should provide a useful tool for large scale multi-gene analyses.
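The selection criteria named in this abstract can be sketched with their standard textbook definitions: AIC = 2k - 2 lnL, AICc = AIC + 2k(k+1)/(n-k-1), and a likelihood ratio test between nested models as the decision rule inside a backward elimination step. This is a minimal illustration, not the authors' implementation. The function names and log-likelihood values are hypothetical, the choice of sample size n for AICc (for example, the number of codon sites) is an assumption, and the chi-square reference distribution is the usual asymptotic approximation.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; n is the assumed sample size."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{j<df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer powers.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total


def keep_simpler(lnl_full, lnl_reduced, df, alpha=1e-4):
    """One backward elimination decision between nested models.

    Returns True when the likelihood ratio test does not reject the
    reduced model at the (stringent) cut-off alpha.
    """
    stat = max(2.0 * (lnl_full - lnl_reduced), 0.0)
    return chi2_sf(stat, df) >= alpha


# Hypothetical log-likelihoods for a full model and a reduced model
# that drops one parameter.
print(aic(-1000.0, 12), aicc(-1000.0, 12, 300))
print(keep_simpler(-1000.0, -1002.0, df=1))
```

With the stringent cut-off recommended in the abstract (p = 0.0001), extra parameters are retained only when the improvement in log-likelihood is large, so this rule tends to favor simpler models than a conventional 0.05 threshold would.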
- Published
- 2007
48. Proposed standard nomenclature for the alpha- and beta-globin gene families.
- Author
-
Aguileta G, Bielawski JP, and Yang Z
- Subjects
- Animals, Humans, Globins genetics, Multigene Family, Terminology as Topic
- Abstract
The globin family of genes and proteins has been a recurrent object of study for many decades. This interest has generated a vast amount of knowledge. However, it has also created an inconsistent and confusing nomenclature, owing to the lack of a systematic approach to naming genes and the failure to reflect the phylogenetic relationships among genes of the gene family. To alleviate the problems with the existing system, here we propose a standardized nomenclature for the alpha and beta globin family of genes, based on a phylogenetic analysis of vertebrate alpha and beta globins, and following the Guidelines for Human Gene Nomenclature.
- Published
- 2006
49. Evolutionary rate variation among vertebrate beta globin genes: implications for dating gene family duplication events.
- Author
-
Aguileta G, Bielawski JP, and Yang Z
- Subjects
- Animals, Bayes Theorem, Birds genetics, Databases, Genetic, Gene Conversion, Genetic Variation, Humans, Likelihood Functions, Marsupialia genetics, Multigene Family, Phylogeny, Time Factors, Evolution, Molecular, Gene Duplication, Globins genetics, Vertebrates genetics
- Abstract
A comprehensive dataset of 62 beta globin gene sequences from various vertebrates was compiled to test the molecular clock and to estimate dates of gene duplications. We found that evolution of the beta globin family of genes is not clock-like, a result that is at odds with the common use of this family as an example of a constant rate of evolution over time. Divergence dates were estimated either with or without assuming the molecular clock, and both analyses produced similar date estimates, which are also in general agreement with estimates reported previously. In addition, we report date estimates for seven previously unexamined duplication events within the beta globin family. Despite multiple sources of rate variation, the average rate across the beta globin phylogeny yielded reasonable estimates of divergence dates in most cases. Exceptions were cases of gene conversion, which appears to have led to underestimates of divergence dates. Our results suggest (i) the major duplications giving rise to the paralogous beta globin genes are associated with significant evolutionary rate variation among gene lineages; and (ii) genes arising from more recent gene duplications (e.g., tandem duplications within lineages) do not appear to differ greatly in rate. We believe this pattern reflects a complex interplay of evolutionary forces where natural selection for diversifying paralogous functions and lineage-specific effects contribute to rate variation on a long-term basis, while gene conversion tends to increase sequence similarity. Gene conversion effects appear to be stronger on recent gene duplicates, as their sequences are highly similar. Lastly, phylogenetic analyses do not support a previous report that avian globins are members of a relic lineage of omega globins.
- Published
- 2006
50. Large-scale analyses of synonymous substitution rates can be sensitive to assumptions about the process of mutation.
- Author
-
Aris-Brosou S and Bielawski JP
- Subjects
- Animals, Codon genetics, Computer Simulation, Databases, Genetic, Genome, Selection, Genetic, Time Factors, Amino Acid Substitution, Evolution, Molecular, Models, Genetic, Mutation
- Abstract
A popular approach to examine the roles of mutation and selection in the evolution of genomes has been to consider the relationship between codon bias and synonymous rates of molecular evolution. A significant relationship between these two quantities is taken to indicate the action of weak selection on substitutions among synonymous codons. The neutral theory predicts that the rate of evolution is inversely related to the level of functional constraint. Therefore, selection against the use of non-preferred codons among those coding for the same amino acid should result in lower rates of synonymous substitution as compared with sites not subject to such selection pressures. However, reliably measuring the extent of such a relationship is problematic, as estimates of synonymous rates are sensitive to our assumptions about the process of molecular evolution. Previous studies showed the importance of accounting for unequal codon frequencies, in particular when synonymous codon usage is highly biased. Yet, unequal codon frequencies can be modeled in different ways, making different assumptions about the mutation process. Here we conduct a simulation study to evaluate two different ways of modeling uneven codon frequencies and show that both model parameterizations can have a dramatic impact on rate estimates and affect biological conclusions about genome evolution. We reanalyze three large data sets to demonstrate the relevance of our results to empirical data analysis.
- Published
- 2006