40 results on '"Kainer D"'
Search Results
2. MinIONQC: fast and simple quality control for MinION sequencing data
- Author
- Lanfear, R, primary, Schalamun, M, additional, Kainer, D, additional, Wang, W, additional, and Schwessinger, B, additional
- Published
- 2018
3. MinIONQC: fast and simple quality control for MinION sequencing data.
- Author
- Lanfear, R, Schalamun, M, Kainer, D, Wang, W, and Schwessinger, B
- Subjects
- NUCLEOTIDE sequencing, SEQUENCE analysis, BIOINFORMATICS, QUALITY control
- Abstract
Summary: MinIONQC provides rapid diagnostic plots and quality control data from one or more flowcells of sequencing data from Oxford Nanopore Technologies' MinION instrument. It can be used to assist with the optimisation of extraction, library preparation, and sequencing protocols, to quickly and directly compare the data from many flowcells, and to provide publication-ready figures summarising sequencing data. Availability and implementation: MinIONQC is implemented in R and released under an MIT license. It is available for all platforms from https://github.com/roblanf/minion_qc. [ABSTRACT FROM AUTHOR]
- Published
- 2019
4. The IMSL environment for software development
- Author
- Aird, T. J and Kainer, D. G
- Subjects
- Computer Programming And Software
- Abstract
The International Mathematical and Statistical Library (IMSL) developed a set of macros and a file naming convention that automates the subroutine development and testing process over ten computer types. The IMSL software development system is implemented on a Data General Eclipse C330 computer with 256K bytes of central memory and 192M bytes of disk storage using the AOS Operating System. RJE activity is handled by a Data 100 communications computer. The system allows the programmer to work with basis decks. Distribution decks are generated by the IMSL FORTRAN converter as they are needed for testing and whenever the basis deck has been modified.
- Published
- 1978
5. The IMSL environment for software development.
- Author
- Aird, T. J. and Kainer, D. G.
- Published
- 1978
6. Modeling and analysis of competitive RT-PCR.
- Author
- Hayward, A L, Oefner, P J, Sabatini, S, Kainer, D B, Hinojos, C A, and Doris, P A
- Abstract
The present studies demonstrate a theoretical and practical framework for the accurate quantitation of gene expression in RNA extracted from microscopic tissue samples. The approaches are developed around competitive RT-PCR techniques. Assay performance has been examined and validated at both the RT and PCR steps. Our analysis of RT transcription efficiency for a number of native and competitor combinations shows that this property can differ, even for very similar templates. However, this difference is consistent and, once identified and measured, can be removed as an obstacle to accuracy. Using mathematical modeling, we have examined the simulated co-amplification of native and competitor templates in PCR. Useful insights have emerged from such modeling which indicate that differences in initial amplification efficiency and the rate of decay of amplification efficiency during the reaction can rapidly lead to inaccuracy, even while the slope and linearity of log plots of the competitor input and reaction product ratios are close to ideal. Finally, we show here that competitive RT-PCR reactions do not have to remain in the log-linear phase of PCR in order to accomplish accurate and precise quantification. Using appropriate competitors sharing primer binding sites and high internal sequence similarity, identical amplification efficiencies are preserved throughout the reaction. Reaction products, including heteroduplexes formed between native and competitor templates as reactions progress to plateau, can be identified and quantified accurately using the new technique of denaturing HPLC (dHPLC). This analytical technique allows the accuracy of competitive RT-PCR to be preserved beyond the linear phase. The technique has high sensitivity and precision and target abundances as low as 100 copies could be reliably estimated.
- Published
- 1998
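Illustrative note on result 6: the abstract reports that differences in initial amplification efficiency, or in how fast efficiency decays, distort the native-to-competitor ratio, while matched templates preserve it past the log-linear phase. The toy simulation below sketches that behaviour. It is not the authors' model: the function names `coamplify` and `ratio_after`, the geometric decay of efficiency, and every numeric value are assumptions made for illustration.

```python
def coamplify(native0, competitor0, eff_native, eff_competitor,
              decay_native, decay_competitor, cycles):
    """Toy co-amplification model (hypothetical, not the paper's equations).

    Each cycle multiplies a template's copy number by (1 + efficiency);
    the efficiency then shrinks geometrically by its own decay factor.
    Returns the final (native, competitor) copy numbers.
    """
    native, competitor = float(native0), float(competitor0)
    e_n, e_c = eff_native, eff_competitor
    for _ in range(cycles):
        native *= 1.0 + e_n
        competitor *= 1.0 + e_c
        e_n *= decay_native
        e_c *= decay_competitor
    return native, competitor


def ratio_after(cycles, **params):
    """Native/competitor product ratio after `cycles`, starting from 100:1000 copies."""
    native, competitor = coamplify(100, 1000, cycles=cycles, **params)
    return native / competitor


# Assumed parameter sets: identical templates vs. a small efficiency mismatch.
matched = dict(eff_native=0.95, eff_competitor=0.95,
               decay_native=0.9, decay_competitor=0.9)
mismatched = dict(eff_native=0.95, eff_competitor=0.90,
                  decay_native=0.9, decay_competitor=0.9)

if __name__ == "__main__":
    for cycles in (10, 25, 40):
        print(cycles, ratio_after(cycles, **matched), ratio_after(cycles, **mismatched))
```

Under these assumptions the matched pair keeps the input ratio at every cycle count, while the mismatched pair drifts upward as cycles accumulate, which is the qualitative point the abstract makes.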
7. The IMSL environment for software development
- Author
- Aird, T. J., primary and Kainer, D. G., additional
- Published
- 1978
8. Multi-omic network analysis identifies dysregulated neurobiological pathways in opioid addiction.
- Author
- Sullivan KA, Kainer D, Lane M, Cashman M, Miller JI, Garvin MR, Townsend A, Quach BC, Willis C, Kruse P, Gaddis NC, Mathur R, Corradin O, Maher BS, Scacheri PC, Sanchez-Roige S, Palmer AA, Troiani V, Chesler EJ, Kember RL, Kranzler HR, Justice AC, Xu K, Aouizerat BE, Hancock DB, Johnson EO, and Jacobson DA
- Abstract
Background: Opioid addiction is a worldwide public health crisis. In the United States, for example, opioids cause more drug overdose deaths than any other substance. Yet, opioid addiction treatments have limited efficacy, meaning that additional treatments are needed. Methods: To help address this problem, we used network-based machine learning techniques to integrate results from genome-wide association studies (GWAS) of opioid use disorder (OUD) and problematic prescription opioid misuse with transcriptomic, proteomic, and epigenetic data from the dorsolateral prefrontal cortex (dlPFC) of opioid overdose victims and controls. Results: We identified 211 highly interrelated genes identified by GWAS or dysregulation in the dlPFC of opioid overdose victims that implicated the Akt, BDNF, and ERK pathways, identifying 414 drugs targeting 48 of these opioid addiction-associated genes. Some of the identified drugs are approved to treat other substance use disorders (SUDs) or depression. Conclusions: Our synthesis of multi-omics using a systems biology approach revealed key gene targets that could contribute to drug repurposing, genetics-informed addiction treatment, and future discovery. (Copyright © 2024. Published by Elsevier Inc.)
- Published
- 2024
9. Centromeres are hotspots for chromosomal inversions and breeding traits in mango.
- Author
- Wilkinson MJ, McLay K, Kainer D, Elphinstone C, Dillon NL, Webb M, Wijesundara UK, Ali A, Bally ISE, Munyengwa N, Furtado A, Henry RJ, Hardner CM, and Ortiz-Barrientos D
- Abstract
Chromosomal inversions can preserve combinations of favorable alleles by suppressing recombination. Simultaneously, they reduce the effectiveness of purifying selection enabling deleterious alleles to accumulate. This study explores how areas of low recombination, including centromeric regions and chromosomal inversions, contribute to the accumulation of deleterious and favorable loci in 225 Mangifera indica genomes from the Australian Mango Breeding Program. Here, we identify 17 chromosomal inversions that cover 7.7% (29.7 Mb) of the M. indica genome: eight pericentric (inversion includes the centromere) and nine paracentric (inversion is on one arm of the chromosome). Our results show that these large pericentric inversions are accumulating deleterious loci, while the paracentric inversions show deleterious levels above and below the genome wide average. We find that despite their deleterious load, chromosomal inversions contain small effect loci linked to variation in crucial breeding traits. These results indicate that chromosomal inversions have likely facilitated the evolution of key mango breeding traits. Our study has important implications for selective breeding of favorable combinations of alleles in regions of low recombination. (© 2024 State of Queensland. New Phytologist © 2024 New Phytologist Foundation.)
- Published
- 2024
10. Analyses of GWAS signal using GRIN identify additional genes contributing to suicidal behavior.
- Author
- Sullivan KA, Lane M, Cashman M, Miller JI, Pavicic M, Walker AM, Cliff A, Romero J, Qin X, Mullins N, Docherty A, Coon H, Ruderfer DM, Garvin MR, Pestian JP, Ashley-Koch AE, Beckham JC, McMahon B, Oslin DW, Kimbrel NA, Jacobson DA, and Kainer D
- Subjects
- Humans, Gene Regulatory Networks, Phenotype, Polymorphism, Single Nucleotide, Suicide, Attempted, Genetic Predisposition to Disease, Genome-Wide Association Study, Suicide
- Abstract
Genome-wide association studies (GWAS) identify genetic variants underlying complex traits but are limited by stringent genome-wide significance thresholds. We present GRIN (Gene set Refinement through Interacting Networks), which increases confidence in the expanded gene set by retaining genes strongly connected by biological networks when GWAS thresholds are relaxed. GRIN was validated on both simulated interrelated gene sets as well as multiple GWAS traits. From multiple GWAS summary statistics of suicide attempt, a complex phenotype, GRIN identified additional genes that replicated across independent cohorts and retained biologically interrelated genes despite a relaxed significance threshold. We present a conceptual model of how these retained genes interact through neurobiological pathways that may influence suicidal behavior, and identify existing drugs associated with these pathways that would not have been identified under traditional GWAS thresholds. We demonstrate GRIN's utility in boosting GWAS results by increasing the number of true positive genes identified from GWAS results. (© 2024. The Author(s).)
- Published
- 2024
11. Quantum biological insights into CRISPR-Cas9 sgRNA efficiency from explainable-AI driven feature engineering.
- Author
- Noshay JM, Walker T, Alexander WG, Klingeman DM, Romero J, Walker AM, Prates E, Eckert C, Irle S, Kainer D, and Jacobson DA
- Subjects
- Artificial Intelligence, DNA, Escherichia coli genetics, Gene Editing, Humans, CRISPR-Cas Systems, RNA, Guide, CRISPR-Cas Systems
- Abstract
CRISPR-Cas9 tools have transformed genetic manipulation capabilities in the laboratory. Empirical rules-of-thumb have been developed for only a narrow range of model organisms, and mechanistic underpinnings for sgRNA efficiency remain poorly understood. This work establishes a novel feature set and new public resource, produced with quantum chemical tensors, for interpreting and predicting sgRNA efficiency. Feature engineering for sgRNA efficiency is performed using an explainable-artificial intelligence model: iterative Random Forest (iRF). By encoding quantitative attributes of position-specific sequences for Escherichia coli sgRNAs, we identify important traits for sgRNA design in bacterial species. Additionally, we show that expanding positional encoding to quantum descriptors of base-pair, dimer, trimer, and tetramer sequences captures intricate interactions in local and neighboring nucleotides of the target DNA. These features highlight variation in CRISPR-Cas9 sgRNA dynamics between E. coli and H. sapiens genomes. These novel encodings of sgRNAs enhance our understanding of the elaborate quantum biological processes involved in CRISPR-Cas9 machinery. (© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2023
12. A glimpse into the fungal metabolomic abyss: Novel network analysis reveals relationships between exogenous compounds and their outputs.
- Author
- Gopalakrishnan Meena M, Lane MJ, Tannous J, Carrell AA, Abraham PE, Giannone RJ, Ané JM, Keller NP, Labbé JL, Geiger AG, Kainer D, Jacobson DA, and Rush TA
- Abstract
Fungal specialized metabolites are a major source of beneficial compounds that are routinely isolated, characterized, and manufactured as pharmaceuticals, agrochemical agents, and industrial chemicals. The production of these metabolites is encoded by biosynthetic gene clusters that are often silent under standard growth conditions. There are limited resources for characterizing the direct link between abiotic stimuli and metabolite production. Herein, we introduce a network analysis-based, data-driven algorithm comprising two routes to characterize the production of specialized fungal metabolites triggered by different exogenous compounds: the direct route and the auxiliary route. Both routes elucidate the influence of treatments on the production of specialized metabolites from experimental data. The direct route determines known and putative metabolites induced by treatments and provides additional insight over traditional comparison methods. The auxiliary route is specific for discovering unknown analytes, and further identification can be curated through online bioinformatic resources. We validated our algorithm by applying chitooligosaccharides and lipids at two different temperatures to the fungal pathogen Aspergillus fumigatus . After liquid chromatography-mass spectrometry quantification of significantly produced analytes, we used network centrality measures to rank the treatments' ability to elucidate these analytes and confirmed their identity through fragmentation patterns or in silico spiking with commercially available standards. Later, we examined the transcriptional regulation of these metabolites through real-time quantitative polymerase chain reaction. Our data-driven techniques can complement existing metabolomic network analysis by providing an approach to track the influence of any exogenous stimuli on metabolite production. 
Our experimental-based algorithm can overcome the bottlenecks in elucidating novel fungal compounds used in drug discovery. (© The Author(s) 2023. Published by Oxford University Press on behalf of National Academy of Sciences.)
- Published
- 2023
13. Few-Shot Learning Enables Population-Scale Analysis of Leaf Traits in Populus trichocarpa.
- Author
- Lagergren J, Pavicic M, Chhetri HB, York LM, Hyatt D, Kainer D, Rutter EM, Flores K, Bailey-Bale J, Klein M, Taylor G, Jacobson D, and Streich J
- Abstract
Plant phenotyping is typically a time-consuming and expensive endeavor, requiring large groups of researchers to meticulously measure biologically relevant plant traits, and is the main bottleneck in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. In this work, we address these challenges by leveraging few-shot learning with convolutional neural networks to segment the leaf body and visible venation of 2,906 Populus trichocarpa leaf images obtained in the field. In contrast to previous methods, our approach (a) does not require experimental or image preprocessing, (b) uses the raw RGB images at full resolution, and (c) requires very few samples for training (e.g., just 8 images for vein segmentation). Traits relating to leaf morphology and vein topology are extracted from the resulting segmentations using traditional open-source image-processing tools, validated using real-world physical measurements, and used to conduct a genome-wide association study to identify genes controlling the traits. In this way, the current work is designed to provide the plant phenotyping community with (a) methods for fast and accurate image-based feature extraction that require minimal training data and (b) a new population-scale dataset, including 68 different leaf phenotypes, for domain scientists and machine learning researchers. All of the few-shot learning code, data, and results are made publicly available. (Copyright © 2023 John Lagergren et al.)
- Published
- 2023
14. Validation of a metabolite-GWAS network for Populus trichocarpa family 1 UDP-glycosyltransferases.
- Author
- Saint-Vincent PMB, Furches A, Galanie S, Teixeira Prates E, Aldridge JL, Labbe A, Zhao N, Martin MZ, Ranjan P, Jones P, Kainer D, Kalluri UC, Chen JG, Muchero W, Jacobson DA, and Tschaplinski TJ
- Abstract
Metabolite genome-wide association studies (mGWASs) are increasingly used to discover the genetic basis of target phenotypes in plants such as Populus trichocarpa , a biofuel feedstock and model woody plant species. Despite their growing importance in plant genetics and metabolomics, few mGWASs are experimentally validated. Here, we present a functional genomics workflow for validating mGWAS-predicted enzyme-substrate relationships. We focus on uridine diphosphate-glycosyltransferases (UGTs), a large family of enzymes that catalyze sugar transfer to a variety of plant secondary metabolites involved in defense, signaling, and lignification. Glycosylation influences physiological roles, localization within cells and tissues, and metabolic fates of these metabolites. UGTs have substantially expanded in P. trichocarpa , presenting a challenge for large-scale characterization. Using a high-throughput assay, we produced substrate acceptance profiles for 40 previously uncharacterized candidate enzymes. Assays confirmed 10 of 13 leaf mGWAS associations, and a focused metabolite screen demonstrated varying levels of substrate specificity among UGTs. A substrate binding model case study of UGT-23 rationalized observed enzyme activities and mGWAS associations, including glycosylation of trichocarpinene to produce trichocarpin, a major higher-order salicylate in P. trichocarpa. We identified UGTs putatively involved in lignan, flavonoid, salicylate, and phytohormone metabolism, with potential implications for cell wall biosynthesis, nitrogen uptake, and biotic and abiotic stress response that determine sustainable biomass crop production. Our results provide new support for in silico analyses and evidence-based guidance for in vivo functional characterization., Competing Interests: Author SG was employed by the Oak Ridge National Laboratory during the described research and is currently employed by the company Merck. 
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2023 Saint-Vincent, Furches, Galanie, Teixeira Prates, Aldridge, Labbe, Zhao, Martin, Ranjan, Jones, Kainer, Kalluri, Chen, Muchero, Jacobson and Tschaplinski.)
- Published
- 2023
15. Genetics of varicose veins reveals polygenic architecture and genetic overlap with arterial and venous disease.
- Author
- Levin MG, Huffman JE, Verma A, Sullivan KA, Rodriguez AA, Kainer D, Garvin MR, Lane M, Cashman M, Miller JI, Won H, Li B, Luo Y, Jarvik GP, Hakonarson H, Jasper EA, Bick AG, Tsao PS, Ritchie MD, Jacobson DA, Madduri RK, and Damrauer SM
- Abstract
Varicose veins represent a common cause of cardiovascular morbidity, with limited available medical therapies. Although varicose veins are heritable and epidemiologic studies have identified several candidate varicose vein risk factors, the molecular and genetic basis remains uncertain. Here we analyzed the contribution of common genetic variants to varicose veins using data from the Veterans Affairs Million Veteran Program and four other large biobanks. Among 49,765 individuals with varicose veins and 1,334,301 disease-free controls, we identified 139 risk loci. We identified genetic overlap between varicose veins, other vascular diseases and dozens of anthropometric factors. Using Mendelian randomization, we prioritized therapeutic targets via integration of proteomic and transcriptomic data. Finally, topological enrichment analyses confirmed the biologic roles of endothelial shear flow disruption, inflammation, vascular remodeling and angiogenesis. These findings may facilitate future efforts to develop nonsurgical therapies for varicose veins. (© 2023. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.)
- Published
- 2023
16. Exploring the role of plant lysin motif receptor-like kinases in regulating plant-microbe interactions in the bioenergy crop Populus.
- Author
- Cope KR, Prates ET, Miller JI, Demerdash ONA, Shah M, Kainer D, Cliff A, Sullivan KA, Cashman M, Lane M, Matthiadis A, Labbé J, Tschaplinski TJ, Jacobson DA, and Kalluri UC
- Abstract
For plants, distinguishing between mutualistic and pathogenic microbes is a matter of survival. All microbes contain microbe-associated molecular patterns (MAMPs) that are perceived by plant pattern recognition receptors (PRRs). Lysin motif receptor-like kinases (LysM-RLKs) are PRRs attuned for binding and triggering a response to specific MAMPs, including chitin oligomers (COs) in fungi, lipo-chitooligosaccharides (LCOs), which are produced by mycorrhizal fungi and nitrogen-fixing rhizobial bacteria, and peptidoglycan in bacteria. The identification and characterization of LysM-RLKs in candidate bioenergy crops including Populus are limited compared to other model plant species, thus inhibiting our ability to both understand and engineer microbe-mediated gains in plant productivity. As such, we performed a sequence analysis of LysM-RLKs in the Populus genome and predicted their function based on phylogenetic analysis with known LysM-RLKs. Then, using predictive models, molecular dynamics simulations, and comparative structural analysis with previously characterized CO and LCO plant receptors, we identified probable ligand-binding sites in Populus LysM-RLKs. Using several machine learning models, we predicted remarkably consistent binding affinity rankings of Populus proteins to CO. In addition, we used a modified Random Walk with Restart network-topology based approach to identify a subset of Populus LysM-RLKs that are functionally related and propose a corresponding signal transduction cascade. Our findings provide the first look into the role of LysM-RLKs in Populus-microbe interactions and establish a crucial jumping-off point for future research efforts to understand specificity and redundancy in microbial perception mechanisms. Competing Interests: None. The funding agency [DOE BER] had no involvement on the study design, data collection and analysis or interpretation of results reported here. (© 2023 The Authors.)
- Published
- 2022
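Illustrative note on result 16: the abstract mentions a modified Random Walk with Restart (RWR) over a network. The record does not describe the modification, so the sketch below shows only the standard, unmodified RWR iteration on a small undirected graph. The function name, the restart probability of 0.7, and the node labels are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-12, max_iter=1000):
    """Standard RWR scores (textbook form, not the paper's modified variant).

    adjacency: dict mapping every node to a list of its neighbours; each
               neighbour must itself be a key of the dict.
    seeds:     non-empty collection of start nodes present in the graph.
    Iterates p <- (1 - restart) * W p + restart * p0 until the L1 change
    drops below `tol`, where W spreads a node's mass evenly to neighbours.
    """
    nodes = list(adjacency)
    seeds = set(seeds)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            # A node without neighbours sends its walking mass back to the seeds.
            targets = neighbours if neighbours else list(seeds)
            share = (1.0 - restart) * p[n] / len(targets)
            for m in targets:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


if __name__ == "__main__":
    # Hypothetical four-node chain A - B - C - D, seeded at A.
    chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    for node, score in random_walk_with_restart(chain, {"A"}).items():
        print(node, round(score, 4))
```

Nodes closer to the seed receive higher scores, which is why RWR is commonly used to rank network neighbours by proximity to a set of genes of interest.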
17. Lipo-Chitooligosaccharides Induce Specialized Fungal Metabolite Profiles That Modulate Bacterial Growth.
- Author
- Rush TA, Tannous J, Lane MJ, Gopalakrishnan Meena M, Carrell AA, Golan JJ, Drott MT, Cottaz S, Fort S, Ané JM, Keller NP, Pelletier DA, Jacobson DA, Kainer D, Abraham PE, Giannone RJ, and Labbé JL
- Subjects
- Humans, Chitin, Oligosaccharides pharmacology, Chitosan pharmacology, Mycorrhizae
- Abstract
Lipo-chitooligosaccharides (LCOs) are historically known for their role as microbial-derived signaling molecules that shape plant symbiosis with beneficial rhizobia or mycorrhizal fungi. Recent studies showing that LCOs are widespread across the fungal kingdom have raised questions about the ecological function of these compounds in organisms that do not form symbiotic relationships with plants. To elucidate the ecological function of these compounds, we investigate the metabolomic response of the ubiquitous human pathogen Aspergillus fumigatus to LCOs. Our metabolomics data revealed that exogenous application of various types of LCOs to A. fumigatus resulted in significant shifts in the fungal metabolic profile, with marked changes in the production of specialized metabolites known to mediate ecological interactions. Using network analyses, we identify specific types of LCOs with the most significant effect on the abundance of known metabolites. Extracts of several LCO-induced metabolic profiles significantly impact the growth rates of diverse bacterial species. These findings suggest that LCOs may play an important role in the competitive dynamics of non-plant-symbiotic fungi and bacteria. This study identifies specific metabolomic profiles induced by these ubiquitously produced chemicals and creates a foundation for future studies into the potential roles of LCOs as modulators of interkingdom competition. IMPORTANCE The activation of silent biosynthetic gene clusters (BGC) for the identification and characterization of novel fungal secondary metabolites is a perpetual motion in natural product discoveries. Here, we demonstrated that one of the best-studied symbiosis signaling compounds, lipo-chitooligosaccharides (LCOs), play a role in activating some of these BGCs, resulting in the production of known, putative, and unknown metabolites with biological activities. 
This collection of metabolites induced by LCOs differentially modulate bacterial growth, while the LCO standards do not convey the same effect. These findings create a paradigm shift showing that LCOs have a more prominent role outside of host recognition of symbiotic microbes. Importantly, our work demonstrates that fungi use LCOs to produce a variety of metabolites with biological activity, which can be a potential source of bio-stimulants, pesticides, or pharmaceuticals.
- Published
- 2022
18. Structural variants identified using non-Mendelian inheritance patterns advance the mechanistic understanding of autism spectrum disorder.
- Author
- Kainer D, Templeton AR, Prates ET, Jacobson D, Allan ERO, Climer S, and Garvin MR
- Subjects
- Humans, Genome-Wide Association Study methods, Artificial Intelligence, Quantitative Trait Loci genetics, Inheritance Patterns genetics, Autism Spectrum Disorder genetics
- Abstract
The heritability of autism spectrum disorder (ASD), based on 680,000 families and five countries, is estimated to be nearly 80%, yet heritability estimates reported from SNP-based studies are consistently lower, and few significant loci have been identified with genome-wide association studies. This gap in genomic information may reside in rare variants, interaction among variants (epistasis), or cryptic structural variation (SV) and may provide mechanisms that underlie ASD. Here we use a method to identify potential SVs based on non-Mendelian inheritance patterns in pedigrees using parent-child genotypes from ASD families and demonstrate that they are enriched in ASD-risk genes. Most are in non-coding genic space and are over-represented in expression quantitative trait loci, suggesting that they affect gene regulation, which we confirm with their overlap of differentially expressed genes in postmortem brain tissue of ASD individuals. We then identify an SV in the GRIK2 gene that alters RNA splicing and a regulatory region of the ACMSD gene in the kynurenine pathway as significantly associated with a non-verbal ASD phenotype, supporting our hypothesis that these currently excluded loci can provide a clearer mechanistic understanding of ASD. Finally, we use an explainable artificial intelligence approach to define subgroups demonstrating their use in the context of precision medicine. Competing Interests: M.R.G. is owner of Williwaw Biosciences, LLC, which has filed a patent on the use of non-Mendelian inheritance to detect genomic structural variants. (© 2022 Oak Ridge National Laboratory, The Author(s).)
- Published
- 2022
19. Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data.
- Author
- Walker AM, Cliff A, Romero J, Shah MB, Jones P, Felipe Machado Gazolla JG, Jacobson DA, and Kainer D
- Abstract
Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices of gene expression data. Random Forest-Leave One Out Prediction (RF-LOOP) is a method that has been shown to be efficient at producing these gene-to-gene networks, frequently known as GEne Network Inference with Ensemble of trees (GENIE3). Random Forest can be replaced in this process by iterative Random Forest (iRF), which performs variable selection and boosting. Here we validate that iterative Random Forest-Leave One Out Prediction (iRF-LOOP) produces higher quality networks than GENIE3 (RF-LOOP). We use both synthetic and empirical networks from the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges by Sage Bionetworks, as well as two additional empirical networks created from Arabidopsis thaliana and Populus trichocarpa expression data. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (© 2022 The Author(s).)
- Published
- 2022
20. The Genetic Architecture of Nitrogen Use Efficiency in Switchgrass (Panicum virgatum L.).
- Author
- Shrestha V, Chhetri HB, Kainer D, Xu Y, Hamilton L, Piasecki C, Wolfe B, Wang X, Saha M, Jacobson D, Millwood RJ, Mazarei M, and Stewart CN Jr
- Abstract
Switchgrass (Panicum virgatum L.) has immense potential as a bioenergy crop with the aim of producing biofuel as an end goal. Nitrogen (N)-related sustainability traits, such as nitrogen use efficiency (NUE) and nitrogen remobilization efficiency (NRE), are important factors affecting switchgrass quality and productivity. Hence, it is imperative to develop nitrogen use-efficient switchgrass accessions by exploring the genetic basis of NUE in switchgrass. For that, we used 331 diverse field-grown switchgrass accessions planted under low and moderate N fertility treatments. We performed a genome wide association study (GWAS) in a holistic manner where we not only considered NUE as a single trait but also used its related phenotypic traits, such as total dry biomass at low N and moderate N, and nitrogen use index, such as NRE. We have evaluated the phenotypic characterization of the NUE and the related traits, highlighted their relationship using correlation analysis, and identified the top ten nitrogen use-efficient switchgrass accessions. Our GWAS analysis identified 19 unique single nucleotide polymorphisms (SNPs) and 32 candidate genes. Two promising GWAS candidate genes, caffeoyl-CoA O-methyltransferase (CCoAOMT) and alfin-like 6 (AL6), were further supported by linkage disequilibrium (LD) analysis. Finally, we discussed the potential role of nitrogen in modulating the expression of these two genes. Our findings have opened avenues for the development of improved nitrogen use-efficient switchgrass lines. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2022 Shrestha, Chhetri, Kainer, Xu, Hamilton, Piasecki, Wolfe, Wang, Saha, Jacobson, Millwood, Mazarei and Stewart.)
- Published
- 2022
21. Characterization of terpene biosynthesis in Melaleuca quinquenervia and ecological consequences of terpene accumulation during myrtle rust infection.
- Author
- Hsieh JF, Krause ST, Kainer D, Degenhardt J, Foley WJ, and Külheim C
- Abstract
Plants use a wide array of secondary metabolites including terpenes as defense against herbivore and pathogen attack, which can be constitutively expressed or induced. Here, we investigated aspects of the chemical and molecular basis of resistance against the exotic rust fungus Austropuccinia psidii in Melaleuca quinquenervia, with a focus on terpenes. Foliar terpenes of resistant and susceptible plants were quantified, and we assessed whether chemotypic variation contributed to resistance to infection by A. psidii. We found that chemotypes did not contribute to the resistance and susceptibility of M. quinquenervia. However, in one of the chemotypes (Chemotype 2), susceptible plants showed higher concentrations of several terpenes including α-pinene, limonene, 1,8-cineole, and viridiflorol compared with resistant plants. Transcriptome profiling of these plants showed that several TPS genes were strongly induced in response to infection by A. psidii. Functional characterization of these TPS showed them to be mono- and sesquiterpene synthases producing compounds including 1,8-cineole, β-caryophyllene, viridiflorol and nerolidol. The expression of these TPS genes correlated with metabolite data in a susceptible plant. These results suggest the complexity of resistance mechanism regulated by M. quinquenervia and that modulation of terpenes may be one of the components that contribute to resistance against A. psidii. Competing Interests: The authors declare that there is no conflict of interest. (© 2021 The Authors. Plant‐Environment Interactions published by John Wiley & Sons Ltd and New Phytologist Foundation.)
- Published
- 2021
22. Potentially adaptive SARS-CoV-2 mutations discovered with novel spatiotemporal and explainable AI models.
- Author
-
Garvin MR, T Prates E, Pavicic M, Jones P, Amos BK, Geiger A, Shah MB, Streich J, Felipe Machado Gazolla JG, Kainer D, Cliff A, Romero J, Keith N, Brown JB, and Jacobson D
- Subjects
- Artificial Intelligence, Genome, Viral, Haplotypes, Mutation, Selection, Genetic, Adaptation, Biological, Evolution, Molecular, Models, Genetic, SARS-CoV-2 genetics, Viral Proteins genetics
- Abstract
Background: A mechanistic understanding of the spread of SARS-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission. Large numbers of available sequences and their dates of transmission provide an unprecedented opportunity to analyze evolutionary adaptation in novel ways. Addition of high-resolution structural information can reveal the functional basis of these processes at the molecular level. Integrated systems biology-directed analyses of these data layers afford valuable insights to build a global understanding of the COVID-19 pandemic., Results: Here we identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp614Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro323Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model also suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Finally, our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that also displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus., Conclusions: These results provide valuable insights for the development of drugs and surveillance strategies to combat the current and future pandemics.
- Published
- 2020
23. Genome-Wide Association Study of Wood Anatomical and Morphological Traits in Populus trichocarpa.
- Author
-
Chhetri HB, Furches A, Macaya-Sanz D, Walker AR, Kainer D, Jones P, Harman-Ware AE, Tschaplinski TJ, Jacobson D, Tuskan GA, and DiFazio SP
- Abstract
To understand the genetic mechanisms underlying wood anatomical and morphological traits in Populus trichocarpa, we used 869 unrelated genotypes from a common garden in Clatskanie, Oregon that were previously collected from across the distribution range in western North America. Using GEMMA mixed model analysis, we tested for the association of 25 phenotypic traits and nine multitrait combinations with 6.741 million SNPs covering the entire genome. Broad-sense trait heritabilities ranged from 0.117 to 0.477. Most traits were significantly correlated with geoclimatic variables suggesting a role of climate and geography in shaping the variation of this species. Fifty-seven SNPs from single trait GWAS and 11 SNPs from multitrait GWAS passed an FDR threshold of 0.05, leading to the identification of eight and seven nearby candidate genes, respectively. The percentage of phenotypic variance explained (PVE) by the significant SNPs for both single and multitrait GWAS ranged from 0.01% to 6.18%. To further evaluate the potential roles of candidate genes, we used a multi-omic network containing five additional data sets, including leaf and wood metabolite GWAS layers and coexpression and comethylation networks. We also performed a functional enrichment analysis on coexpression nearest neighbors for each gene model identified by the wood anatomical and morphological trait GWAS analyses. Genes affecting cell wall composition and transport related genes were enriched in wood anatomy and stomatal density trait networks. Signaling and metabolism related genes were also common in networks for stomatal density. For leaf morphology traits (leaf dry and wet weight) the networks were significantly enriched for GO terms related to photosynthetic processes as well as cellular homeostasis. The identified genes provide further insights into the genetic control of these traits, which are important determinants of the suitability and sustainability of improved genotypes for lignocellulosic biofuel production., (Copyright © 2020 Chhetri, Furches, Macaya-Sanz, Walker, Kainer, Jones, Harman-Ware, Tschaplinski, Jacobson, Tuskan and DiFazio.)
- Published
- 2020
24. A phylogenomic approach reveals a low somatic mutation rate in a long-lived plant.
- Author
-
Orr AJ, Padovan A, Kainer D, Külheim C, Bromham L, Bustos-Segura C, Foley W, Haff T, Hsieh JF, Morales-Suarez A, Cartwright RA, and Lanfear R
- Subjects
- Phylogeny, Plant Physiological Phenomena, Arabidopsis physiology, Mutation Rate
- Abstract
Somatic mutations can have important effects on the life history, ecology, and evolution of plants, but the rate at which they accumulate is poorly understood and difficult to measure directly. Here, we develop a method to measure somatic mutations in individual plants and use it to estimate the somatic mutation rate in a large, long-lived, phenotypically mosaic Eucalyptus melliodora tree. Despite being 100 times larger than Arabidopsis, this tree has a per-generation mutation rate only ten times greater, which suggests that this species may have evolved mechanisms to reduce the mutation rate per unit of growth. This adds to a growing body of evidence that illuminates the correlated evolutionary shifts in mutation rate and life history in plants.
- Published
- 2020
25. Can exascale computing and explainable artificial intelligence applied to plant biology deliver on the United Nations sustainable development goals?
- Author
-
Streich J, Romero J, Gazolla JGFM, Kainer D, Cliff A, Prates ET, Brown JB, Khoury S, Tuskan GA, Garvin M, Jacobson D, and Harfouche AL
- Subjects
- Agriculture, Goals, Humans, United Nations, Artificial Intelligence, Sustainable Development
- Abstract
Human population growth and accelerated climate change necessitate agricultural improvements using designer crop ideotypes (idealized plants that can grow in niche environments). Diverse and highly skilled research groups must integrate efforts to bridge the gaps needed to achieve international goals toward sustainable agriculture. Given the scale of global agricultural needs and the breadth of multiple types of omics data needed to optimize these efforts, explainable artificial intelligence (AI with a decipherable decision making process that provides a meaningful explanation to humans) and exascale computing (computers that can perform 10^18 floating-point operations per second, or exaflops) are crucial. Accurate phenotyping and daily-resolution climatype associations are equally important for refining ideotype production to specific environments at various levels of granularity. We review advances toward tackling technological hurdles to solve multiple United Nations Sustainable Development Goals and discuss a vision to overcome gaps between research and policy., (Copyright © 2020 Elsevier Ltd. All rights reserved.)
- Published
- 2020
26. The draft nuclear genome assembly of Eucalyptus pauciflora: a pipeline for comparing de novo assemblies.
- Author
-
Wang W, Das A, Kainer D, Schalamun M, Morales-Suarez A, Schwessinger B, and Lanfear R
- Subjects
- DNA Contamination, Genome Size, Computational Biology methods, Eucalyptus genetics, Genome, Plant, Genomics methods
- Abstract
Background: Eucalyptus pauciflora (the snow gum) is a long-lived tree with high economic and ecological importance. Currently, little genomic information for E. pauciflora is available. Here, we sequentially assemble the genome of Eucalyptus pauciflora with different methods, and combine multiple existing and novel approaches to help to select the best genome assembly., Findings: We generated high coverage of long- (Nanopore, 174×) and short- (Illumina, 228×) read data from a single E. pauciflora individual and compared assemblies from 5 assemblers (Canu, SMARTdenovo, Flye, Marvel, and MaSuRCA) with different read lengths (1 and 35 kb minimum read length). A key component of our approach is to keep a randomly selected collection of ∼10% of both long and short reads separated from the assemblies to use as a validation set for assessing assemblies. Using this validation set along with a range of existing tools, we compared the assemblies in 8 ways: contig N50, BUSCO scores, LAI (long terminal repeat assembly index) scores, assembly ploidy, base-level error rate, CGAL (computing genome assembly likelihoods) scores, structural variation, and genome sequence similarity. Our result showed that MaSuRCA generated the best assembly, which is 594.87 Mb in size, with a contig N50 of 3.23 Mb, and an estimated error rate of ∼0.006 errors per base., Conclusions: We report a draft genome of E. pauciflora, which will be a valuable resource for further genomic studies of eucalypts. The approaches for assessing and comparing genomes should help in assessing and choosing among many potential genome assemblies from a single dataset., (© The Author(s) 2020. Published by Oxford University Press.)
- Published
- 2020
27. A High-Performance Computing Implementation of Iterative Random Forest for the Creation of Predictive Expression Networks.
- Author
-
Cliff A, Romero J, Kainer D, Walker A, Furches A, and Jacobson D
- Subjects
- Computational Biology, Algorithms, Computer Simulation, Models, Genetic, Quantitative Trait Loci
- Abstract
As time progresses and technology improves, biological data sets are continuously increasing in size. New methods and new implementations of existing methods are needed to keep pace with this increase. In this paper, we present a high-performance computing (HPC)-capable implementation of Iterative Random Forest (iRF). This new implementation enables the explainable-AI eQTL analysis of SNP sets with over a million SNPs. Using this implementation, we also present a new method, iRF Leave One Out Prediction (iRF-LOOP), for the creation of Predictive Expression Networks on the order of 40,000 genes or more. We compare the new implementation of iRF with the previous R version and analyze its time to completion on two of the world's fastest supercomputers, Summit and Titan. We also show iRF-LOOP's ability to capture biologically significant results when creating Predictive Expression Networks. This new implementation of iRF will enable the analysis of biological data sets at scales that were previously not possible.
- Published
- 2019
28. Accelerating Climate Resilient Plant Breeding by Applying Next-Generation Artificial Intelligence.
- Author
-
Harfouche AL, Jacobson DA, Kainer D, Romero JC, Harfouche AH, Scarascia Mugnozza G, Moshelion M, Tuskan GA, Keurentjes JJB, and Altman A
- Subjects
- Artificial Intelligence, Biomass, Climate, Climate Change, Ecosystem, Genomics methods, Genotype, Humans, Phenomics methods, Phenotype, Crops, Agricultural genetics, Plant Breeding methods
- Abstract
Breeding crops for high yield and superior adaptability to new and variable climates is imperative to ensure continued food security, biomass production, and ecosystem services. Advances in genomics and phenomics are delivering insights into the complex biological mechanisms that underlie plant functions in response to environmental perturbations. However, linking genotype to phenotype remains a huge challenge and is hampering the optimal application of high-throughput genomics and phenomics to advanced breeding. Critical to success is the need to assimilate large amounts of data into biologically meaningful interpretations. Here, we present the current state of genomics and field phenomics, explore emerging approaches and challenges for multiomics big data integration by means of next-generation (Next-Gen) artificial intelligence (AI), and propose a workable path to improvement., (Copyright © 2019 Elsevier Ltd. All rights reserved.)
- Published
- 2019
29. Finding New Cell Wall Regulatory Genes in Populus trichocarpa Using Multiple Lines of Evidence.
- Author
-
Furches A, Kainer D, Weighill D, Large A, Jones P, Walker AM, Romero J, Gazolla JGFM, Joubert W, Shah M, Streich J, Ranjan P, Schmutz J, Sreedasyam A, Macaya-Sanz D, Zhao N, Martin MZ, Rao X, Dixon RA, DiFazio S, Tschaplinski TJ, Chen JG, Tuskan GA, and Jacobson D
- Abstract
Understanding the regulatory network controlling cell wall biosynthesis is of great interest in Populus trichocarpa, both because of its status as a model woody perennial and its importance for lignocellulosic products. We searched for genes with putatively unknown roles in regulating cell wall biosynthesis using an extended network-based Lines of Evidence (LOE) pipeline to combine multiple omics data sets in P. trichocarpa, including gene coexpression, gene comethylation, population level pairwise SNP correlations, and two distinct SNP-metabolite Genome Wide Association Study (GWAS) layers. By incorporating validation, ranking, and filtering approaches we produced a list of nine high priority gene candidates for involvement in the regulation of cell wall biosynthesis. We subsequently performed a detailed investigation of candidate gene GROWTH-REGULATING FACTOR 9 (PtGRF9). To investigate the role of PtGRF9 in regulating cell wall biosynthesis, we assessed the genome-wide connections of PtGRF9 and a paralog across data layers with functional enrichment analyses, predictive transcription factor binding site analysis, and an independent comparison to eQTN data. Our findings indicate that PtGRF9 likely affects the cell wall by directly repressing genes involved in cell wall biosynthesis, such as PtCCoAOMT and PtMYB.41, and indirectly by regulating homeobox genes. Furthermore, evidence suggests that PtGRF9 paralogs may act as transcriptional co-regulators that direct the global energy usage of the plant. Using our extended pipeline, we show multiple lines of evidence implicating the involvement of these genes in cell wall regulatory functions and demonstrate the value of this method for prioritizing candidate genes for experimental validation., (Copyright © 2019 Furches, Kainer, Weighill, Large, Jones, Walker, Romero, Gazolla, Joubert, Shah, Streich, Ranjan, Schmutz, Sreedasyam, Macaya-Sanz, Zhao, Martin, Rao, Dixon, DiFazio, Tschaplinski, Chen, Tuskan and Jacobson.)
- Published
- 2019
30. High marker density GWAS provides novel insights into the genomic architecture of terpene oil yield in Eucalyptus.
- Author
-
Kainer D, Padovan A, Degenhardt J, Krause S, Mondal P, Foley WJ, and Külheim C
- Subjects
- Alkyl and Aryl Transferases genetics, Biosynthetic Pathways, Genes, Plant, Genotype, Inheritance Patterns genetics, Multivariate Analysis, Polymorphism, Single Nucleotide genetics, Quantitative Trait Loci genetics, Reproducibility of Results, Terpenes chemistry, Eucalyptus genetics, Genome, Plant, Genome-Wide Association Study, Plant Oils metabolism, Terpenes metabolism
- Abstract
Terpenoid-based essential oils are economically important commodities, yet beyond their biosynthetic pathways, little is known about the genetic architecture of terpene oil yield from plants. Transport, storage, evaporative loss, transcriptional regulation and precursor competition may be important contributors to this complex trait. Here, we associate 2.39 million single nucleotide polymorphisms derived from shallow whole-genome sequencing of 468 Eucalyptus polybractea individuals with 12 traits related to the overall terpene yield, eight direct measures of terpene concentration and four biomass-related traits. Our results show that in addition to terpene biosynthesis, development of secretory cavities, where terpenes are both synthesized and stored, and transport of terpenes were important components of terpene yield. For sesquiterpene concentrations, the availability of precursors in the cytosol was important. Candidate terpene synthase genes for the production of 1,8-cineole and α-pinene, and β-pinene (which comprised > 80% of the total terpenes) were functionally characterized as a 1,8-cineole synthase and a β/α-pinene synthase. Our results provide novel insights into the genomic architecture of terpene yield and we provide candidate genes for breeding or engineering of crops for biofuels or the production of industrially valuable terpenes., (No claim to US Government works New Phytologist © 2019 New Phytologist Trust.)
- Published
- 2019
31. Multitrait genome-wide association analysis of Populus trichocarpa identifies key polymorphisms controlling morphological and physiological traits.
- Author
-
Chhetri HB, Macaya-Sanz D, Kainer D, Biswal AK, Evans LM, Chen JG, Collins C, Hunt K, Mohanty SS, Rosenstiel T, Ryno D, Winkeler K, Yang X, Jacobson D, Mohnen D, Muchero W, Strauss SH, Tschaplinski TJ, Tuskan GA, and DiFazio SP
- Subjects
- Down-Regulation, Gene Regulatory Networks, Genes, Plant, Genotype, Geography, Inheritance Patterns genetics, Multivariate Analysis, Plant Stomata physiology, Populus anatomy & histology, Principal Component Analysis, Genome-Wide Association Study, Polymorphism, Single Nucleotide genetics, Populus genetics, Populus physiology, Quantitative Trait, Heritable
- Abstract
Genome-wide association studies (GWAS) have great promise for identifying the loci that contribute to adaptive variation, but the complex genetic architecture of many quantitative traits presents a substantial challenge. We measured 14 morphological and physiological traits and identified single nucleotide polymorphism (SNP)-phenotype associations in a Populus trichocarpa population distributed from California, USA to British Columbia, Canada. We used whole-genome resequencing data of 882 trees with more than 6.78 million SNPs, coupled with multitrait association to detect polymorphisms with potentially pleiotropic effects. Candidate genes were validated with functional data. Broad-sense heritability (H^2) ranged from 0.30 to 0.56 for morphological traits and 0.08 to 0.36 for physiological traits. In total, 4 and 20 gene models were detected using the single-trait and multitrait association methods, respectively. Several of these associations were corroborated by additional lines of evidence, including co-expression networks, metabolite analyses, and direct confirmation of gene function through RNAi. Multitrait association identified many more significant associations than single-trait association, potentially revealing pleiotropic effects of individual genes. This approach can be particularly useful for challenging physiological traits such as water-use efficiency or complex traits such as leaf morphology, for which we were able to identify credible candidate genes by combining multitrait association with gene co-expression and co-methylation data., (No claim to US Government works New Phytologist © 2019 New Phytologist Trust.)
- Published
- 2019
32. Harnessing the MinION: An example of how to establish long-read sequencing in a laboratory using challenging plant tissue from Eucalyptus pauciflora.
- Author
-
Schalamun M, Nagar R, Kainer D, Beavan E, Eccles D, Rathjen JP, Lanfear R, and Schwessinger B
- Subjects
- DNA, Plant chemistry, DNA, Plant genetics, Workflow, DNA, Plant isolation & purification, Eucalyptus genetics, Sequence Analysis, DNA methods
- Abstract
Long-read sequencing technologies are transforming our ability to assemble highly complex genomes. Realizing their full potential is critically reliant on extracting high-quality, high-molecular-weight (HMW) DNA from the organisms of interest. This is especially the case for the portable MinION sequencer which enables all laboratories to undertake their own genome sequencing projects, due to its low entry cost and minimal spatial footprint. One challenge of the MinION is that each group has to independently establish effective protocols for using the instrument, which can be time-consuming and costly. Here, we present a workflow and protocols that enabled us to establish MinION sequencing in our own laboratories, based on optimizing DNA extraction from a challenging plant tissue as a case study. Following the workflow illustrated, we were able to reliably and repeatedly obtain >6.5 Gb of long-read sequencing data with a mean read length of 13 kb and an N50 of 26 kb. Our protocols are open source and can be performed in any laboratory without special equipment. We also illustrate some more elaborate workflows which can increase mean and average read lengths if this is desired. We envision that our workflow for establishing MinION sequencing, including the illustration of potential pitfalls and suggestions of how to adapt it to other tissue types, will be useful to others who plan to establish long-read sequencing in their own laboratories., (© 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.)
- Published
- 2019
33. Assembly of chloroplast genomes with long- and short-read data: a comparison of approaches using Eucalyptus pauciflora as a test case.
- Author
-
Wang W, Schalamun M, Morales-Suarez A, Kainer D, Schwessinger B, and Lanfear R
- Subjects
- Eucalyptus genetics, High-Throughput Nucleotide Sequencing methods, Sequence Analysis, DNA methods, Chloroplasts genetics, Genome, Chloroplast, Inverted Repeat Sequences
- Abstract
Background: Chloroplasts are organelles that conduct photosynthesis in plant and algal cells. The information contained in chloroplast genomes is widely used in agriculture and studies of evolution and ecology. Correctly assembling chloroplast genomes can be challenging because the chloroplast genome contains a pair of long inverted repeats (10-30 kb). Typically, it is simply assumed that the gross structure of the chloroplast genome matches the most commonly observed structure of two single-copy regions separated by a pair of inverted repeats. The advent of long-read sequencing technologies should remove the need to make this assumption by providing sufficient information to completely span the inverted repeat regions. Yet, long-reads tend to have higher error rates than short-reads, and relatively little is known about the best way to combine long- and short-reads to obtain the most accurate chloroplast genome assemblies. Using Eucalyptus pauciflora, the snow gum, as a test case, we evaluated the effect of multiple parameters, such as different coverage of long- (Oxford Nanopore) and short- (Illumina) reads, different long-read lengths, and different assembly pipelines, with a view to determining the most accurate and efficient approach to chloroplast genome assembly., Results: Hybrid assemblies combining at least 20x coverage of both long-reads and short-reads generated a single contig spanning the entire chloroplast genome with few or no detectable errors. Short-read-only assemblies generated three contigs (the long single copy, short single copy and inverted repeat regions) of the chloroplast genome. These contigs contained few single-base errors but tended to exclude several bases at the beginning or end of each contig. Long-read-only assemblies tended to create multiple contigs with a much higher single-base error rate. The chloroplast genome of Eucalyptus pauciflora is 159,942 bp and contains 131 genes of known function., Conclusions: Our results suggest that very accurate assemblies of chloroplast genomes can be achieved using a combination of at least 20x coverage of long- and short-reads respectively, provided that the long-reads contain at least ~5x coverage of reads longer than the inverted repeat region. We show that further increases in coverage give little or no improvement in accuracy, and that hybrid assemblies are more accurate than long-read-only or short-read-only assemblies.
- Published
- 2018
34. Accuracy of Genomic Prediction for Foliar Terpene Traits in Eucalyptus polybractea.
- Author
-
Kainer D, Stone EA, Padovan A, Foley WJ, and Külheim C
- Subjects
- Algorithms, Biomass, Eucalyptus growth & development, Genotyping Techniques methods, Genotyping Techniques standards, Plant Breeding methods, Plant Breeding standards, Plant Leaves genetics, Plant Leaves metabolism, Quantitative Trait, Heritable, Eucalyptus genetics, Genome, Plant, Oils, Volatile metabolism, Quantitative Trait Loci, Terpenes metabolism
- Abstract
Unlike agricultural crops, most forest species have not had millennia of improvement through phenotypic selection, but can contribute energy and material resources and possibly help alleviate climate change. Yield gains similar to those achieved in agricultural crops over millennia could be made in forestry species with the use of genomic methods in a much shorter time frame. Here we compare various methods of genomic prediction for eight traits related to foliar terpene yield in Eucalyptus polybractea, a tree grown predominantly for the production of Eucalyptus oil. The genomic markers used in this study are derived from shallow whole genome sequencing of a population of 480 trees. We compare the traditional pedigree-based additive best linear unbiased predictors (ABLUP), genomic BLUP (GBLUP), BayesB genomic prediction model, and a form of GBLUP based on weighting markers according to their influence on traits (BLUP|GA). Predictive ability is assessed under varying marker densities of 10,000, 100,000 and 500,000 SNPs. Our results show that BayesB and BLUP|GA perform best across the eight traits. Predictive ability was higher for individual terpene traits, such as foliar α-pinene and 1,8-cineole concentration (0.59 and 0.73, respectively), than aggregate traits such as total foliar oil concentration (0.38). This is likely a function of the trait architecture and markers used. BLUP|GA was the best model for the two biomass related traits, height and 1 year change in height (0.25 and 0.19, respectively). Predictive ability increased with marker density for most traits, but with diminishing returns. The results of this study are a solid foundation for yield improvement of essential oil producing eucalypts. New markets such as biopolymers and terpene-derived biofuels could benefit from rapid yield increases in undomesticated oil-producing species., (Copyright © 2018 Kainer et al.)
- Published
- 2018
35. Transcriptome analysis of terpene chemotypes of Melaleuca alternifolia across different tissues.
- Author
-
Bustos-Segura C, Padovan A, Kainer D, Foley WJ, and Külheim C
- Subjects
- Alkyl and Aryl Transferases genetics, Alkyl and Aryl Transferases metabolism, Australia, Cluster Analysis, Gene Expression Regulation, Plant, Genes, Plant, Geography, Least-Squares Analysis, Transcription Factors metabolism, Gene Expression Profiling, Melaleuca genetics, Organ Specificity genetics, Terpenes metabolism
- Abstract
Plant chemotypes or chemical polymorphisms are defined by discrete variation in secondary metabolites within a species. This variation can have consequences for ecological interactions or the human use of plants. Understanding the molecular basis of chemotypic variation can help to explain how variation of plant secondary metabolites is controlled. We explored the transcriptomes of the 3 cardinal terpene chemotypes of Melaleuca alternifolia in young leaves, mature leaves, and stem and compared transcript abundance to variation in the constitutive profile of terpenes. Leaves from chemotype 1 plants (dominated by terpinen-4-ol) show a similar pattern of gene expression when compared to chemotype 5 plants (dominated by 1,8-cineole). Only terpene synthases in young leaves were differentially expressed between these chemotypes, supporting the idea that terpenes are mainly synthesized in young tissue. Chemotype 2 plants (dominated by terpinolene) show a greater degree of differential gene expression compared to the other chemotypes, which might be related to the isolation of plant populations that exhibit this chemotype and the possibility that the terpinolene synthase gene in M. alternifolia was derived by introgression from a closely related species, Melaleuca trichostachya. By using multivariate analyses, we were able to associate terpenes with candidate terpene synthases., (© 2017 John Wiley & Sons Ltd.)
- Published
- 2017
36. Plant-Derived Terpenes: A Feedstock for Specialty Biofuels.
- Author
-
Mewalal R, Rai DK, Kainer D, Chen F, Külheim C, Peter GF, and Tuskan GA
- Subjects
- Bioengineering, Biofuels, Plants chemistry, Terpenes chemistry, Terpenes metabolism
- Abstract
Research toward renewable and sustainable energy has identified specific terpenes capable of supplementing or replacing current petroleum-derived fuels. Despite being naturally produced and stored by many plants, there are few examples of commercial recovery of terpenes from plants because of low yields. Plant terpene biosynthesis is regulated at multiple levels, leading to wide variability in terpene content and chemistry. Advances in the plant molecular toolkit, including annotated genomes, high-throughput omics profiling, and genome editing, have begun to elucidate plant terpene metabolism, and such information is useful for bioengineering metabolic pathways for specific terpenes. We review here the status of terpenes as a specialty biofuel and discuss the potential of plants as a viable agronomic solution for future terpene-derived biofuels., (Copyright © 2016 Elsevier Ltd. All rights reserved.)
- Published
- 2017
37. Genomic approaches to selection in outcrossing perennials: focus on essential oil crops.
- Author
-
Kainer D, Lanfear R, Foley WJ, and Külheim C
- Subjects
- Crops, Agricultural chemistry, Eucalyptus, Genetic Markers, Genotype, Humulus, Linkage Disequilibrium, Melaleuca, Phenotype, Quantitative Trait Loci, Crops, Agricultural genetics, Oils, Volatile chemistry, Plant Breeding methods, Selection, Genetic
- Abstract
The yield of essential oil in commercially harvested perennial species (e.g. 'Oil Mallee' eucalypts, Tea Trees and Hop) is dependent on complex quantitative traits such as foliar oil concentration, biomass and adaptability. These often show large natural variation and some are highly heritable, which has enabled significant gains in oil yield via traditional phenotypic recurrent selection. Analysis of transcript abundance and allelic diversity has revealed that essential oil yield is likely to be controlled by large numbers of quantitative trait loci that range from a few of medium/large effect to many of small effect. Molecular breeding techniques that exploit this information could increase gains per unit time and address complications of traditional breeding such as genetic correlations between key traits and the lower heritability of biomass. Genomic selection (GS) is a technique that uses the information from markers genotyped across the whole genome in order to predict the phenotype of progeny well before they reach maturity, allowing selection at an earlier age. In this review, we investigate the feasibility of genomic selection (GS) for the improvement of essential oil yield. We explore the challenges facing breeders selecting for oil yield, and how GS might deal with them. We then assess the factors that affect the accuracy of genomic estimated breeding values, such as linkage disequilibrium (LD), heritability, relatedness and the genetic architecture of desirable traits. We conclude that GS has the potential to significantly improve the efficiency of selection for essential oil yield.
- Published
- 2015
- Full Text
- View/download PDF
38. The effects of partitioning on phylogenetic inference.
- Author
- Kainer D and Lanfear R
- Subjects
- Databases, Genetic, Empirical Research, Evolution, Molecular, Classification methods, Models, Genetic, Phylogeny
- Abstract
Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice. (© The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2015
- Full Text
- View/download PDF
39. Selecting optimal partitioning schemes for phylogenomic datasets.
- Author
- Lanfear R, Calcott B, Kainer D, Mayer C, and Stamatakis A
- Subjects
- Algorithms, Cluster Analysis, Evolution, Molecular, Genomics, Phylogeny, Software
- Abstract
Background: Partitioning involves estimating independent models of molecular evolution for different subsets of sites in a sequence alignment, and has been shown to improve phylogenetic inference. Current methods for estimating best-fit partitioning schemes, however, are only computationally feasible with datasets of fewer than 100 loci. This is a problem because datasets with thousands of loci are increasingly common in phylogenetics. Methods: We develop two novel methods for estimating best-fit partitioning schemes on large phylogenomic datasets: strict and relaxed hierarchical clustering. These methods use information from the underlying data to cluster together similar subsets of sites in an alignment, and build on clustering approaches that have been proposed elsewhere. Results: We compare the performance of our methods to each other, and to existing methods for selecting partitioning schemes. We demonstrate that while strict hierarchical clustering has the best computational efficiency on very large datasets, relaxed hierarchical clustering provides scalable efficiency and returns dramatically better partitioning schemes as assessed by common criteria such as AICc and BIC scores. Conclusions: These two methods provide the best current approaches to inferring partitioning schemes for very large datasets. We provide free open-source implementations of the methods in the PartitionFinder software. We hope that the use of these methods will help to improve the inferences made from large phylogenomic datasets.
- Published
- 2014
- Full Text
- View/download PDF
40. Cyclophilin B expression in renal proximal tubules of hypertensive rats.
- Author
- Kainer DB and Doris PA
- Subjects
- Animals, Base Sequence, Gene Expression Regulation, Immunophilins genetics, Molecular Sequence Data, Peptidylprolyl Isomerase, Rats, Rats, Inbred SHR, Rats, Inbred WKY, Rats, Sprague-Dawley, Up-Regulation, Cyclophilins, Hypertension metabolism, Immunophilins biosynthesis, Kidney Tubules, Proximal metabolism
- Abstract
Rat cyclophilin-like protein (Cy-LP) is a candidate hypertension gene initially identified by differential hybridization and implicated in renal mechanisms of salt retention and high blood pressure. We report the molecular characterization of rat cyclophilin B (CypB) and demonstrate, through sequence analysis and an allele-specific polymerase chain reaction primer assay, that CypB but not Cy-LP is expressed in rat kidney. CypB is an endoplasmic reticulum-localized prolyl-isomerase that interacts with elongation initiation factor 2-beta, an important regulator of protein translation and a central component of the endoplasmic reticulum stress response to hypoxia or ATP depletion. Active renal transport of sodium is increased in the spontaneously hypertensive rat (SHR), and there is evidence that this coincides with hypoxia and ATP depletion in the renal cortex. In the present studies we have examined expression of CypB in rat proximal tubules, which contributes to the increased renal sodium reabsorption in this model of hypertension. We report that CypB transcript abundance is significantly elevated in proximal convoluted tubules from SHR compared with the control Wistar-Kyoto strain. This upregulation occurs in weanling animals and precedes the development of hypertension, indicating that it is not a simple response to hypertension in SHR. Further, CypB expression is also higher in a proximal tubule cell line derived from SHR compared with a similar line derived from Wistar-Kyoto rats, indicating that this difference is genetically determined. No sequence differences were observed in the CypB cDNA from these 2 strains. These observations suggest that a genetically determined alteration in proximal tubules from SHR occurs that leads to increased expression of CypB. In view of evidence linking CypB to the regulation of elongation initiation factor-2, the upregulation of CypB may result from metabolic stress.
- Published
- 2000
- Full Text
- View/download PDF