81 results for "Duitama J"
Search Results
2. EP38.06: Variability in the diagnosis and management of fetal growth restriction across Latin America and the Caribbean
- Author
- Miranda, J., primary, Martinez‐Portilla, R.J., additional, Parra‐Cordero, M., additional, Ximenes, R., additional, Cafici, D., additional, Abdalla, J., additional, Duitama, J., additional, and Cortes, M. Sanz, additional
- Published
- 2022
3. Characterization of the recombinant Brettanomyces anomalus β-glucosidase and its potential for bioflavouring
- Author
- Vervoort, Y., primary, Herrera-Malaver, B., additional, Mertens, S., additional, Guadalupe Medina, V., additional, Duitama, J., additional, Michiels, L., additional, Derdelinckx, G., additional, Voordeckers, K., additional, and Verstrepen, K.J., additional
- Published
- 2016
4. Comparative Analysis of Two Emerging Rice Seed Bacterial Pathogens
- Author
- Fory, P. A., primary, Triplett, L., additional, Ballen, C., additional, Abello, J. F., additional, Duitama, J., additional, Aricapa, M. G., additional, Prado, G. A., additional, Correa, F., additional, Hamilton, J., additional, Leach, J. E., additional, Tohme, J., additional, and Mosquera, G. M., additional
- Published
- 2014
5. Towards accurate detection and genotyping of expressed variants from whole transcriptome sequencing data.
- Author
- Duitama, J., Srivastava, P.K., and Măndoiu, I.I.
- Published
- 2011
6. A RDF description model for manipulating learning objects.
- Author
- Bouzeghoub, A., Defude, B., Ammour, S., Duitama, J.-F., and Lecocq, C.
- Published
- 2004
7. A RDF description model for manipulating learning objects
- Author
- Bouzeghoub, A., primary, Defude, B., additional, Ammour, S., additional, Duitama, J.-F., additional, and Lecocq, C., additional
8. Linkage disequilibrium based genotype calling from low-coverage shotgun sequencing reads
- Author
- Wu Yufeng, Hernández Yözen, Dinakar Sanjiv, Kennedy Justin, Duitama Jorge, and Măndoiu Ion I
- Subjects
- Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: Recent technology advances have enabled sequencing of individual genomes, promising to revolutionize biomedical research. However, deep sequencing remains more expensive than microarrays for performing whole-genome SNP genotyping. Results: In this paper we introduce a new multi-locus statistical model and computationally efficient genotype calling algorithms that integrate shotgun sequencing data with linkage disequilibrium (LD) information extracted from reference population panels such as HapMap or the 1000 Genomes Project. Experiments on publicly available 454, Illumina, and ABI SOLiD sequencing datasets suggest that integration of LD information results in genotype calling accuracy comparable to that of microarray platforms from low-coverage sequencing data. A software package implementing our algorithm, released under the GNU General Public License, is available at http://dna.engr.uconn.edu/software/GeneSeq/. Conclusions: Integration of LD information leads to significant improvements in genotype calling accuracy compared to prior LD-oblivious methods, rendering low-coverage sequencing a viable alternative to microarrays for conducting large-scale genome-wide association studies.
- Published
- 2011
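The idea in the abstract of result 8, that prior information from a reference panel can rescue genotype calls at low coverage, can be illustrated with a much simpler single-locus Bayesian caller. This is only a sketch under stated assumptions: the function name `genotype_posteriors`, the binomial read model, the error rate, and the prior values are all hypothetical, and the paper's actual method is a multi-locus model that is not reproduced here.

```python
def genotype_posteriors(ref_reads, alt_reads, error_rate=0.01,
                        prior=(0.25, 0.5, 0.25)):
    """Posterior over genotypes with 0, 1, or 2 copies of the alternative allele.

    Assumes each read independently shows the alternative allele with
    probability determined by the genotype and a uniform error rate.
    `prior` stands in for panel-derived information (hypothetical values).
    """
    unnormalized = []
    for alt_copies, prior_g in enumerate(prior):
        frac = alt_copies / 2
        # Probability that a single read shows the alternative allele.
        p_alt = frac * (1 - error_rate) + (1 - frac) * error_rate
        likelihood = (p_alt ** alt_reads) * ((1 - p_alt) ** ref_reads)
        unnormalized.append(prior_g * likelihood)
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def call(posteriors):
    """Return the most probable number of alternative allele copies."""
    return max(range(3), key=lambda g: posteriors[g])


# Two reads only: one reference, one alternative.
flat_call = call(genotype_posteriors(1, 1))
# Same reads, but a (made-up) prior strongly favouring homozygous reference.
informed_call = call(genotype_posteriors(1, 1, prior=(0.98, 0.019, 0.001)))
```

With only two reads, the call depends heavily on the prior, which is the intuition behind combining low-coverage reads with panel information.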
9. Workshop: Bioinformatics pipeline for fosmid based molecular haplotype sequencing.
- Author
- Duitama, J., Eun-Kyung Suk, Schulz, S., McEwen, G., Huebsch, T., and Hoehe, M.
- Published
- 2011
10. QTL mapping for pod quality and yield traits in snap bean (Phaseolus vulgaris L.).
- Author
- Njau SN, Parker TA, Duitama J, Gepts P, and Arunga EE
- Abstract
Pod quality and yield traits in snap bean (Phaseolus vulgaris L.) influence consumer preferences, crop adoption by farmers, and the ability of the product to be commercially competitive locally and globally. The objective of the study was to identify the quantitative trait loci (QTL) for pod quality and yield traits in a snap × dry bean recombinant inbred line (RIL) population. A total of 184 F6 RILs derived from a cross between Vanilla (snap bean) and MCM5001 (dry bean) were grown in three field sites in Kenya and one greenhouse environment in Davis, CA, USA. They were genotyped at 5,951 single nucleotide polymorphisms (SNPs), and composite interval mapping was conducted to identify QTL for 16 pod quality and yield traits, including pod wall fiber, pod string, pod size, and harvest metrics. A combined total of 44 QTL were identified in field and greenhouse trials. The QTL for pod quality were identified on chromosomes Pv01, Pv02, Pv03, Pv04, Pv06, and Pv07, and for pod yield were identified on Pv08. Co-localization of QTL was observed for pod quality and yield traits. Some identified QTL overlapped with previously mapped QTL for pod quality and yield traits, with several others identified as novel. The identified QTL can be used in future marker-assisted selection in snap bean. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The handling editor AS declared a past co-authorship with the author PG. The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision. (Copyright © 2024 Njau, Parker, Duitama, Gepts and Arunga.)
- Published
- 2024
11. Rapid evolutionary tuning of endospore quantity versus quality trade-off via a phase-variable contingency locus.
- Author
- Kim TD, Khanal S, Bäcker LE, Lood C, Kerremans A, Gorivale S, Begyn K, Cambré A, Rajkovic A, Devlieghere F, Heyndrickx M, Michiels C, Duitama J, and Aertsen A
- Subjects
- Biological Evolution, Bacterial Proteins genetics, Bacterial Proteins metabolism, Evolution, Molecular, Ultraviolet Rays, Spores, Bacterial genetics, Bacillus cereus genetics
- Abstract
The UV resistance of bacterial endospores is an important quality supporting their survival in inhospitable environments and therefore constitutes an essential driver of the ecological success of spore-forming bacteria. Nevertheless, the variability and evolvability of this trait are poorly understood. In this study, directed evolution and genetics approaches revealed that the Bacillus cereus pdaA gene (encoding the endospore-specific peptidoglycan-N-acetylmuramic acid deacetylase) serves as a contingency locus in which the expansion and contraction of short tandem repeats can readily compromise (PdaAOFF) or restore (PdaAON) the pdaA open reading frame. Compared with B. cereus populations in the PdaAON state, populations in the PdaAOFF state produced a lower yield of viable endospores but endowed them with vastly increased UV resistance. Moreover, selection pressures based on either quantity (i.e., yield of viable endospores) or quality (i.e., UV resistance of viable endospores) aspects could readily shift populations between PdaAON and PdaAOFF states, respectively. Bioinformatic analysis also revealed that pdaA homologs within the Bacillus and Clostridium genera are often equipped with several short tandem repeat regions, suggesting a wider implementation of the pdaA-mediated phase variability in other sporeformers as well. These results for the first time reveal (1) pdaA as a phase-variable contingency locus in the adaptive evolution of endospore properties and (2) bet-hedging between what appears to be a quantity versus quality trade-off in endospore crops. Competing Interests: Declaration of interests: The authors declare no competing interests. (Copyright © 2024 Elsevier Inc. All rights reserved.)
- Published
- 2024
12. vCSF Danger-associated Molecular Patterns After Traumatic and Nontraumatic Acute Brain Injury: A Prospective Study.
- Author
- Santacruz CA, Vincent JL, Duitama J, Bautista E, Imbault V, Bruneau M, Creteur J, Brimioulle S, Communi D, and Taccone FS
- Subjects
- Humans, Male, Female, Prospective Studies, Middle Aged, Adult, Brain Injuries, Traumatic, Intracranial Pressure, Aged, Cerebrospinal Fluid metabolism, Alarmins metabolism, Cerebral Ventricles metabolism, Glasgow Outcome Scale, Brain Injuries metabolism, Intracranial Hypertension etiology
- Abstract
Background: Danger-associated molecular patterns (DAMPs) may be implicated in the pathophysiological pathways associated with an unfavorable outcome after acute brain injury (ABI). Methods: We collected samples of ventricular cerebrospinal fluid (vCSF) for 5 days in 50 consecutive patients at risk of intracranial hypertension after traumatic and nontraumatic ABI. Differences in vCSF protein expression over time were evaluated using linear models and selected for functional network analysis using the PANTHER and STRING databases. The primary exposure of interest was the type of brain injury (traumatic vs. nontraumatic), and the primary outcome was the vCSF expression of DAMPs. Secondary exposures of interest included the occurrence of intracranial pressure ≥20 or ≥30 mm Hg during the 5 days post-ABI, intensive care unit (ICU) mortality, and neurological outcome (assessed using the Glasgow Outcome Score) at 3 months post-ICU discharge. Secondary outcomes included associations of these exposures with the vCSF expression of DAMPs. Results: A network of 6 DAMPs (DAMP_trauma; protein-protein interaction [PPI] P = 0.04) was differentially expressed in patients with ABI of traumatic origin compared with those with nontraumatic ABI. ABI patients with intracranial pressure ≥30 mm Hg differentially expressed a set of 38 DAMPs (DAMP_ICP30; PPI P < 0.001). Proteins in DAMP_ICP30 are involved in cellular proteolysis, complement pathway activation, and post-translational modifications. There were no relationships between DAMP expression and ICU mortality or unfavorable versus favorable outcomes. Conclusions: Specific patterns of vCSF DAMP expression differentiated between traumatic and nontraumatic types of ABI and were associated with increased episodes of severe intracranial hypertension. Competing Interests: D.C. is a Senior Research Associate at the FRS-FNRS. The remaining authors have no conflicts of interest to disclose. (Copyright © 2023 Wolters Kluwer Health, Inc. All rights reserved.)
- Published
- 2024
13. Phylogenomic approaches reveal a robust time-scale phylogeny of the Terminal Fusarium Clade.
- Author
- Lizcano Salas AF, Duitama J, Restrepo S, and Celis Ramírez AM
- Abstract
The Terminal Fusarium Clade (TFC) is a group in the Nectriaceae family with agricultural and clinical relevance. In recent years, various phylogenies have been presented in the literature, showing disagreement in the topologies, but only a few studies have conducted analyses on the divergence time scale of the group. Therefore, the evolutionary history of this group is still being determined. This study aimed to understand the evolutionary history of the TFC from a phylogenomic perspective. To achieve this objective, we performed a phylogenomic analysis using the available genomes in GenBank and ran eight different pipelines. We presented a new robust topology of the TFC that differs at some nodes from previous studies. These new relationships allowed us to formulate new hypotheses about the evolutionary history of the TFC. We also inferred new divergence time estimates, which differ from those of previous studies due to topology discordances and taxon sampling. The results suggested an important diversification process in the Neogene period, likely associated with the diversification and predominance of terrestrial ecosystems by angiosperms. In conclusion, we presented a robust time-scale phylogeny that allowed us to formulate new hypotheses regarding the evolutionary history of the TFC. (© 2024. The Author(s).)
- Published
- 2024
14. A phased genome assembly of a Colombian Trypanosoma cruzi TcI strain and the evolution of gene families.
- Author
- Hoyos Sanchez MC, Ospina Zapata HS, Suarez BD, Ospina C, Barbosa HJ, Carranza Martinez JC, Vallejo GA, Urrea Montes D, and Duitama J
- Subjects
- Humans, Colombia, Histones, Brazil, Trypanosoma cruzi genetics, Chagas Disease
- Abstract
Chagas is an endemic disease in tropical regions of Latin America, caused by the parasite Trypanosoma cruzi. High intraspecies variability and genome complexity have been challenges to assemble high quality genomes needed for studies in evolution, population genomics, diagnosis and drug development. Here we present a chromosome-level phased assembly of a TcI T. cruzi strain (Dm25). While 29 chromosomes show a large collinearity with the assembly of the Brazil A4 strain, three chromosomes show both large heterozygosity and large divergence, compared to previous assemblies of TcI T. cruzi strains. Nucleotide and protein evolution statistics indicate that T. cruzi marinkellei separated before the diversification of T. cruzi in the known DTUs. Interchromosomal paralogs of dispersed gene families and histones appeared before but at the same time have a more strict purifying selection, compared to other repeat families. Previously unreported large tandem arrays of protein kinases and histones were identified in this assembly. Over one million variants obtained from Illumina reads aligned to the primary assembly clearly separate the main DTUs. We expect that this new assembly will be a valuable resource for further studies on evolution and functional genomics of Trypanosomatids. (© 2024. The Author(s).)
- Published
- 2024
15. A graph clustering algorithm for detection and genotyping of structural variants from long reads.
- Author
- Gaitán N and Duitama J
- Subjects
- Bayes Theorem, Genotype, Cluster Analysis, Algorithms, Benchmarking
- Abstract
Background: Structural variants (SVs) are genomic polymorphisms defined by their length (>50 bp). The usual types of SVs are deletions, insertions, translocations, inversions, and copy number variants. SV detection and genotyping is fundamental given the role of SVs in phenomena such as phenotypic variation and evolutionary events. Thus, methods to identify SVs using long-read sequencing data have been recently developed. Findings: We present an accurate and efficient algorithm to predict germline SVs from long-read sequencing data. The algorithm starts by collecting evidence (signatures) of SVs from read alignments. Then, signatures are clustered based on a Euclidean graph with coordinates calculated from lengths and genomic positions. Clustering is performed by the DBSCAN algorithm, which provides the advantage of delimiting clusters with high resolution. Clusters are transformed into SVs, and a Bayesian model allows precise genotyping of SVs based on their supporting evidence. This algorithm is integrated into the single sample variants detector of the Next Generation Sequencing Experience Platform, which facilitates the integration with other functionalities for genomics analysis. We performed multiple benchmark experiments, including simulation and real data, representing different genome profiles, sequencing technologies (PacBio HiFi, ONT), and read depths. Conclusion: The results show that our approach outperformed state-of-the-art tools on germline SV calling and genotyping, especially at low depths, and in error-prone repetitive regions. We believe this work significantly contributes to the development of bioinformatic strategies to maximize the use of long-read sequencing technologies. (© The Author(s) 2024. Published by Oxford University Press GigaScience.)
- Published
- 2024
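The clustering step described in the abstract of result 15 can be pictured with a small, self-contained DBSCAN over two-dimensional points built from (genomic position, SV length). This is a minimal sketch, not the published implementation: the raw, unscaled coordinates, the `eps` and `min_pts` values, and the toy signatures are all assumptions made for illustration.

```python
from collections import deque


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points (including i itself) within Euclidean distance eps of i.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical SV signatures as (position, length) pairs from read alignments.
signatures = [(1000, 300), (1005, 310), (998, 295),
              (5000, 120), (5003, 118), (5010, 125),
              (9000, 60)]
labels = dbscan(signatures, eps=50, min_pts=3)
```

In this toy input the first three signatures and the next three form two clusters, and the isolated last signature is left as noise; each cluster would then be summarized into one candidate SV before genotyping.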
16. Selection signatures and population dynamics of transposable elements in lima bean.
- Author
- Lozano-Arce D, García T, Gonzalez-Garcia LN, Guyot R, Chacón-Sánchez MI, and Duitama J
- Subjects
- DNA Transposable Elements genetics, Polymorphism, Single Nucleotide, Population Dynamics, Phaseolus genetics
- Abstract
The domestication process in lima bean (Phaseolus lunatus L.) involves two independent events, within the Mesoamerican and Andean gene pools. This makes lima bean an excellent model to understand convergent evolution. The mechanisms of adaptation followed by Mesoamerican and Andean landraces are largely unknown. Genes related to these adaptations can be selected by identification of selective sweeps within gene pools. Previous genetic analyses in lima bean have relied on Single Nucleotide Polymorphism (SNP) loci, and have ignored transposable elements (TEs). Here we show the analysis of whole-genome sequencing data from 61 lima bean accessions to characterize a genomic variation database including TEs and SNPs, to associate selective sweeps with variable TEs and to predict candidate domestication genes. A small percentage of genes under selection are shared among gene pools, suggesting that domestication followed different genetic avenues in both gene pools. About 75% of TEs are located close to genes, which shows their potential to affect gene functions. The genetic structure inferred from variable TEs is consistent with that obtained from SNP markers, suggesting that TE dynamics can be related to the demographic history of wild and domesticated lima bean and its adaptive processes, in particular selection processes during domestication. (© 2023. The Author(s).)
- Published
- 2023
17. Efficient homology-based annotation of transposable elements using minimizers.
- Author
- Gonzalez-García LN, Lozano-Arce D, Londoño JP, Guyot R, and Duitama J
- Abstract
Premise: Transposable elements (TEs) make up more than half of the genomes of complex plant species and can modulate the expression of neighboring genes, producing significant variability of agronomically relevant traits. The availability of long-read sequencing technologies allows the building of genome assemblies for plant species with large and complex genomes. Unfortunately, TE annotation currently represents a bottleneck in the annotation of genome assemblies. Methods and Results: We present a new functionality of the Next-Generation Sequencing Experience Platform (NGSEP) to perform efficient homology-based TE annotation. Sequences in a reference library are treated as long reads and mapped to an input genome assembly. A hierarchical annotation is then assigned by homology using the annotation of the reference library. We tested the performance of our algorithm on genome assemblies of different plant species, including Arabidopsis thaliana, Oryza sativa, Coffea humblotiana, and Triticum aestivum (bread wheat). Our algorithm outperforms traditional homology-based annotation tools in speed by a factor of three to >20, reducing the annotation time of the T. aestivum genome from months to hours, and recovering up to 80% of TEs annotated with RepeatMasker with a precision of up to 0.95. Conclusions: NGSEP allows rapid analysis of TEs, especially in very large and TE-rich plant genomes. (© 2023 The Authors. Applications in Plant Sciences published by Wiley Periodicals LLC on behalf of Botanical Society of America.)
- Published
- 2023
18. NGSEP 4: Efficient and accurate identification of orthogroups and whole-genome alignment.
- Author
- Tello D, Gonzalez-Garcia LN, Gomez J, Zuluaga-Monares JC, Garcia R, Angel R, Mahecha D, Duarte E, Leon MDR, Reyes F, Escobar-Velásquez C, Linares-Vásquez M, Cardozo N, and Duitama J
- Subjects
- Genomics methods, Algorithms, Metagenomics, Software, Genome
- Abstract
Whole-genome alignment allows researchers to understand the genomic structure and variation among genomes. Approaches based on direct pairwise comparisons of DNA sequences require large computational capacities. As a consequence, pipelines combining tools for orthologous gene identification and synteny have been developed. In this manuscript, we present the latest functionalities implemented in NGSEP 4, to identify orthogroups and perform whole genome alignments. NGSEP implements functionalities for identification of clusters of homologous genes, synteny analysis and whole genome alignment. Our results showed that the NGSEP algorithm for orthogroups identification has competitive accuracy and efficiency in comparison to commonly used tools. The implementation also includes a visualization of the whole genome alignment based on synteny of the orthogroups that were identified, and a reconstruction of the pangenome based on frequencies of the orthogroups among the genomes. NGSEP 4 also includes a new graphical user interface based on the JavaFX technology. We expect that these new developments will be very useful for several studies in evolutionary biology and population genomics. (© 2022 John Wiley & Sons Ltd.)
- Published
- 2023
19. Omics approaches to understand cocoa processing and chocolate flavor development: A review.
- Author
- Herrera-Rocha F, Fernández-Niño M, Cala MP, Duitama J, and Barrios AFG
- Subjects
- Food, Candida, Chocolate, Cacao, Bacillus
- Abstract
The global market of chocolate has increased worldwide during the last decade and is expected to reach a value of USD 200 billion by 2028. Chocolate is obtained from different varieties of Theobroma cacao L, a plant domesticated more than 4000 years ago in the Amazon rainforest. However, chocolate production is a complex process requiring extensive post-harvesting, mainly involving cocoa bean fermentation, drying, and roasting. These steps have a critical impact on chocolate quality. Standardizing and better understanding cocoa processing is, therefore, a current challenge to boost the global production of high-quality cocoa worldwide. This knowledge can also help cocoa producers improve cocoa processing management and obtain a better chocolate. Several recent studies have been conducted to dissect cocoa processing via omics analysis. A vast amount of data has been produced regarding omics studies of cocoa processing performed worldwide. This review systematically analyzes the current data on cocoa omics using data mining techniques and discusses opportunities and gaps for cocoa processing standardization from this data. First, we observed a recurrent report in metagenomics studies of species of the fungal genera Candida and Pichia as well as bacteria from the genera Lactobacillus, Acetobacter, and Bacillus. Second, our analyses of the available metabolomics data showed clear differences in the identified metabolites in cocoa and chocolate from different geographical origin, cocoa type, and processing stage. Finally, our analysis of peptidomics data revealed characteristic patterns in the gathered data including higher diversity and lower size distribution of peptides in fine-flavor cocoa. In addition, we discuss the current challenges in cocoa omics research. More research is still required to fill gaps in central matters in chocolate production such as starter cultures for cocoa fermentation, flavor evolution of cocoa, and the role of peptides in the development of specific flavor notes. We also offer the most comprehensive collection of multi-omics data in cocoa processing gathered from different research articles. Competing Interests: Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2023 Elsevier Ltd. All rights reserved.)
- Published
- 2023
20. New algorithms for accurate and efficient de novo genome assembly from long DNA sequencing reads.
- Author
- Gonzalez-Garcia L, Guevara-Barrientos D, Lozano-Arce D, Gil J, Díaz-Riaño J, Duarte E, Andrade G, Bojacá JC, Hoyos-Sanchez MC, Chavarro C, Guayazan N, Chica LA, Buitrago Acosta MC, Bautista E, Trujillo M, and Duitama J
- Subjects
- Sequence Analysis, DNA methods, Genome, Software, High-Throughput Nucleotide Sequencing methods, Algorithms
- Abstract
Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived from the k-mer distribution. Statistics collected during the graph construction are used as features to build layout paths by selecting edges, ranked by a likelihood function. For diploid samples, we integrated a reimplementation of the ReFHap algorithm to perform molecular phasing. We ran the implemented algorithms on PacBio HiFi and Nanopore sequencing data taken from haploid and diploid samples of different species. Our algorithms showed competitive accuracy and computational efficiency, compared with other currently used software. We expect that this new development will be useful for researchers building genome assemblies for different species. (© 2023 Gonzalez-Garcia et al.)
- Published
- 2023
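Results 17 and 20 both rely on minimizers to find shared sequence between reads or library sequences cheaply. The sketch below shows the general window-minimizer idea only. It assumes plain lexicographic ordering of k-mers, whereas the abstract of result 20 describes a hash function derived from the k-mer distribution; the function name `minimizers` and the parameters `k` and `w` are illustrative choices, not the NGSEP API.

```python
def minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs selected as window minimizers.

    For every window of w consecutive k-mers, the smallest k-mer is kept
    (lexicographic order here, ties broken by leftmost position).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        selected.add((best, kmers[best]))
    return sorted(selected)


example = minimizers("ACGTACGT", k=3, w=3)
```

Because neighbouring windows usually agree on their smallest k-mer, far fewer k-mers are stored than the sequence contains, while two sequences that share a long enough stretch still select the same k-mers from it.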
21. Phased Genome Assemblies.
- Author
- Duitama J
- Subjects
- Sequence Analysis, DNA methods, Haplotypes, Alleles, High-Throughput Nucleotide Sequencing methods, Algorithms, Computational Biology
- Abstract
The ultimate goal of de novo assembly of reads sequenced from a diploid individual is the separate reconstruction of the sequences corresponding to the two copies of each chromosome. Unfortunately, the allele linkage information needed to perform phased genome assemblies has been difficult to generate. Hence, most current genome assemblies are a haploid mixture of the two underlying chromosome copies present in the sequenced individual. Sequencing technologies providing long (20 kb) and accurate reads are the basis to generate phased genome assemblies. This chapter provides a brief overview of the main milestones in traditional genome assembly, focusing on the bioinformatic techniques developed to generate haplotype information from different specialized protocols. Using these techniques as a knowledge background, the chapter reviews the current algorithms to generate phased assemblies from long reads with low error rates. Current techniques perform haplotype-aware error correction steps to increase the quality of the raw reads. In addition, variations on the traditional overlap-layout-consensus (OLC) graph have been developed in an effort to eliminate edges between reads sequenced from different chromosome copies. This allows for large presence-absence variants between the chromosome copies to be taken into account. The development of these algorithms, along with the improved sequencing technologies has been crucial to finish chromosome-level assemblies of complex genomes. (© 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2023
22. Lipidomic profiling of bioactive lipids during spontaneous fermentations of fine-flavor cocoa.
- Author
- Herrera-Rocha F, Cala MP, León-Inga AM, Aguirre Mejía JL, Rodríguez-López CM, Florez SL, Chica MJ, Olarte HH, Duitama J, González Barrios AF, and Fernández-Niño M
- Subjects
- Fermentation, Lipidomics, Lipids, Taste, Cacao, Chocolate
- Abstract
The impact of cocoa lipid content on chocolate quality has been extensively described. Nevertheless, few studies have elucidated the cocoa lipid composition and their bioactive properties, focusing only on specific lipids. In the present study the lipidome of fine-flavor cocoa fermentation was analyzed using LC-MS-QTOF and a Machine Learning model to assess potential bioactivity was developed. Our results revealed that the cocoa lipidome, comprised mainly of fatty acyls and glycerophospholipids, remains stable during fine-flavor cocoa fermentations. Also, several Machine Learning algorithms were trained to explore potential biological activity among the identified lipids. We found that K-Nearest Neighbors had the best performance. This model was used to classify the identified lipids as bioactive or non-bioactive, nominating 28 molecules as potential bioactive lipids. None of these compounds have been previously reported as bioactive. Our work is the first untargeted lipidomic study and systematic effort to investigate potential bioactivity in fine-flavor cocoa lipids. (Copyright © 2022 Elsevier Ltd. All rights reserved.)
- Published
- 2022
23. The Cerebrospinal Fluid Proteomic Response to Traumatic and Nontraumatic Acute Brain Injury: A Prospective Study.
- Author
- Santacruz CA, Vincent JL, Duitama J, Bautista E, Imbault V, Bruneau M, Creteur J, Brimioulle S, Communi D, and Taccone FS
- Subjects
- Biomarkers, Cholesterol, Glial Fibrillary Acidic Protein, Humans, Intracranial Pressure physiology, Prospective Studies, Proteomics, Brain Injuries, Brain Injuries, Traumatic, Intracranial Hypertension etiology, Subarachnoid Hemorrhage complications
- Abstract
Background: Quantitative analysis of ventricular cerebrospinal fluid (vCSF) proteins following acute brain injury (ABI) may help identify pathophysiological pathways and potential biomarkers that can predict unfavorable outcome. Methods: In this prospective proteomic analysis study, consecutive patients with severe ABI expected to require intraventricular catheterization for intracranial pressure (ICP) monitoring for at least 5 days and patients without ABI admitted for elective clipping of an unruptured cerebral aneurysm were included. vCSF samples were collected within the first 24 h after ABI and ventriculostomy insertion and then every 24 h for 5 days. In patients without ABI, a single vCSF sample was collected at the time of elective clipping. Data-independent acquisition and sequential window acquisition of all theoretical spectra (SWATH) mass spectrometry were used to compare differences in protein expression in patients with ABI and patients without ABI and in patients with traumatic and nontraumatic ABI. Differences in protein expression according to different ICP values, intensive care unit outcome, subarachnoid hemorrhage (SAH) versus traumatic brain injury (TBI), and good versus poor 3-month functional status (assessed by using the Glasgow Outcome Scale) were also evaluated. vCSF proteins with significant differences between groups were compared by using linear models and selected for gene ontology analysis using R Language and the Panther database. Results: We included 50 patients with ABI (SAH n = 23, TBI n = 15, intracranial hemorrhage n = 6, ischemic stroke n = 3, others n = 3) and 12 patients without ABI. There were significant differences in the expression of 255 proteins between patients with and without ABI (p < 0.01). There were intraday and interday differences in expression of seven proteins related to increased inflammation, apoptosis, oxidative stress, and cellular response to hypoxia and injury. Among these, glial fibrillary acidic protein expression was higher in patients with ABI with severe intracranial hypertension (ICH) (ICP ≥ 30 mm Hg) or death compared to those without (log2 fold change: +2.4; p < 0.001), suggesting extensive primary astroglial injury or death. There were differences in the expression of 96 proteins between patients with traumatic and nontraumatic ABI (p < 0.05); intraday and interday differences were observed for six proteins related to structural damage, complement activation, and cholesterol metabolism. Thirty-nine vCSF proteins were associated with an increased risk of severe ICH (ICP ≥ 30 mm Hg) in patients with traumatic compared with nontraumatic ABI (p < 0.05). No significant differences were found in protein expression between patients with SAH versus TBI or between those with good versus poor 3-month Glasgow Outcome Scale score. Conclusions: Dysregulated vCSF protein expression after ABI may be associated with an increased risk of severe ICH and death. (© 2022. Springer Science+Business Media, LLC, part of Springer Nature and Neurocritical Care Society.)
- Published
- 2022
- Full Text
- View/download PDF
24. Loss of pod strings in common bean is associated with gene duplication, retrotransposon insertion and overexpression of PvIND.
- Author
-
Parker TA, Cetz J, de Sousa LL, Kuzay S, Lo S, Floriani TO, Njau S, Arunga E, Duitama J, Jernstedt J, Myers JR, Llaca V, Herrera-Estrella A, and Gepts P
- Subjects
- Domestication, Gene Duplication, Phenotype, Retroelements genetics, Phaseolus genetics
- Abstract
Fruit development has been central in the evolution and domestication of flowering plants. In common bean (Phaseolus vulgaris), the principal global grain legume staple, two main production categories are distinguished by fibre deposition in pods: dry beans, with fibrous, stringy pods; and stringless snap/green beans, with reduced fibre deposition, which frequently revert to the ancestral stringy state. Here, we identify genetic and developmental patterns associated with pod fibre deposition. Transcriptional, anatomical, epigenetic and genetic regulation of pod strings were explored through RNA-seq, RT-qPCR, fluorescence microscopy, bisulfite sequencing and whole-genome sequencing. Overexpression of the INDEHISCENT ('PvIND') orthologue was observed in stringless types compared with isogenic stringy lines, associated with overspecification of weak dehiscence-zone cells throughout the pod vascular sheath. No differences in DNA methylation were correlated with this phenotype. Nonstringy varieties showed a tandemly direct duplicated PvIND and a Ty1-copia retrotransposon inserted between the two repeats. These sequence features are lost during pod reversion and are predictive of pod phenotype in diverse materials, supporting their role in PvIND overexpression and reversible string phenotype. Our results give insight into reversible gain-of-function mutations and possible genetic solutions to the reversion problem, of considerable economic value for green bean production., (© 2022 The Authors. New Phytologist © 2022 New Phytologist Foundation.)
- Published
- 2022
- Full Text
- View/download PDF
25. Editorial: Grass Genome Evolution and Domestication.
- Author
-
Duitama J, Bartley LE, Guyot R, and Sharma R
- Abstract
Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
- Published
- 2022
- Full Text
- View/download PDF
26. Machine learning models for accurate prioritization of variants of uncertain significance.
- Author
-
Mahecha D, Nuñez H, Lattig MC, and Duitama J
- Subjects
- Humans, Neural Networks, Computer, Software, Support Vector Machine, High-Throughput Nucleotide Sequencing methods, Machine Learning
- Abstract
The growing use of next-generation sequencing technologies on genetic diagnosis has produced an exponential increase in the number of variants of uncertain significance (VUS). In this manuscript, we compare three machine learning methods to classify VUS as Pathogenic or No pathogenic, implementing a Random Forest (RF), a Support Vector Machine (SVM), and a Multilayer Perceptron. To train the models, we extracted high-quality variants from ClinVar that were previously classified as VUS. For each variant, we retrieved nine conservation scores, the loss-of-function tool, and allele frequencies. For the RF and SVM models, hyperparameters were tuned using cross-validation with a grid search. The three models were tested on a nonoverlapping set of variants that had been classified as VUS over the last 3 years, but had been reclassified in August 2020. The three models yielded superior accuracy on this set compared to the benchmarked tools. The RF-based model yielded the best performance across different variant types and was used to create VusPrize, an open-source software tool for prioritization of VUS. We believe that our model can improve the process of genetic diagnosis in research and clinical settings., (© 2022 Wiley Periodicals LLC.)
- Published
- 2022
- Full Text
- View/download PDF
27. Robust and efficient software for reference-free genomic diversity analysis of genotyping-by-sequencing data on diploid and polyploid species.
- Author
-
Parra-Salazar A, Gomez J, Lozano-Arce D, Reyes-Herrera PH, and Duitama J
- Subjects
- Genomics, Genotype, Humans, Software, Diploidy, Polyploidy
- Abstract
Genotyping-by-sequencing (GBS) is a widely used and cost-effective technique for obtaining large numbers of genetic markers from populations by sequencing regions adjacent to restriction cut sites. Although a standard reference-based pipeline can be followed to analyse GBS reads, a reference genome is still not available for a large number of species. Hence, reference-free approaches are required to generate the genetic variability information that can be obtained from a GBS experiment. Unfortunately, available tools to perform de novo analysis of GBS reads face issues of usability, accuracy and performance. Furthermore, few available tools are suitable for analysing data sets from polyploid species. In this manuscript, we describe a novel algorithm to perform reference-free variant detection and genotyping from GBS reads. Nonexact searches on a dynamic hash table of consensus sequences allow for efficient read clustering and sorting. This algorithm was integrated in the Next Generation Sequencing Experience Platform (NGSEP) to integrate the state-of-the-art variant detector already implemented in this tool. We performed benchmark experiments with three different empirical data sets of plants and animals with different population structures and ploidies, and sequenced with different GBS protocols at different read depths. These experiments show that NGSEP has comparable and in some cases better accuracy and always better computational efficiency compared to existing solutions. We expect that this new development will be useful for many research groups conducting population genetic studies in a wide variety of species., (© 2021 John Wiley & Sons Ltd.)
- Published
- 2022
- Full Text
- View/download PDF
28. Comprehensive Time-Series Analysis of the Gene Expression Profile in a Susceptible Cultivar of Tree Tomato (Solanum betaceum) During the Infection of Phytophthora betacei.
- Author
-
Bautista D, Guayazan-Palacios N, Buitrago MC, Cardenas M, Botero D, Duitama J, Bernal AJ, and Restrepo S
- Abstract
Solanum betaceum is a tree from the Andean region bearing edible fruits, considered an exotic export. Although there has been renewed interest in its commercialization, sustainability, and disease management have been limiting factors. Phytophthora betacei is a recently described species that causes late blight in S. betaceum. There is no general study of the response of S. betaceum, particularly, in the changes in expression of pathogenesis-related genes. In this manuscript we present a comprehensive RNA-seq time-series study of the plant response to the infection of P. betacei. Following six time points of infection, the differentially expressed genes (DEGs) involved in the defense by the plant were contextualized in a sequential manner. We documented 5,628 DEGs across all time-points. From 6 to 24 h post-inoculation, we highlighted DEGs involved in the recognition of the pathogen by the likely activation of pattern-triggered immunity (PTI) genes. We also describe the possible effect of the pathogen effectors in the host during the effector-triggered response. Finally, we reveal genes related to the susceptible outcome of the interaction caused by the onset of necrotrophy and the sharp transcriptional changes as a response to the pathogen. This is the first report of the transcriptome of the tree tomato in response to the newly described pathogen P. betacei., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2021 Bautista, Guayazan-Palacios, Buitrago, Cardenas, Botero, Duitama, Bernal and Restrepo.)
- Published
- 2021
- Full Text
- View/download PDF
29. Unraveling the Genome of a High Yielding Colombian Sugarcane Hybrid.
- Author
-
Trujillo-Montenegro JH, Rodríguez Cubillos MJ, Loaiza CD, Quintero M, Espitia-Navarro HF, Salazar Villareal FA, Viveros Valens CA, González Barrios AF, De Vega J, Duitama J, and Riascos JJ
- Abstract
Recent developments in High Throughput Sequencing (HTS) technologies and bioinformatics, including improved read lengths and genome assemblers, allow the reconstruction of complex genomes with unprecedented quality and contiguity. Sugarcane has one of the most complicated genomes among grasses with a haploid length of 1 Gbp and a ploidy between 8 and 12. In this work, we present a genome assembly of the Colombian sugarcane hybrid CC 01-1940. Three types of sequencing technologies were combined for this assembly: PacBio long reads, Illumina paired short reads, and Hi-C reads. We achieved a median contig length of 34.94 Mbp and a total genome assembly of 903.2 Mbp. We annotated a total of 63,724 protein coding genes and performed a reconstruction and comparative analysis of the sucrose metabolism pathway. Nucleotide evolution measurements between orthologs with close species suggest that divergence between Saccharum officinarum and Saccharum spontaneum occurred <2 million years ago. Synteny analysis between CC 01-1940 and the S. spontaneum genome confirms the presence of translocation events between the species and a random contribution throughout the entire genome in current sugarcane hybrids. Analysis of RNA-Seq data from leaf and root tissue of contrasting sugarcane genotypes subjected to water stress treatments revealed 17,490 differentially expressed genes, from which 3,633 correspond to genes expressed exclusively in tolerant genotypes.
We expect the resources presented here to serve as a source of information to improve the selection processes of new varieties of the breeding programs of sugarcane., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2021 Trujillo-Montenegro, Rodríguez Cubillos, Loaiza, Quintero, Espitia-Navarro, Salazar Villareal, Viveros Valens, González Barrios, De Vega, Duitama and Riascos.)
- Published
- 2021
- Full Text
- View/download PDF
30. Non-Extensive Fragmentation of Natural Products and Pharmacophore-Based Virtual Screening as a Practical Approach to Identify Novel Promising Chemical Scaffolds.
- Author
-
Vásquez AF, Muñoz AR, Duitama J, and González Barrios A
- Abstract
Fragment-based drug design (FBDD) and pharmacophore modeling have proven to be efficient tools to discover novel drugs. However, these approaches may become limited if the collection of fragments is highly repetitive, poorly diverse, or excessively simple. In this article, combining pharmacophore modeling and a non-classical type of fragmentation (herein called non-extensive) to screen a natural product (NP) library may provide fragments predicted as potent, diverse, and developable. Initially, we applied retrosynthetic combinatorial analysis procedure (RECAP) rules in two versions, extensive and non-extensive, in order to deconstruct a virtual library of NPs formed by the databases Traditional Chinese Medicine (TCM), AfroDb (African Medicinal Plants database), NuBBE (Nuclei of Bioassays, Biosynthesis, and Ecophysiology of Natural Products), and UEFS (Universidade Estadual de Feira de Santana). We then developed a virtual screening (VS) using two groups of natural-product-derived fragments (extensive and non-extensive NPDFs) and two overlapping pharmacophore models for each of 20 different proteins of therapeutic interest. Molecular weight, lipophilicity, and molecular complexity were estimated and compared for both types of NPDFs (and their original NPs) before and after the VS proceedings. As a result, we found that non-extensive NPDFs exhibited a much higher number of chemical entities compared to extensive NPDFs (45,355 vs. 11,525 compounds), accounting for the larger part of the hits recovered and being far less repetitive than extensive NPDFs. The structural diversity of both types of NPDFs and the NPs was shown to diminish slightly after VS procedures. Finally, and most interestingly, the pharmacophore fit score of the non-extensive NPDFs proved to be not only higher, on average, than extensive NPDFs (56% of cases) but also higher than their original NPs (69% of cases) when all of them were also recognized as hits after the VS. 
The findings obtained in this study indicated that the proposed cascade approach was useful to enhance the probability of identifying innovative chemical scaffolds, which deserve further development to become drug-sized candidate compounds. We consider that the knowledge about the deconstruction degree required to produce NPDFs of interest represents a good starting point for eventual synthesis, characterization, and biological activity studies., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2021 Vásquez, Muñoz, Duitama and González Barrios.)
- Published
- 2021
- Full Text
- View/download PDF
31. Discovery of new potential CDK2/VEGFR2 type II inhibitors by fragmentation and virtual screening of natural products.
- Author
-
Vásquez AF, Reyes Muñoz A, Duitama J, and González Barrios A
- Subjects
- Molecular Docking Simulation, Antineoplastic Agents, Biological Products, Cyclin-Dependent Kinase 2 antagonists & inhibitors, Vascular Endothelial Growth Factor Receptor-2 antagonists & inhibitors
- Abstract
Cyclin-Dependent Kinase 2 (CDK2) and Vascular Endothelial Growth Factor Receptor (VEGFR2) have largely been considered as attractive targets for developing anticancer agents. However, there is no dual inhibitor commercially available in the market that interacts simultaneously with the allosteric back pocket of these enzymes. We applied a combined computational strategy that started with the generation of two overlapping pharmacophore models of both kinases at 'inactive' conformation. Next, several virtual libraries of natural products, including the databases TCM (Traditional Chinese Medicine), UEFS (Universidade Estadual de Feira de Santana), NuBBE (Nuclei of Bioassays, Biosynthesis, and Ecophysiology of Natural Products) and AfroDb (African Medicinal Plants Database) were deconstructed using a non-extensive version of the approach RECAP (retrosynthetic combinatorial analysis procedure). These natural-product-derived fragments (NPDFs) were screened and merged into drug-sized compounds, which were filtered by Lipinski's Rule-of-five (Ro5) and docking. As a result, two pharmacophore models, namely Hypo1 and Hypo2, were developed with an accuracy of 0.94 and 0.84, respectively. Deconstruction of natural products produced a set of 16655 unique non-extensive NPDFs that were screened against both pharmacophore models. Finally, after merging, Ro5-filtering and docking, we obtained a set of 20 hit compounds predicted to be diverse, developable, synthesizable and potent. The computational strategy proved successful to find virtual candidates of kinase inhibitors and therefore contributes to the identification of innovative multi-target compounds with potential anticancer activity. Communicated by Ramaswamy H. Sarma.
- Published
- 2021
- Full Text
- View/download PDF
32. Using RNA-seq to characterize pollen-stigma interactions for pollination studies.
- Author
-
Lobaton J, Andrew R, Duitama J, Kirkland L, Macfadyen S, and Rader R
- Subjects
- Computational Biology methods, Gene Expression Profiling, Gene Expression Regulation, Plant, High-Throughput Nucleotide Sequencing, Phylogeny, Polymorphism, Single Nucleotide, Reproduction, Flowers genetics, Plant Physiological Phenomena, Pollen genetics, Pollination physiology
- Abstract
Insects are essential for the reproduction of pollinator-dependent crops and contribute to the pollination of 87% of wild plants and 75% of the world's food crops. Understanding pollen flow dynamics between plants and pollinators is thus essential to manage and conserve wild plants and ensure yields are maximized in food crops. However, the determination of pollen transfer in the field is complex and laborious. We developed a field experiment in a pollinator-dependent crop and used high throughput RNA sequencing (RNA-seq) to quantify pollen flow by measuring changes in gene expression between pollination treatments across different apple (Malus domestica Borkh.) cultivars. We tested three potential molecular indicators of successful pollination and validated these results with field data by observing single and multiple visits by honey bees (Apis mellifera) to apple flowers and measured fruit set in a commercial apple orchard. The first indicator of successful outcrossing was revealed via differential gene expression in the cross-pollination treatments after 6 h. The second indicator of successful outcrossing was revealed by the expression of specific genes related to pollen tube formation and defense response at three different time intervals in the stigma and the style following cross-pollination (i.e. after 6, 24, and 48 h). Finally, genotyping variants specific to donor pollen could be detected in cross-pollination treatments, providing a third indicator of successful outcrossing. Field data indicated that one or five flower visits by honey bees were insufficient and at least 10 honey bee flower visits were required to achieve a 25% probability of fruit set under orchard conditions. By combining the genotyping data, the differential expression analysis, and the traditional fruit set field experiments, it was possible to evaluate the pollination effectiveness of honey bee visits under orchard conditions.
This is the first time that pollen-stigma-style mRNA expression analysis has been conducted after a pollinator visit (honey bee) to a plant (in vivo apple flowers). This study provides evidence that mRNA sequencing can be used to address complex questions related to stigma-pollen interactions over time in pollination ecology.
- Published
- 2021
- Full Text
- View/download PDF
33. Whole-Genome Transformation Promotes tRNA Anticodon Suppressor Mutations under Stress.
- Author
-
Deparis Q, Duitama J, Foulquié-Moreno MR, and Thevelein JM
- Subjects
- Anticodon genetics, Chromatography, Liquid, Kluyveromyces classification, Phylogeny, Polymorphism, Single Nucleotide, Tandem Mass Spectrometry, Whole Genome Sequencing, Anticodon antagonists & inhibitors, Genome, Fungal, Kluyveromyces genetics, Mutation, RNA, Transfer genetics, Stress, Physiological genetics, Suppression, Genetic
- Abstract
tRNAs are encoded by a large gene family, usually with several isogenic tRNAs interacting with the same codon. Mutations in the anticodon region of other tRNAs can overcome specific tRNA deficiencies. Phylogenetic analysis suggests that such mutations have occurred in evolution, but the driving force is unclear. We show that in yeast suppressor mutations in other tRNAs are able to overcome deficiency of the essential TRT2-encoded tRNAThr CGU at high temperature (40°C). Surprisingly, these tRNA suppressor mutations were obtained after whole-genome transformation with DNA from thermotolerant Kluyveromyces marxianus or Ogataea polymorpha strains but from which the mutations did apparently not originate. We suggest that transient presence of donor DNA in the host facilitates proliferation at high temperature and thus increases the chances for occurrence of spontaneous mutations suppressing defective growth at high temperature. Whole-genome sequence analysis of three transformants revealed only four to five nonsynonymous mutations of which one causing TRT2 anticodon stem stabilization and two anticodon mutations in non-threonyl-tRNAs, tRNALys CUU and tRNAeMet CAU, were causative. Both anticodon mutations suppressed lethality of TRT2 deletion and apparently caused the respective tRNAs to become novel substrates for threonyl-tRNA synthetase. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) data could not detect any significant mistranslation, and reverse transcription-quantitative PCR results contradicted induction of the unfolded protein response. We suggest that stress conditions have been a driving force in evolution for the selection of anticodon-switching mutations in tRNAs as revealed by phylogenetic analysis. IMPORTANCE In this work, we have identified for the first time the causative elements in a eukaryotic organism introduced by applying whole-genome transformation and responsible for the selectable trait of interest, i.e., high temperature tolerance. Surprisingly, the whole-genome transformants contained just a few single nucleotide polymorphisms (SNPs), which were unrelated to the sequence of the donor DNA.
In each of three independent transformants, we have identified a SNP in a tRNA, either stabilizing the essential tRNAThr CGU at high temperature or switching the anticodon of tRNALys CUU or tRNAeMet CAU into CGU, which is apparently enough for in vivo recognition by threonyl-tRNA synthetase. LC-MS/MS analysis indeed indicated absence of significant mistranslation. Phylogenetic analysis showed that similar mutations have occurred throughout evolution and we suggest that stress conditions may have been a driving force for their selection. The low number of SNPs introduced by whole-genome transformation may favor its application for improvement of industrial yeast strains., (Copyright © 2021 Deparis et al.)
- Published
- 2021
- Full Text
- View/download PDF
34. Accurate, Efficient and User-Friendly Mutation Calling and Sample Identification for TILLING Experiments.
- Author
-
Gil J, Andrade-Martínez JS, and Duitama J
- Abstract
TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful reverse genetics method in plant functional genomics and breeding to identify mutagenized individuals with improved behavior for a trait of interest. Pooled high throughput sequencing (HTS) of the targeted genes allows efficient identification and sample assignment of variants within genes of interest in hundreds of individuals. Although TILLING has been used successfully in different crops and even applied to natural populations, one of the main issues for a successful TILLING experiment is that most currently available bioinformatics tools for variant detection are not designed to identify mutations with low frequencies in pooled samples or to perform sample identification from variants identified in overlapping pools. Our research group maintains the Next Generation Sequencing Experience Platform (NGSEP), an open source solution for analysis of HTS data. In this manuscript, we present three novel components within NGSEP to facilitate the design and analysis of TILLING experiments: a pooled variants detector, a sample identifier from variants detected in overlapping pools and a simulator of TILLING experiments. A new implementation of the NGSEP calling model for variant detection allows accurate detection of low frequency mutations within pools. The samples identifier implements the process to triangulate the mutations called within overlapping pools in order to assign mutations to single individuals whenever possible. Finally, we developed a complete simulator of TILLING experiments to enable benchmarking of different tools and to facilitate the design of experimental alternatives varying the number of pools and individuals per pool. Simulation experiments based on genes from the common bean genome indicate that NGSEP provides similar accuracy and better efficiency than other tools to perform pooled variants detection. 
To the best of our knowledge, NGSEP is currently the only tool that generates individual assignments of the mutations discovered from the pooled data. We expect that this development will be of great use for different groups implementing TILLING as an alternative for plant breeding and even to research groups performing pooled sequencing for other applications., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2021 Gil, Andrade-Martínez and Duitama.)
- Published
- 2021
- Full Text
- View/download PDF
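The sample-identification step described in the abstract of entry 34 (triangulating mutations called in overlapping pools) can be illustrated with a minimal sketch. This is not NGSEP's implementation: it assumes a hypothetical two-dimensional design in which each individual belongs to exactly one row pool and one column pool, and the function name `assign_mutations` and the mutation labels are invented for illustration.

```python
def assign_mutations(row_calls, col_calls):
    """Assign each mutation to one individual when the pools allow it.

    Hypothetical design: the individual at grid position (r, c) is
    sequenced in row pool r and in column pool c. row_calls and
    col_calls map a pool index to the set of mutations called there.
    """
    all_mutations = set().union(*row_calls.values(), *col_calls.values())
    assignments = {}
    for mutation in all_mutations:
        rows = [r for r, calls in row_calls.items() if mutation in calls]
        cols = [c for c, calls in col_calls.items() if mutation in calls]
        if len(rows) == 1 and len(cols) == 1:
            # Exactly one row pool and one column pool: unique individual.
            assignments[mutation] = (rows[0], cols[0])
        else:
            # Missing support or several candidate individuals.
            assignments[mutation] = None
    return assignments


# Toy data with made-up mutation labels.
row_calls = {0: {"chr1:100A>G"}, 1: {"chr2:50C>T"}, 2: set()}
col_calls = {0: {"chr2:50C>T"}, 1: {"chr1:100A>G", "chr3:7G>A"}, 2: set()}
result = assign_mutations(row_calls, col_calls)
```

In this toy run, the mutation seen only in a column pool is left unassigned, which mirrors the abstract's wording that mutations are assigned to single individuals "whenever possible".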
35. Comprehensive genomic resources related to domestication and crop improvement traits in Lima bean.
- Author
-
Garcia T, Duitama J, Zullo SS, Gil J, Ariani A, Dohle S, Palkovic A, Skeen P, Bermudez-Santana CI, Debouck DG, Martínez-Castillo J, Gepts P, and Chacón-Sánchez MI
- Subjects
- Argentina, Chromosome Mapping, Climate Change, Domestication, Genes, Plant genetics, Mexico, Plant Dispersal, RNA-Seq, Seeds, Synteny, Acclimatization genetics, Crops, Agricultural genetics, Phaseolus genetics, Plant Breeding, Quantitative Trait Loci
- Abstract
Lima bean (Phaseolus lunatus L.), one of the five domesticated Phaseolus bean crops, shows a wide range of ecological adaptations along its distribution range from Mexico to Argentina. These adaptations make it a promising crop for improving food security under predicted scenarios of climate change in Latin America and elsewhere. In this work, we combine long and short read sequencing technologies with a dense genetic map from a biparental population to obtain the chromosome-level genome assembly for Lima bean. Annotation of 28,326 gene models shows high diversity among 1917 genes with conserved domains related to disease resistance. Structural comparison across 22,180 orthologs with common bean reveals high genome synteny and five large intrachromosomal rearrangements. Population genomic analyses show that wild Lima bean is organized into six clusters with mostly non-overlapping distributions and that Mesoamerican landraces can be further subdivided into three subclusters. RNA-seq data reveal 4275 differentially expressed genes, which can be related to pod dehiscence and seed development. We expect the resources presented here to serve as a solid basis to achieve a comprehensive view of the degree of convergent evolution of Phaseolus species under domestication and provide tools and information for breeding for climate change resiliency.
- Published
- 2021
- Full Text
- View/download PDF
36. Genetic mapping for agronomic traits in a MAGIC population of common bean (Phaseolus vulgaris L.) under drought conditions.
- Author
-
Diaz S, Ariza-Suarez D, Izquierdo P, Lobaton JD, de la Hoz JF, Acevedo F, Duitama J, Guerrero AF, Cajiao C, Mayor V, Beebe SE, and Raatz B
- Subjects
- Africa, Asia, Chromosome Mapping, Droughts, Phenotype, Plant Breeding, Quantitative Trait Loci, Phaseolus genetics
- Abstract
Background: Common bean is an important staple crop in the tropics of Africa, Asia and the Americas. Particularly smallholder farmers rely on bean as a source for calories, protein and micronutrients. Drought is a major production constraint for common bean, a situation that will be aggravated with current climate change scenarios. In this context, new tools designed to understand the genetic basis governing the phenotypic responses to abiotic stress are required to improve transfer of desirable traits into cultivated beans., Results: A multiparent advanced generation intercross (MAGIC) population of common bean was generated from eight Mesoamerican breeding lines representing the phenotypic and genotypic diversity of the CIAT Mesoamerican breeding program. This population was assessed under drought conditions in two field trials for yield, 100 seed weight, iron and zinc accumulation, phenology and pod harvest index. Transgressive segregation was observed for most of these traits. Yield was positively correlated with yield components and pod harvest index (PHI), and negative correlations were found with phenology traits and micromineral contents. Founder haplotypes in the population were identified using Genotyping by Sequencing (GBS). No major population structure was observed in the population. Whole Genome Sequencing (WGS) data from the founder lines was used to impute genotyping data for GWAS. Genetic mapping was carried out with two methods, using association mapping with GWAS, and linkage mapping with haplotype-based interval screening. Thirteen high confidence QTL were identified using both methods and several QTL hotspots were found controlling multiple traits. A major QTL hotspot located on chromosome Pv01 for phenology traits and yield was identified. Further hotspots affecting several traits were observed on chromosomes Pv03 and Pv08. A major QTL for seed Fe content was contributed by MIB778, the founder line with highest micromineral accumulation. 
Based on imputed WGS data, candidate genes are reported for the identified major QTL, and sequence changes were identified that could cause the phenotypic variation., Conclusions: This work demonstrates the importance of this common bean MAGIC population for genetic mapping of agronomic traits, to identify trait associations for molecular breeding tool design and as a new genetic resource for the bean research community.
- Published
- 2020
- Full Text
- View/download PDF
37. Genomic Variability of Phytophthora palmivora Isolates from Different Oil Palm Cultivation Regions in Colombia.
- Author
-
Gil J, Herrera M, Duitama J, Sarria G, Restrepo S, and Romero HM
- Subjects
- Colombia, Genomics, Plant Diseases, South America, Phytophthora
- Abstract
Palm oil is the most consumed vegetable oil globally, and Colombia is the largest palm oil producer in South America and fourth worldwide. However, oil palm plantations in Colombia are affected by bud rot disease caused by the oomycete Phytophthora palmivora, leading to significant economic losses. Infection processes by plant pathogens involve the secretion of effector molecules, which alter the functioning or structure of host cells. Current long-read sequencing technologies provide the information needed to produce high-quality genome assemblies, enabling a comprehensive annotation of effectors. Here, we describe the development of genomic resources for P. palmivora, including a high-quality genome assembly based on long and short-read sequencing data, intraspecies variability for 12 isolates from different oil palm cultivation regions in Colombia, and a catalog of over 1,000 candidate effector proteins. A total of 45,416 genes were annotated from the new genome assembled in 2,322 contigs adding to 165.5 Mbp, which represents an improvement of two times more gene models, 33 times better contiguity, and 11 times less fragmentation compared with currently available genomic resources for the species. Analysis of nucleotide evolution in paralogs suggests a recent whole-genome duplication event. Genetic differences were identified among isolates showing variable virulence levels. We expect that these novel genomic resources contribute to the characterization of the species and the understanding of the interaction of P. palmivora with oil palm and could be further exploited as tools for the development of effective strategies for disease control.
- Published
- 2020
- Full Text
- View/download PDF
38. Added Value of Quantitative Ultrasound and Machine Learning in BI-RADS 4-5 Assessment of Solid Breast Lesions.
- Author
-
Destrempes F, Trop I, Allard L, Chayer B, Garcia-Duitama J, El Khoury M, Lalonde L, and Cloutier G
- Subjects
- Adolescent, Adult, Aged, Aged, 80 and over, Data Systems, Female, Humans, Middle Aged, Research Design, Young Adult, Breast Neoplasms diagnostic imaging, Machine Learning, Ultrasonography, Mammary methods
- Abstract
The purpose of this study was to evaluate various combinations of 13 features based on shear wave elasticity (SWE), statistical and spectral backscatter properties of tissues, along with the Breast Imaging Reporting and Data System (BI-RADS), for classification of solid breast lesions at ultrasonography by means of random forests. One hundred and three women with 103 suspicious solid breast lesions (BI-RADS categories 4-5) were enrolled. Before biopsy, additional SWE images and a cine sequence of ultrasound images were obtained. The contours of lesions were delineated, and parametric maps of the homodyned-K distribution were computed on three regions: intra-tumoral, supra-tumoral and infra-tumoral zones. Maximum elasticity and total attenuation coefficient were also extracted. Random forests yielded receiver operating characteristic (ROC) curves for various combinations of features. Adding BI-RADS category improved the classification performance of other features. The best result was an area under the ROC curve of 0.97, with 75.9% specificity at 98% sensitivity., (Copyright © 2019 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.)
- Published
- 2020
- Full Text
- View/download PDF
39. Pilot clinical study of quantitative ultrasound spectroscopy measurements of erythrocyte aggregation within superficial veins.
- Author
-
Chayer B, Allard L, Qin Z, Garcia-Duitama J, Roger L, Destrempes F, Cailhier JF, Denault A, and Cloutier G
- Subjects
- Adult, Healthy Volunteers, Humans, Pilot Projects, Reproducibility of Results, Erythrocyte Aggregation physiology, Spectrum Analysis methods, Ultrasonography methods, Veins diagnostic imaging
- Abstract
Background: An enhanced inflammatory response is a trigger to the production of blood macromolecules involved in abnormally high levels of erythrocyte aggregation. Objective: This study aimed at demonstrating for the first time the clinical feasibility of a non-invasive ultrasound-based erythrocyte aggregation quantitative measurement method for potential application in critical care medicine. Methods: Erythrocyte aggregation was evaluated using modeling of the backscatter coefficient with the Structure Factor Size and Attenuation Estimator (SFSAE). SFSAE spectral parameters W (packing factor) and D (mean aggregate diameter) were measured within the antebrachial vein of the forearm and tibial vein of the leg in 50 healthy participants at natural flow and reduced flow controlled by a pressurized bracelet. Blood samples were also collected to measure erythrocyte aggregation ex vivo with an erythroaggregometer (parameter S10). Results: W and D in vivo measurements were positively correlated with the ex vivo S10 index for both measurement sites and shear rates (correlations between 0.35-0.81, p < 0.05). Measurement at low shear rate was found to increase the sensitivity and reliability of this non-invasive measurement method. Conclusions: We conclude that the SFSAE method presents systemic measures of the erythrocyte aggregation level, since results on upper and lower limbs were highly correlated.
- Published
- 2020
- Full Text
- View/download PDF
40. NGSEP3: accurate variant calling across species and sequencing protocols.
- Author
-
Tello D, Gil J, Loaiza CD, Riascos JJ, Cardozo N, and Duitama J
- Subjects
- Algorithms, Genomics, INDEL Mutation, Sequence Analysis, DNA, High-Throughput Nucleotide Sequencing, Software
- Abstract
Motivation: Accurate detection, genotyping and downstream analysis of genomic variants from high-throughput sequencing data are fundamental features in modern production pipelines for genetic-based diagnosis in medicine or genomic selection in plant and animal breeding. Our research group maintains the Next-Generation Sequencing Experience Platform (NGSEP) as a precise, efficient and easy-to-use software solution for these features. Results: Understanding that incorrect alignments around short tandem repeats are an important source of genotyping errors, we implemented in NGSEP new algorithms for realignment and haplotype clustering of reads spanning indels and short tandem repeats. We performed extensive benchmark experiments comparing NGSEP to state-of-the-art software using real data from three sequencing protocols and four species with different distributions of repetitive elements. NGSEP consistently shows comparable accuracy and better efficiency compared to the existing solutions. We expect that this work will contribute to the continuous improvement of quality in variant calling needed for modern applications in medicine and agriculture. Availability and Implementation: NGSEP is available as open source software at http://ngsep.sf.net. Supplementary Information: Supplementary data are available at Bioinformatics online. (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2019
- Full Text
- View/download PDF
41. Quantitative ultrasound and machine learning for assessment of steatohepatitis in a rat model.
- Author
-
Tang A, Destrempes F, Kazemirad S, Garcia-Duitama J, Nguyen BN, and Cloutier G
- Subjects
- Animals, Disease Models, Animal, Liver diagnostic imaging, Male, ROC Curve, Rats, Rats, Sprague-Dawley, Machine Learning, Non-alcoholic Fatty Liver Disease diagnosis, Ultrasonography methods
- Abstract
Objectives: To develop a machine learning model based on quantitative ultrasound (QUS) parameters to improve classification of steatohepatitis with shear wave elastography in rats by using histopathology scoring as the reference standard. Methods: This study received approval from the institutional animal care committee. Sixty male Sprague-Dawley rats were either fed a standard chow or a methionine- and choline-deficient diet. Ultrasound-based radiofrequency images were recorded in vivo to generate QUS and elastography maps. Random forests classification models and a bootstrap method were used to identify the QUS parameters that improved the classification accuracy of elastography. Receiver-operating characteristic analyses were performed. Results: For classification of not steatohepatitis vs borderline or steatohepatitis, the area under the receiver-operating characteristic curve (AUC) increased from 0.63 for elastography alone to 0.72 for a model that combined elastography and QUS techniques (p < 0.001). For detection of liver steatosis grades 0 vs ≥ 1, ≤ 1 vs ≥ 2, ≤ 2 vs 3, respectively, the AUCs increased from 0.70, 0.65, and 0.69 to 0.78, 0.78, and 0.75 (p < 0.001). For detection of liver inflammation grades 0 vs ≥ 1, ≤ 1 vs ≥ 2, ≤ 2 vs 3, respectively, the AUCs increased from 0.58, 0.77, and 0.78 to 0.66, 0.84, and 0.87 (p < 0.001). For staging of liver fibrosis grades 0 vs ≥ 1, ≤ 1 vs ≥ 2, and ≤ 2 vs ≥ 3, respectively, the AUCs increased from 0.79, 0.92, and 0.91 to 0.85, 0.98, and 0.97 (p < 0.001). Conclusion: QUS parameters improved the classification accuracy of steatohepatitis, liver steatosis, inflammation, and fibrosis compared to shear wave elastography alone. Key Points: • Quantitative ultrasound and shear wave elastography improved classification accuracy of liver steatohepatitis and its histological features (liver steatosis, inflammation, and fibrosis) compared to elastography alone. • A machine learning approach based on random forest models and incorporating local attenuation and homodyned-K tissue modeling shows promise for classification of nonalcoholic steatohepatitis. • Further research should be performed to demonstrate the applicability of this multi-parametric QUS approach in a human cohort and to validate the combinations of parameters providing the highest classification accuracy.
- Published
- 2019
- Full Text
- View/download PDF
42. Structural variants in 3000 rice genomes.
- Author
-
Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, Wing RA, McNally KL, Tatarinova T, Grigoriev A, Mauleon R, and Alexandrov N
- Subjects
- Alleles, Chromosome Mapping, DNA Transposable Elements, Genome-Wide Association Study methods, Phenotype, Sequence Analysis, DNA methods, Stress, Physiological genetics, Genetic Variation, Genome, Plant, Genomic Structural Variation, Genomics methods, Oryza genetics
- Abstract
Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5' UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice. (© 2019 Fuentes et al.; Published by Cold Spring Harbor Laboratory Press.)
- Published
- 2019
- Full Text
- View/download PDF
43. Translocation of a parthenogenesis gene candidate to an alternate carrier chromosome in apomictic Brachiaria humidicola.
- Author
-
Worthington M, Ebina M, Yamanaka N, Heffelfinger C, Quintero C, Zapata YP, Perez JG, Selvaraj M, Ishitani M, Duitama J, de la Hoz JF, Rao I, Dellaporta S, Tohme J, and Arango J
- Subjects
- Chromosomes, Plant, Genomics, Karyotyping, Translocation, Genetic, Apomixis, Brachiaria genetics, Chromosome Mapping, Parthenogenesis genetics
- Abstract
Background: The apomictic reproductive mode of Brachiaria (syn. Urochloa) forage species allows breeders to faithfully propagate heterozygous genotypes through seed over multiple generations. In Brachiaria, reproductive mode segregates as a single dominant locus, the apospory-specific genomic region (ASGR). The ASGR has been mapped to an area of reduced recombination on Brachiaria decumbens chromosome 5. A primer pair designed within ASGR-BABY BOOM-like (BBML), the candidate gene for the parthenogenesis component of apomixis in Pennisetum squamulatum, was diagnostic for reproductive mode in the closely related species B. ruziziensis, B. brizantha, and B. decumbens. In this study, we used a mapping population of the distantly related commercial species B. humidicola to map the ASGR and test for conservation of ASGR-BBML sequences across Brachiaria species. Results: Dense genetic maps were constructed for the maternal and paternal genomes of a hexaploid (2n = 6x = 36) B. humidicola F1 mapping population (n = 102) using genotyping-by-sequencing, simple sequence repeat, amplified fragment length polymorphism, and transcriptome derived single nucleotide polymorphism markers. Comparative genomics with Setaria italica provided confirmation for x = 6 as the base chromosome number of B. humidicola. High resolution molecular karyotyping indicated that the six homologous chromosomes of the sexual female parent paired at random, whereas preferential pairing of subgenomes was observed in the apomictic male parent. Furthermore, evidence for compensated aneuploidy was found in the apomictic parent, with only five homologous linkage groups identified for chromosome 5 and seven homologous linkage groups of chromosome 6. The ASGR mapped to B. humidicola chromosome 1, a region syntenic with chromosomes 1 and 7 of S. italica. The ASGR-BBML specific PCR product cosegregated with the ASGR in the F1 mapping population, despite its location on a different carrier chromosome than B. decumbens. Conclusions: The first dense molecular maps of B. humidicola provide strong support for cytogenetic evidence indicating a base chromosome number of six in this species. Furthermore, these results show conservation of the ASGR across the Paniceae in different chromosomal backgrounds and support postulation of the ASGR-BBML as candidate genes for the parthenogenesis component of apomixis.
- Published
- 2019
- Full Text
- View/download PDF
44. Genomic Analysis of Colombian Leishmania panamensis strains with different level of virulence.
- Author
-
Urrea DA, Duitama J, Imamura H, Álzate JF, Gil J, Muñoz N, Villa JA, Dujardin JC, Ramirez-Pineda JR, and Triana-Chavez O
- Subjects
- Animals, Colombia, DNA Copy Number Variations, Female, Genome, Protozoan, Leishmania braziliensis genetics, Leishmaniasis, Mucocutaneous parasitology, Machine Learning, Mice, Inbred BALB C, Polymorphism, Single Nucleotide, Sequence Analysis, DNA, Leishmania guyanensis genetics, Leishmania guyanensis pathogenicity
- Abstract
The establishment of Leishmania infection in mammalian hosts and the subsequent manifestation of clinical symptoms require internalization into macrophages, immune evasion and parasite survival and replication. Although many of the genes involved in these processes have been described, the genetic and genomic variability associated with differences in virulence is largely unknown. Here we present the genomic variation of four Leishmania (Viannia) panamensis strains exhibiting different levels of virulence in BALB/c mice and its application to predict novel genes related to virulence. De novo DNA sequencing and assembly of the most virulent strain allowed comparative genomics analysis with sequenced L. (Viannia) panamensis and L. (Viannia) braziliensis strains, and showed important variations at intra- and interspecific levels. Moreover, the mutation detection and a CNV search revealed both base and structural genomic variation within the species. Interestingly, we found differences in the copy number and protein diversity of some genes previously related to virulence. Several machine-learning approaches were applied to combine previous knowledge with features derived from genomic variation and predict a curated set of 66 novel genes related to virulence. These genes can be prioritized for validation experiments and could potentially become promising drug and immune targets for the development of novel prophylactic and therapeutic interventions.
- Published
- 2018
- Full Text
- View/download PDF
45. Resequencing of Common Bean Identifies Regions of Inter-Gene Pool Introgression and Provides Comprehensive Resources for Molecular Breeding.
- Author
-
Lobaton JD, Miller T, Gil J, Ariza D, de la Hoz JF, Soler A, Beebe S, Duitama J, Gepts P, and Raatz B
- Subjects
- DNA Copy Number Variations, DNA, Plant, Disease Resistance genetics, Genome, Plant, High-Throughput Nucleotide Sequencing, Plant Breeding, Plant Diseases genetics, Polymorphism, Single Nucleotide, Gene Pool, Genetic Variation, Phaseolus genetics
- Abstract
Common bean (Phaseolus vulgaris L.) is the most important grain legume for human consumption and is a major nutrition source in the tropics. Because bean production is reduced by both abiotic and biotic constraints, current breeding efforts are focused on the development of improved varieties with tolerance to these stresses. We characterized materials from different breeding programs spanning three continents to understand their sequence diversity and advance the development of molecular breeding tools. For this, 37 varieties belonging to P. vulgaris, P. acutifolius (A. Gray), and P. coccineus L. were sequenced by whole-genome sequencing, identifying more than 40 million genomic variants. Evaluation of nuclear DNA content and analysis of copy number variation revealed important differences in genomic content not only between P. vulgaris and the two other domesticated species, but also within P. vulgaris, affecting hundreds of protein-coding genomic regions. A large number of inter-gene pool introgressions were identified. Furthermore, interspecific introgressions for disease resistance in breeding lines were mapped. Evaluation of newly developed single nucleotide polymorphism markers within previously discovered quantitative trait loci for common bacterial blight and angular leaf spot provides improved specificity to tag sources of resistance to these diseases. We expect that this dataset will provide a deeper molecular understanding of breeding germplasm and deliver molecular tools for germplasm development, aiming to increase the efficiency of bean breeding programs. (Copyright © 2018 Crop Science Society of America.)
- Published
- 2018
- Full Text
- View/download PDF
46. Protocol for Robust In Vivo Measurements of Erythrocyte Aggregation Using Ultrasound Spectroscopy.
- Author
-
Garcia-Duitama J, Chayer B, Garcia D, Goussard Y, and Cloutier G
- Subjects
- Adult, Animals, Horses, Humans, Middle Aged, Models, Animal, Spectrum Analysis, Swine, Young Adult, Erythrocyte Aggregation physiology, Ultrasonography methods
- Abstract
Erythrocyte aggregation is a non-specific marker of acute and chronic inflammation. Although it is usual to evaluate this phenomenon from blood samples analyzed in laboratory instruments, in vivo real-time assessment of aggregation is possible with spectral ultrasound techniques. However, variable blood flow can affect the interpretation of acoustic measures. Therefore, flow standardization is required. Two techniques of flow standardization were evaluated with porcine and equine blood samples in Couette flow. These techniques consisted of either stopping the flow or reducing it. Then, the sensitivity and repeatability of the retained method were evaluated in 11 human volunteers. We observed that stopping the flow compromised interpretation and repeatability. Conversely, maintaining a low flow provided repeatable measures and could distinguish between normal and high extents of erythrocyte aggregation. Agreement was observed between in vivo and ex vivo measures of the phenomenon (R² = 82.7%, p value < 0.0001). These results support the feasibility of assessing in vivo erythrocyte aggregation in humans by quantitative ultrasound means. (Copyright © 2017 World Federation for Ultrasound in Medicine and Biology. Published by Elsevier Inc. All rights reserved.)
- Published
- 2017
- Full Text
- View/download PDF
47. Deep Assessment of Genomic Diversity in Cassava for Herbicide Tolerance and Starch Biosynthesis.
- Author
-
Duitama J, Kafuri L, Tello D, Leiva AM, Hofinger B, Datta S, Lentini Z, Aranzales E, Till B, and Ceballos H
- Abstract
Cassava is one of the most important food security crops in tropical countries, and a competitive resource for the starch, food, feed and ethanol industries. However, genomics research in this crop is much less developed compared to other economically important crops such as rice or maize. The International Center for Tropical Agriculture (CIAT) maintains the largest cassava germplasm collection in the world. Unfortunately, the genetic potential of this diversity for breeding programs remains underexploited due to the difficulties in phenotypic screening and lack of deep genomic information about the different accessions. A chromosome-level assembly of the cassava reference genome was released this year and only a handful of studies have been made, mainly to find quantitative trait loci (QTL) on breeding populations with limited variability. This work presents the results of pooled targeted resequencing of more than 1500 cassava accessions from the CIAT germplasm collection to obtain a dataset of more than 2000 variants within genes related to starch functional properties and herbicide tolerance. Results of twelve bioinformatic pipelines for variant detection in pooled samples were compared to ensure the quality of the variant calling process. Predictions of functional impact were performed using two separate methods to prioritize interesting variation for genotyping and cultivar selection. Targeted resequencing, either by pooled samples or by similar approaches such as Ecotilling or capture, emerges as a cost effective alternative to whole genome sequencing to identify interesting alleles of genes related to relevant traits within large germplasm collections.
- Published
- 2017
- Full Text
- View/download PDF
48. The sole introduction of two single-point mutations establishes glycerol utilization in Saccharomyces cerevisiae CEN.PK derivatives.
- Author
-
Ho PW, Swinnen S, Duitama J, and Nevoigt E
- Abstract
Background: Glycerol is an abundant by-product of biodiesel production and has several advantages as a substrate in biotechnological applications. Unfortunately, the popular production host Saccharomyces cerevisiae can barely metabolize glycerol by nature. Results: In this study, two evolved derivatives of the strain CEN.PK113-1A were created that were able to grow in synthetic glycerol medium (strains PW-1 and PW-2). Their growth performances on glycerol were compared with that of the previously published evolved CEN.PK113-7D derivative JL1. As JL1 showed a higher maximum specific growth rate on glycerol (0.164 h⁻¹ compared to 0.119 h⁻¹ for PW-1 and 0.127 h⁻¹ for PW-2), its genomic DNA was subjected to whole-genome resequencing. Two point mutations in the coding sequences of the genes UBR2 and GUT1 were identified to be crucial for growth in synthetic glycerol medium and subsequently verified by reverse engineering of the wild-type strain CEN.PK113-7D. The growth rate of the resulting reverse-engineered strain was 0.130 h⁻¹. Sanger sequencing of the GUT1 and UBR2 alleles of the above-mentioned evolved strains PW-1 and PW-2 also revealed one single-point mutation in these two genes, and both mutations were demonstrated to be also crucial and sufficient for obtaining a maximum specific growth rate on glycerol of ~0.120 h⁻¹. Conclusions: The current work confirmed the importance of UBR2 and GUT1 as targets for establishing glycerol utilization in strains of the CEN.PK family. In addition, it shows that a growth rate on glycerol of 0.130 h⁻¹ can be established in reverse-engineered CEN.PK strains by solely replacing a single amino acid in the coding sequences of both Ubr2 and Gut1.
- Published
- 2017
- Full Text
- View/download PDF
49. A Fosmid Pool-Based Next Generation Sequencing Approach to Haplotype-Resolve Whole Genomes.
- Author
-
Suk EK, Schulz S, Mentrup B, Huebsch T, Duitama J, and Hoehe MR
- Subjects
- Genome, Human genetics, Genomics, Humans, Polymorphism, Single Nucleotide genetics, Sequence Analysis, DNA, Haplotypes genetics, High-Throughput Nucleotide Sequencing methods
- Abstract
Haplotype resolution of human genomes is essential to describe and interpret genetic variation and its impact on biology and disease. Our approach to haplotyping relies on converting genomic DNA into a fosmid library, which represents the entire diploid genome as a collection of haploid DNA clones of ~40 kb in size. These can be partitioned into pools such that the probability that the same pool contains both parental haplotypes is reduced to ~1%. This is the key principle of this method, allowing entire pools of fosmids to be massively parallel sequenced, yielding haploid sequence output. Here, we present a detailed protocol for fosmid pool-based next generation sequencing to haplotype-resolve whole genomes including the following steps: (1) generation of high molecular weight DNA fragments of ~40 kb in size from genomic DNA; (2) fosmid cloning and partitioning into 96-well plates; (3) barcoded sequencing library preparation from fosmid pools for next generation sequencing; and (4) computational analysis of fosmid sequences and assembly into contiguous haploid sequences. This method can be used in combination with, but also without, whole genome shotgun sequencing to extensively resolve heterozygous SNPs and structural variants within genomic regions, resulting in haploid contigs of several hundred kb up to several Mb. This method has a broad range of applications including population and ancestry genetics, the clinical interpretation of mutations in personal genomes, the analysis of cancer genomes and highly complex disease gene regions such as MHC. Moreover, haplotype-resolved genome sequencing allows description and interpretation of the diploid nature of genome biology, for example through the analysis of haploid gene forms and allele-specific phenomena. Application of this method has enabled the production of most of the molecular haplotype-resolved genomes reported to date.
- Published
- 2017
- Full Text
- View/download PDF
50. Combining Image Analysis, Genome Wide Association Studies and Different Field Trials to Reveal Stable Genetic Regions Related to Panicle Architecture and the Number of Spikelets per Panicle in Rice.
- Author
-
Rebolledo MC, Peña AL, Duitama J, Cruz DF, Dingkuhn M, Grenier C, and Tohme J
- Abstract
Number of spikelets per panicle (NSP) is a key trait to increase yield potential in rice (O. sativa). The architecture of the rice inflorescence, which is mainly determined by the length and number of primary (PBL and PBN) and secondary (SBL and SBN) branches, can influence NSP. Although several genes controlling panicle architecture and NSP in rice have been identified, there is little evidence of (i) the genetic control of panicle architecture and NSP in different environments and (ii) the presence of stable genetic associations with panicle architecture across environments. This study combines image phenotyping of 225 accessions belonging to a genetic diversity array of indica rice grown under irrigated field condition in two different environments and Genome Wide Association Studies (GWAS) based on the genotyping of the diversity panel, providing 83,374 SNPs. Accessions sown under direct seeding in one environment had reduced Panicle Length (PL), NSP, PBN, PBL, SBN, and SBL compared to those established under transplanting in the second environment. Across environments, NSP was significantly and positively correlated with PBN, SBN and PBL. However, the length of branches (PBL and SBL) was not significantly correlated with variables related to number of branches (PBN and SBN), suggesting independent genetic control. Twenty-three GWAS sites were detected with P ≤ 1.0E-04 and 27 GWAS sites with p ≤ 5.9E-04. We found 17 GWAS sites related to NSP, 10 for PBN and 11 for SBN, 7 for PBL and 11 for SBL. This study revealed new regions related to NSP, but only three associations were related to both branching number (PBN and SBN) and NSP. Two GWAS sites associated with SBL and SBN were stable across contrasting environments and were not related to genes previously reported. The new regions reported in this study can help improve NSP in rice for both direct seeded and transplanted conditions. The integrated approach of high-throughput phenotyping, multi-environment field trials and GWAS has the potential to dissect complex traits, such as NSP, into less complex traits and to match single nucleotide polymorphisms with relevant function under different environments, offering a potential use for molecular breeding.
- Published
- 2016
- Full Text
- View/download PDF