483 results on '"Taxonomic classification"'
Search Results
2. CGRclust: Chaos Game Representation for twin contrastive clustering of unlabelled DNA sequences.
- Author
-
Alipour, Fatemeh, Hill, Kathleen A., and Kari, Lila
- Subjects
- *
CONVOLUTIONAL neural networks , *NUCLEOTIDE sequence , *IMAGE recognition (Computer vision) , *ARTIFICIAL chromosomes , *DNA sequencing - Abstract
Background: Traditional supervised learning methods applied to DNA sequence taxonomic classification rely on the labor-intensive and time-consuming step of labelling the primary DNA sequences. Additionally, standard DNA classification/clustering methods involve time-intensive multiple sequence alignments, which impacts their applicability to large genomic datasets or distantly related organisms. These limitations indicate a need for robust, efficient, and scalable unsupervised DNA sequence clustering methods that do not depend on sequence labels or alignment. Results: This study proposes CGRclust, a novel combination of unsupervised twin contrastive clustering of Chaos Game Representations (CGR) of DNA sequences, with convolutional neural networks (CNNs). To the best of our knowledge, CGRclust is the first method to use unsupervised learning for image classification (herein applied to two-dimensional CGR images) for clustering datasets of DNA sequences. CGRclust overcomes the limitations of traditional sequence classification methods by leveraging unsupervised twin contrastive learning to detect distinctive sequence patterns, without requiring DNA sequence alignment or biological/taxonomic labels. CGRclust accurately clustered twenty-five diverse datasets, with sequence lengths ranging from 664 bp to 100 kbp, including mitochondrial genomes of fish, fungi, and protists, as well as viral whole genome assemblies and synthetic DNA sequences. Compared with three recent clustering methods for DNA sequences (DeLUCS, iDeLUCS, and MeShClust v3.0.), CGRclust is the only method that surpasses 81.70% accuracy across all four taxonomic levels tested for mitochondrial DNA genomes of fish. Moreover, CGRclust also consistently demonstrates superior performance across all the viral genomic datasets. 
The high clustering accuracy of CGRclust on these twenty-five datasets, which vary significantly in terms of sequence length, number of genomes, number of clusters, and level of taxonomy, demonstrates its robustness, scalability, and versatility. Conclusion: CGRclust is a novel, scalable, alignment-free DNA sequence clustering method that uses CGR images of DNA sequences and CNNs for twin contrastive clustering of unlabelled primary DNA sequences, achieving superior or comparable accuracy and performance over current approaches. CGRclust demonstrated enhanced reliability, by consistently achieving over 80% accuracy in more than 90% of the datasets analyzed. In particular, CGRclust performed especially well in clustering viral DNA datasets, where it consistently outperformed all competing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
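The Chaos Game Representation named in the abstract above can be illustrated with a short sketch. It computes the CGR trajectory of a DNA string and bins it into a 2^k x 2^k frequency grid, the kind of two-dimensional image a CNN could consume. The corner assignment (A, C, G, T), the function names, and the choice to skip ambiguous bases are assumptions made for illustration; this is not taken from the CGRclust code.

```python
from typing import List, Tuple

# Corner assignment is a common convention, assumed here for illustration.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return the CGR trajectory: each point lies halfway between the
    previous point and the corner of the current nucleotide."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N (a simplification)
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def fcgr(seq: str, k: int) -> List[List[int]]:
    """Count CGR points in a 2^k x 2^k grid. Points from position k onward
    fall in a cell determined by the last k bases, so each cell is a k-mer count."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq)[k - 1:]:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

For a sequence of length n without ambiguous bases, the grid entries sum to n - k + 1, one per k-mer.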
3. Phylogeny and classification of Cixiidae (Hemiptera, Fulgoromorpha): A new evolutionary scenario for the most diverse planthopper family.
- Author
-
Luo, Yang, Bucher, Manon, Bourgoin, Thierry, Löcker, Birgit, and Feng, Ji‐Nian
- Subjects
- *
MOLECULAR phylogeny , *TRIBES , *HEMIPTERA , *PHYLOGENY , *FOSSILS - Abstract
The Cixiidae represent the most diverse family within Hemiptera Fulgoromorpha, accounting for nearly 20% of the described species. A molecular phylogenetic analysis of 147 taxa reveals a new evolutionary scenario for the family, identifying four major lineages: borystheninian (restricted to the Borysthenini), oecleinian and pentastirinian, grouped in one clade, sister to the cixiinian one. In the oecleinian lineage, the Oecleini are paraphyletic, including the Bothriocerini. Three groups are identified in the pentastirinian lineage: the Hyalesthes+, Pentastiridius+ and Oliarus+ clades. Within the cixiinian lineage, as traditionally recognised, the Cixiini tribe is polyphyletic, involving a basally separated Achaemenes clade, a newly described Chidaeini trib. nov., and the ‘true Cixiini’ clade, which itself remains paraphyletic, including the Semonini. The Andini tribe appears paraphyletic, including the Brixiini, and the position of the Gelastocephalini is yet to be confirmed. Despite its significance, the sampling remains incomplete, hindering, in our opinion, the formal taxonomic recognition of these lineages with formal ranks for a new classification of the Cixiidae. Fossil‐calibrated tree analysis indicates that Cixiidae originated in Lower Jurassic, approximately 181 million years ago. The four identified main lineages diverged during the Lower Jurassic in some 12 million years only, 155 million years ago. All currently recognised tribes and new major clades revealed with this study were present as early as the mid‐Cretaceous, around 100 million years ago; however, the Bennini tribe and the ‘true Cixiini’ clade emerged later, some 75 million years ago. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. A Comprehensive Metagenome Study Identifies Distinct Biological Pathways in Asthma Patients: An In-Silico Approach.
- Author
-
Rana, Samiksha, Singh, Pooja, Bhardwaj, Tulika, and Somvanshi, Pallavi
- Subjects
- *
ASTHMATICS , *REGULATORY T cells , *SECONDARY metabolism , *GUT microbiome , *BACTERIAL communities - Abstract
Asthma is a multifactorial disease with phenotypes and several clinical and pathophysiological characteristics. Besides innate and adaptive immune responses, the gut microbiome generates Treg cells, mediating the allergic response to environmental factors and exposure to allergens. Because of the complexity of asthma, microbiome analysis and other precision medicine methods are now widely regarded as essential elements of efficient disease therapy. An in-silico pipeline enables the comparative taxonomic profiling of 16S rRNA metagenomic profiles of 20 asthmatic patients and 15 healthy controls utilizing QIIME2. Further, PICRUSt supports downstream gene enrichment and pathway analysis, inferring the enriched pathways in a diseased state. A significant abundance of the phylum Proteobacteria, Sutterella, and Megamonas is identified in asthma patients, along with a diminished abundance of the genus Akkermansia. Nasal samples reveal a high relative abundance of Mycoplasma. Further, differential functional profiling identifies the metabolic pathways related to cofactors and amino acids, secondary metabolism, and signaling pathways. These findings support that a combination of bacterial communities is involved in mediating the responses involved in chronic respiratory conditions like asthma by exerting their influence on various metabolic pathways. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. SpeciateIT and vSpeciateDB: novel, fast, and accurate per sequence 16S rRNA gene taxonomic classification of vaginal microbiota
- Author
-
Johanna B. Holm, Pawel Gajer, and Jacques Ravel
- Subjects
Amplicon sequencing ,Taxonomic classification ,Vaginal microbiota ,16S rRNA gene ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background Clustering of sequences into operational taxonomic units (OTUs) and denoising methods are a mainstream stopgap to taxonomically classifying large numbers of 16S rRNA gene sequences. Environment-specific reference databases generally yield optimal taxonomic assignment. Results We developed SpeciateIT, a novel taxonomic classification tool which rapidly and accurately classifies individual amplicon sequences ( https://github.com/Ravel-Laboratory/speciateIT ). We also present vSpeciateDB, a custom reference database for the taxonomic classification of 16S rRNA gene amplicon sequences from vaginal microbiota. We show that SpeciateIT requires minimal computational resources relative to other algorithms and, when combined with vSpeciateDB, affords accurate species level classification in an environment-specific manner. Conclusions Herein, two resources with new and practical importance are described. The novel classification algorithm, SpeciateIT, is based on 7th order Markov chain models and allows for fast and accurate per-sequence taxonomic assignments (as little as 10 min for 10^7 sequences). vSpeciateDB, a meticulously tailored reference database, stands as a vital and pragmatic contribution. Its significance lies in the superiority of this environment-specific database to provide more species-resolution over its universal counterparts.
- Published
- 2024
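The abstract above states that SpeciateIT is based on 7th order Markov chain models. As a rough illustration of that general idea only, the sketch below trains one k-th order Markov chain per taxon and assigns a query to the taxon whose model gives the highest log-likelihood. The class name, the add-one smoothing, and the taxon labels are assumptions for illustration and are not taken from the SpeciateIT code.

```python
import math
from collections import defaultdict
from typing import Dict, List

ALPHABET = "ACGT"


class MarkovModel:
    """k-th order Markov chain over DNA with add-one smoothing (illustrative)."""

    def __init__(self, order: int):
        self.order = order
        # counts[context][next_base] = number of times next_base followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences: List[str]) -> "MarkovModel":
        for seq in sequences:
            for i in range(self.order, len(seq)):
                self.counts[seq[i - self.order:i]][seq[i]] += 1
        return self

    def log_likelihood(self, seq: str) -> float:
        total = 0.0
        for i in range(self.order, len(seq)):
            ctx_counts = self.counts.get(seq[i - self.order:i], {})
            num = ctx_counts.get(seq[i], 0) + 1          # add-one smoothing
            den = sum(ctx_counts.values()) + len(ALPHABET)
            total += math.log(num / den)
        return total


def classify(seq: str, models: Dict[str, MarkovModel]) -> str:
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda taxon: models[taxon].log_likelihood(seq))


# Toy usage with hypothetical taxa; order 2 is used so the tiny
# training set still covers the contexts (the abstract reports order 7).
models = {
    "taxon_a": MarkovModel(2).fit(["ACACACACAC"]),
    "taxon_b": MarkovModel(2).fit(["GGTTGGTTGGTT"]),
}
```

Each query is scored independently against every model, which is one way to obtain the per-sequence assignments the abstract describes.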
6. Leaf macro- and micromorphological traits and phenotypic diversity of Quercus petraea subspecies in Eastern Romania.
- Author
-
GAFENCO (PLEȘCA), Ioana M., APOSTOL, Ecaterina N., PLEȘCA, Bogdan I., CIOCÎRLAN, Elena, GUREAN, Dan M., and ȘOFLETEA, Neculae
- Subjects
- *
DURMAST oak , *MULTIVARIATE analysis , *NATURAL selection , *SUBSPECIES , *TRICHOMES - Abstract
Sessile oak (Quercus petraea) is a polytypic species comprising three subspecies (Q. petraea subsp. petraea – Qpe, Q. petraea subsp. dalechampii – Qda, and Q. petraea subsp. polycarpa – Qpo) with distinct ecological requirements, posing significant challenges in morphological differentiation. The integration of macro- and micro-morphological analyses plays a crucial role in clarifying the taxonomic uncertainties. This study aimed to characterize phenotypic diversity and identify key leaf descriptors for distinguishing sessile oak subspecies across three peripheral populations, one reference population, and one sessile oak comparative trial from Eastern Romania. A comprehensive analysis was conducted on 227 sampled trees, utilizing multivariate statistical analysis encompassing 18 macromorphological and 9 micromorphological leaf descriptors. The results revealed distinct traits of Qda and Qpo, including shorter leaves with maximal width in the lower half of the lamina, fewer lobes, ovate shapes, a subcordate basal shape, and a higher intercalary vein frequency compared to Qpe. Furthermore, Qpo could be differentiated from both Qpe and Qda by its shorter lamina lengths, fewer lobes, greater lobe width ratios, and stellate trichomes with shorter rays. The length of rays of stellate trichomes has emerged as a significant micromorphological descriptor. Qda predominated in peripheral populations, likely due to natural selection in drought-affected local ecosystems. This highlights the importance of prioritizing this taxon in breeding programs and conserving it in situ, given its remarkable leaf plasticity and adaptability. Additionally, principal component analysis indicated a fairly high level of morphological similarity among the three subspecies. These findings emphasize the critical importance of comprehensive morphological analyses for precise species classification and deeper understanding of sessile oak taxonomy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Analyzing the performance of short-read classification tools on metagenomic samples toward proper diagnosis of diseases.
- Author
-
Irankhah, Leili, Khorsand, Babak, Naghibzadeh, Mahmoud, and Savadi, Abdorreza
- Subjects
- *
CROHN'S disease , *INFLAMMATORY bowel diseases , *ESCHERICHIA coli , *ULCERATIVE colitis , *NUCLEOTIDE sequencing - Abstract
Accurate knowledge of the genome, virus and bacteria that have invaded our bodies is crucial for diagnosing many human diseases. The field of bioinformatics encompasses the complex computational methods required for this purpose. Metagenomics employs next-generation sequencing (NGS) technology to study and identify microbial communities in environmental samples. This technique allows for the measurement of the relative abundance of different microbes. Various tools are available for detecting bacterial species in sequenced metagenomic samples. In this study, we focus on well-known taxonomic classification tools such as MetaPhlAn4, Centrifuge, Kraken2, and Bracken, and evaluate their performance at the species level using synthetic and real datasets. The results indicate that MetaPhlAn4 exhibited high precision in identifying species in the simulated dataset, while Kraken2 had the best area under the precision-recall curve (AUPR) performance. Centrifuge, Kraken2, and Bracken showed accurate estimation of species abundances, unlike MetaPhlAn4, which had a higher L2 distance. In the real dataset analysis with samples from an inflammatory bowel disease (IBD) study, MetaPhlAn4 and Kraken2 had faster execution times, with differences in performance at family and species levels among the tools. Enterobacteriaceae and Pasteurellaceae were highlighted as the most abundant families by Centrifuge, Kraken2, and MetaPhlAn4, with variations in abundance among ulcerative colitis (UC), Crohn's disease (CD), and control non-IBD (CN) groups. Escherichia coli (E. coli) has the highest abundance among Enterobacteriaceae species in the CD and UC groups in comparison with the CN group. Bracken overestimated E. coli abundance, underscoring the need for caution when interpreting results. The findings of this research can assist in selecting the appropriate short-read classifier, thereby aiding in the diagnosis of target diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
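The L2 distance mentioned in the abstract above compares an estimated abundance profile with the true one. A minimal sketch follows, assuming profiles are given as species-to-relative-abundance mappings and that a species missing from one profile has abundance zero there; the function name and the example species labels are hypothetical, and the study's exact normalisation is not stated in the abstract.

```python
import math
from typing import Dict


def l2_distance(true_abund: Dict[str, float], est_abund: Dict[str, float]) -> float:
    """Euclidean distance between two abundance profiles.
    Species absent from a profile are treated as having abundance 0."""
    species = set(true_abund) | set(est_abund)
    return math.sqrt(
        sum((true_abund.get(s, 0.0) - est_abund.get(s, 0.0)) ** 2 for s in species)
    )


# Hypothetical profiles: the estimate misses species_b and reports species_c instead.
truth = {"species_a": 0.5, "species_b": 0.5}
estimate = {"species_a": 0.5, "species_c": 0.5}
distance = l2_distance(truth, estimate)
```

A lower value means the estimated abundances are closer to the truth; identical profiles give a distance of zero.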
8. SpeciateIT and vSpeciateDB: novel, fast, and accurate per sequence 16S rRNA gene taxonomic classification of vaginal microbiota.
- Author
-
Holm, Johanna B., Gajer, Pawel, and Ravel, Jacques
- Subjects
CLASSIFICATION algorithms ,DATABASES ,MARKOV processes ,STOPGAP solutions ,RIBOSOMAL RNA - Abstract
Background: Clustering of sequences into operational taxonomic units (OTUs) and denoising methods are a mainstream stopgap to taxonomically classifying large numbers of 16S rRNA gene sequences. Environment-specific reference databases generally yield optimal taxonomic assignment. Results: We developed SpeciateIT, a novel taxonomic classification tool which rapidly and accurately classifies individual amplicon sequences (https://github.com/Ravel-Laboratory/speciateIT). We also present vSpeciateDB, a custom reference database for the taxonomic classification of 16S rRNA gene amplicon sequences from vaginal microbiota. We show that SpeciateIT requires minimal computational resources relative to other algorithms and, when combined with vSpeciateDB, affords accurate species level classification in an environment-specific manner. Conclusions: Herein, two resources with new and practical importance are described. The novel classification algorithm, SpeciateIT, is based on 7th order Markov chain models and allows for fast and accurate per-sequence taxonomic assignments (as little as 10 min for 10^7 sequences). vSpeciateDB, a meticulously tailored reference database, stands as a vital and pragmatic contribution. Its significance lies in the superiority of this environment-specific database to provide more species-resolution over its universal counterparts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. The microwave bacteriome: biodiversity of domestic and laboratory microwave ovens.
- Author
-
Iglesias, Alba, Martínez, Lorena, Torrent, Daniel, and Porcar, Manuel
- Subjects
MICROWAVE ovens ,COLONIZATION (Ecology) ,RADIATION pressure ,BACTERIAL population ,NUCLEOTIDE sequencing - Abstract
Microwaves have become an essential part of the modern kitchen, but their potential as a reservoir for bacterial colonization and the microbial composition within them remain largely unexplored. In this study, we investigated the bacterial communities in microwave ovens and compared the microbial composition of domestic microwaves, microwaves used in shared large spaces, and laboratory microwaves, using next-generation sequencing and culturing techniques. The microwave oven bacterial population was dominated by Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes, similar to the bacterial composition of human skin. Comparison with other environments revealed that the bacterial composition of domestic microwaves was similar to that of kitchen surfaces, whereas laboratory microwaves had a higher abundance of taxa known for their ability to withstand microwave radiation, high temperatures and desiccation. These results suggest that different selective pressures, such as human contact, nutrient availability and radiation levels, may explain the differences observed between domestic and laboratory microwaves. Overall, this study provides valuable insights into microwave ovens bacterial communities and their potential biotechnological applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. UCE-based phylogenomics of the lepidopteran endoparasitoid wasp subfamily Rogadinae (Hymenoptera: Braconidae) unveils a new Neotropical tribe.
- Author
-
Shimbori, Eduardo M., Castañeda-Osorio, Rubén, Jasso-Martínez, Jovana M., Penteado-Dias, Angélica M., Gadelha, Sian S., Brady, Seán G., Quicke, Donald L. J., Kula, Robert R., and Zaldívar-Riverón, Alejandro
- Subjects
- *
NUCLEAR DNA , *NUCLEOTIDE sequence , *DNA sequencing , *TRIBES , *BRACONIDAE - Abstract
During the past two decades, the phylogenetic relationships and higher-level classification of the subfamily Rogadinae have received relevant contributions based on Sanger, mitogenome and genome-wide nuclear DNA sequence data. These studies have helped to update the circumscription and tribal classification of this subfamily, with six tribes currently recognised (Aleiodini, Betylobraconini, Clinocentrini, Rogadini, Stiropiini and Yeliconini). The tribal relationships within Rogadinae, however, are yet to be fully resolved, including the status of tribe Facitorini, previously regarded as betylobraconine, with respect to the members of Yeliconini. We conducted a phylogenomic analysis among the tribes of Rogadinae based on genomic ultraconserved element (UCE) data and extensive taxon sampling including three undescribed genera of uncertain tribal placement. Our almost fully supported estimate of phylogeny confirmed the basal position of Rogadini within the subfamily and a Facitorini clade (Yeliconini + Aleiodini) that led us to propose the former group as a valid rogadine tribe (Facitorini stat. res.). Stiropiini, however, was recovered for the first time as sister to the remaining rogadine tribes except Rogadini, and Clinocentrini as sister to a clade with Betylobraconini + the three undescribed genera. The relationships recovered and morphological examination of the material included led us to place the latter three new genera and recently described genus Gondwanocentrus within a new rogadine tribe, Gondwanocentrini Shimbori & Zaldívar-Riverón trib. nov. We described these genera (Ghibli Shimbori & Zaldívar-Riverón gen. nov., Racionais Shimbori & Zaldívar-Riverón gen. nov. and Soraya Shimbori gen. nov.) with two or three new species each (G. miyazakii Shimbori & Zaldívar-Riverón sp. nov., G. totoro Shimbori & Zaldívar-Riverón sp. nov., R. brunus Shimbori & Zaldívar-Riverón sp. nov., R. kaelejay Shimbori & Zaldívar-Riverón sp. nov., R. superstes Shimbori & Zaldívar-Riverón sp. nov., S. alencarae Shimbori sp. nov. and S. venus Shimbori & Zaldívar-Riverón sp. nov.). A new species of Facitorini, Jannya pasargadae Gadelha & Shimbori sp. nov., is also described. Our newly proposed classification expands the number of tribes and genera within Rogadinae to 8 and 66 respectively. ZooBank: Tribal relationships within the braconid subfamily Rogadinae are reconstructed based on nuclear UCE data and extensive taxon sampling. Our fully supported estimate of phylogeny and the morphological evidence led us to erect a new rogadine tribe, Gondwanocentrini Shimbori & Zaldívar-Riverón trib. nov. Facitorini was also confirmed as a separate rogadine tribe. Our updated classification expands the number of tribes and genera within Rogadinae to 8 and 66 respectively. (Image credit: Eduardo M. Shimbori.) [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. The medico-legal interpretation of diatom findings for the diagnosis of fatal drowning: a systematic review
- Author
-
Tyr, Alexander, Lunetta, Philippe, Zilg, Brita, Winskog, Carl, and Heldring, Nina
- Published
- 2025
12. Impact of database choice and confidence score on the performance of taxonomic classification using Kraken2
- Author
-
Liu, Yunlong, Ghaffari, Morteza H., Ma, Tao, and Tu, Yan
- Published
- 2024
13. Citrus Greek National Germplasm Collection: a genetic diversity survey using nuclear and chloroplast microsatellite markers
- Author
-
Tourvas, Nikolaos, Boutsika, Anastasia, Michailidis, Michail, Bazakos, Christos, Mellidou, Ifigeneia, Sarrou, Eirini, Polychroniadou, Chrysanthi, Lyrou, Fani, Kotina, Vasiliki-Maria, Xanthopoulou, Aliki, Molassiotis, Athanassios, Ziogas, Vasileios, Aravanopoulos, Filippos, and Ganopoulos, Ioannis
- Published
- 2024
14. PROTAX-GPU: a scalable probabilistic taxonomic classification system for DNA barcodes.
- Author
-
Li, Roy, Ratnasingham, Sujeevan, Zarubiieva, Iuliia, Somervuo, Panu, and Taylor, Graham W.
- Subjects
- *
GRAPHICS processing units , *BAR codes , *BIOLOGICAL specimens , *CENTRAL processing units , *BIODIVERSITY monitoring , *GENETIC barcoding , *CLASSIFICATION - Abstract
DNA-based identification is vital for classifying biological specimens, yet methods to quantify the uncertainty of sequence-based taxonomic assignments are scarce. Challenges arise from noisy reference databases, including mislabelled entries and missing taxa. PROTAX addresses these issues with a probabilistic approach to taxonomic classification, advancing on methods that rely solely on sequence similarity. It provides calibrated probabilistic assignments to a partially populated taxonomic hierarchy, accounting for taxa that lack references and incorrect taxonomic annotation. While effective on smaller scales, global application of PROTAX necessitates substantially larger reference libraries, a goal previously hindered by computational barriers. We introduce PROTAX-GPU, a scalable algorithm capable of leveraging the global Barcode of Life Data System (>14 million specimens) as a reference database. Using graphics processing units (GPU) to accelerate similarity and nearest-neighbour operations and the JAX library for Python integration, we achieve over a 1000 × speedup compared with the central processing unit (CPU)-based implementation without compromising PROTAX's key benefits. PROTAX-GPU marks a significant stride towards real-time DNA barcoding, enabling quicker and more efficient species identification in environmental assessments. This capability opens up new avenues for real-time monitoring and analysis of biodiversity, advancing our ability to understand and respond to ecological dynamics. This article is part of the theme issue 'Towards a toolkit for global insect biodiversity monitoring'. [ABSTRACT FROM AUTHOR]
- Published
- 2024
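The abstract above says PROTAX-GPU accelerates similarity and nearest-neighbour operations against a reference library. The plain-Python sketch below shows only what such an operation computes, on the CPU and without JAX: a pairwise similarity between a query and each reference, followed by selection of the k most similar references. The similarity definition (fraction of matching positions in pre-aligned sequences, ignoring gap columns), the function names, and the reference labels are assumptions for illustration, not the PROTAX-GPU implementation.

```python
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Fraction of matching positions among columns where neither
    pre-aligned sequence has a gap ('-')."""
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def top_k_references(
    query: str, references: List[Tuple[str, str]], k: int
) -> List[Tuple[str, float]]:
    """Return the k (label, similarity) pairs most similar to the query."""
    scored = [(label, similarity(query, seq)) for label, seq in references]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical, pre-aligned toy reference library.
references = [("sp1", "ACGTACGT"), ("sp2", "ACGTTTTT"), ("sp3", "TTTTTTTT")]
nearest = top_k_references("ACGTACGA", references, k=2)
```

Because every query-reference comparison is independent, this step is naturally parallel, which is consistent with the abstract's report that it benefits from GPU execution.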
15. Application and Comparison of Machine Learning and Database-Based Methods in Taxonomic Classification of High-Throughput Sequencing Data.
- Author
-
Tian, Qinzhong, Zhang, Pinglu, Zhai, Yixiao, Wang, Yansu, and Zou, Quan
- Subjects
- *
NUCLEOTIDE sequencing , *TECHNOLOGICAL innovations , *CLASSIFICATION , *DEVELOPMENTAL biology , *DATABASES , *SYNTHETIC biology - Abstract
The advent of high-throughput sequencing technologies has not only revolutionized the field of bioinformatics but has also heightened the demand for efficient taxonomic classification. Despite technological advancements, efficiently processing and analyzing the deluge of sequencing data for precise taxonomic classification remains a formidable challenge. Existing classification approaches primarily fall into two categories, database-based methods and machine learning methods, each presenting its own set of challenges and advantages. On this basis, the aim of our study was to conduct a comparative analysis between these two methods while also investigating the merits of integrating multiple database-based methods. Through an in-depth comparative study, we evaluated the performance of both methodological categories in taxonomic classification by utilizing simulated data sets. Our analysis revealed that database-based methods excel in classification accuracy when backed by a rich and comprehensive reference database. Conversely, while machine learning methods show superior performance in scenarios where reference sequences are sparse or lacking, they generally show inferior performance compared with database methods under most conditions. Moreover, our study confirms that integrating multiple database-based methods does, in fact, enhance classification accuracy. These findings shed new light on the taxonomic classification of high-throughput sequencing data and bear substantial implications for the future development of computational biology. For those interested in further exploring our methods, the source code of this study is publicly available on https://github.com/LoadStar822/Genome-Classifier-Performance-Evaluator. Additionally, a dedicated webpage showcasing our collected database, data sets, and various classification software can be found at http://lab.malab.cn/~tqz/project/taxonomic/. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Improving Bacterial Metagenomic Research through Long-Read Sequencing.
- Author
-
Greenman, Noah, Hassouneh, Sayf Al-Deen, Abdelli, Latifa S., Johnston, Catherine, and Azarian, Taj
- Subjects
METAGENOMICS ,MICROBIAL communities ,PRINCIPAL components analysis ,ANIMAL droppings - Abstract
Metagenomic sequencing analysis is central to investigating microbial communities in clinical and environmental studies. Short-read sequencing remains the primary approach for metagenomic research; however, long-read sequencing may offer advantages of improved metagenomic assembly and resolved taxonomic identification. To compare the relative performance for metagenomic studies, we simulated short- and long-read datasets using increasingly complex metagenomes comprising 10, 20, and 50 microbial taxa. Additionally, we used an empirical dataset of paired short- and long-read data generated from mouse fecal pellets to assess real-world performance. We compared metagenomic assembly quality, taxonomic classification, and metagenome-assembled genome (MAG) recovery rates. We show that long-read sequencing data significantly improve taxonomic classification and assembly quality. Metagenomic assemblies using simulated long reads were more complete and more contiguous with higher rates of MAG recovery. This resulted in more precise taxonomic classifications. Principal component analysis of empirical data demonstrated that sequencing technology affects compositional results as samples clustered by sequence type, not sample type. Overall, we highlight strengths of long-read metagenomic sequencing for microbiome studies, including improving the accuracy of classification and relative abundance estimates. These results will aid researchers when considering which sequencing approaches to use for metagenomic projects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Leveraging Large Image-Caption Datasets for Multimodal Taxon Classification
- Author
-
Chavez, Raynor Kirkson E., Reynoso, Kyle Gabriel M., Raquel, Carlo R., and Naval, Prospero C., Jr.
- Published
- 2024
18. Memory-Bound and Taxonomy-Aware K-Mer Selection for Ultra-Large Reference Libraries
- Author
-
Şapcı, Ali Osman Berk and Mirarab, Siavash
- Published
- 2024
19. The microwave bacteriome: biodiversity of domestic and laboratory microwave ovens
- Author
-
Alba Iglesias, Lorena Martínez, Daniel Torrent, and Manuel Porcar
- Subjects
microwave ,16S rRNA gene sequencing ,taxonomic classification ,radiation ,desiccation ,selective pressure ,Microbiology ,QR1-502 - Abstract
Microwaves have become an essential part of the modern kitchen, but their potential as a reservoir for bacterial colonization and the microbial composition within them remain largely unexplored. In this study, we investigated the bacterial communities in microwave ovens and compared the microbial composition of domestic microwaves, microwaves used in shared large spaces, and laboratory microwaves, using next-generation sequencing and culturing techniques. The microwave oven bacterial population was dominated by Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes, similar to the bacterial composition of human skin. Comparison with other environments revealed that the bacterial composition of domestic microwaves was similar to that of kitchen surfaces, whereas laboratory microwaves had a higher abundance of taxa known for their ability to withstand microwave radiation, high temperatures and desiccation. These results suggest that different selective pressures, such as human contact, nutrient availability and radiation levels, may explain the differences observed between domestic and laboratory microwaves. Overall, this study provides valuable insights into microwave ovens bacterial communities and their potential biotechnological applications.
- Published
- 2024
20. MetageNN: a memory-efficient neural network taxonomic classifier robust to sequencing errors and missing genomes
- Author
-
Rafael Peres da Silva, Chayaporn Suphavilai, and Niranjan Nagarajan
- Subjects
Taxonomic classification ,Machine learning ,Metagenomics ,Long-read ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background With the rapid increase in throughput of long-read sequencing technologies, recent studies have explored their potential for taxonomic classification by using alignment-based approaches to reduce the impact of higher sequencing error rates. While alignment-based methods are generally slower, k-mer-based taxonomic classifiers can overcome this limitation, potentially at the expense of lower sensitivity for strains and species that are not in the database. Results We present MetageNN, a memory-efficient long-read taxonomic classifier that is robust to sequencing errors and missing genomes. MetageNN is a neural network model that uses short k-mer profiles of sequences to reduce the impact of distribution shifts on error-prone long reads. Benchmarking MetageNN against other machine learning approaches for taxonomic classification (GeNet) showed substantial improvements with long-read data (20% improvement in F1 score). By utilizing nanopore sequencing data, MetageNN exhibits improved sensitivity in situations where the reference database is incomplete. It surpasses the alignment-based MetaMaps and MEGAN-LR, as well as the k-mer-based Kraken2 tools, with improvements of 100%, 36%, and 23% respectively at the read-level analysis. Notably, at the community level, MetageNN consistently demonstrated higher sensitivities than the previously mentioned tools. Furthermore, MetageNN is 7× faster than MetaMaps and GeNet and > 2× faster than MEGAN-LR and MMseqs2. Conclusion This proof of concept work demonstrates the utility of machine-learning-based methods for taxonomic classification using long reads. MetageNN can be used on sequences not classified by conventional methods and offers an alternative approach for memory-efficient classifiers that can be optimized further.
- Published
- 2024
- Full Text
- View/download PDF
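The abstract of record 20 describes feeding "short k-mer profiles of sequences" to a neural network. As a rough illustration of what such a profile might look like, the sketch below counts overlapping k-mers in a read and normalizes them to frequencies. The function name `kmer_profile`, the default `k=3`, and the choice to skip windows containing non-ACGT characters are assumptions made for this example; the abstract does not state MetageNN's actual k, normalization, or preprocessing.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    The output vector has 4**k entries in lexicographic k-mer order.
    Windows containing characters other than A, C, G, T are ignored
    (an assumption of this sketch, not a documented MetageNN rule).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made purely of A/C/G/T contribute to the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


profile = kmer_profile("ACGTACGT", k=2)
print(len(profile), round(sum(profile), 6))
```

A fixed-length vector like this can be passed to any standard classifier regardless of read length, which is one reason k-mer profiles are a common input for alignment-free methods.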
21. Professionalization of the public health workforce: scoping review and call to action.
- Author
-
Czabanowska, Katarzyna, Feria, Pablo Rodriguez, Kuhlmann, Ellen, Kostoulas, Polychronis, Middleton, John, Magana, Laura, Sutton, Gabriella, Goodman, Julien, Burazeri, Genc, Aleksandrova, Olga, and Piven, Natalia
- Subjects
*ONLINE information services , *SYSTEMATIC reviews , *PUBLIC health , *LABOR supply , *PROFESSIONALISM , *LITERATURE reviews , *MEDLINE , *ERIC (Information retrieval system) - Abstract
Background The 'WHO-ASPHER Roadmap to Professionalizing the Public Health Workforce in the European Region' provides recommendations for strategic and systematic workforce planning around professionalization levers including: (i) competencies, (ii) training and education, (iii) formal organization, (iv) professional credentialing and (v) code of ethics and professional conduct as well as taxonomy and enumeration. It was based on a literature review till 2016. This scoping review aims to explore how the professionalization was documented in the literature between 2016 and 2022. Methods Following the Joanna Briggs Institute guidelines, we searched Medline via PubMed, Web of Science, ERIC via EBSCO and Google Scholar and included studies on professionalization levers. Four critical appraisal tools were used to assess qualitative, quantitative, mixed methods studies and grey literature. The PRISMA Extension for Scoping Reviews (PRISMA-ScR) was used for reporting. Results Eleven articles included in this review spanned 61 countries, targeting undergraduate, master's, doctoral degrees and continuing professional development. Most of these documents were reviews. About half provided a definition of the public health workforce; more than half covered the taxonomy and included information about competences, but the use of frameworks was sporadic and inconsistent. Formal organization and the necessity of a code of conduct for the public health workforce were acknowledged in only two studies. Conclusions In spite of some efforts to professionalize the public health workforce, this process is fragmented and not fully recognized and supported. There is an urgent need to engage policymakers and stakeholders to prioritize investments in strengthening the public health workforce worldwide. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Assessing the performance of short 18S rDNA markers for environmental DNA metabarcoding of marine protists
- Author
-
Heike H. Zimmermann, Sara Harðardóttir, and Sofia Ribeiro
- Subjects
biodiversity ,ecoPCR ,microbial eukaryotes ,molecular primers ,SSU ,taxonomic classification ,Environmental sciences ,GE1-350 ,Microbial ecology ,QR100-130 - Abstract
Abstract Marine protists are globally distributed and sensitive to environmental conditions, which makes them a focal group when studying the effects of climate change on biodiversity and ocean health. However, they are a highly diverse group with varying evolutionary histories and morphologies and widely variable preservation potential in the fossil record. Thus, their past diversity and composition are poorly known. Paleogenetics, which relies, among other approaches, on DNA metabarcoding of sedimentary ancient DNA (sedaDNA), provides a promising avenue to explore the past history and responses of marine protists to global change. Choosing the right marker for sedaDNA studies is critical, striking a balance between marker length and taxonomic resolution. While marker guides exist for modern environmental DNA surveys, a thorough assessment of existing short markers for sedaDNA studies targeting protists is lacking. In this study, we report on a comparison of in silico PCR for eight short 18S rDNA markers, including one from the Tara Oceans initiative and a longer marker commonly used in modern marine eDNA studies. We analyze their taxonomic coverage and resolution, taxonomic overlap and uniqueness between markers, co‐amplification of non‐protist taxa, and amplicon size differences across taxonomic groups. Additionally, we provide a detailed analysis of diatoms, dinoflagellates, haptophytes, and chlorophytes. Our study is aimed at supporting project‐specific marker choices for characterizing protist composition and diversity. While we focus on marine protists, our results are applicable to other aquatic and terrestrial environments.
- Published
- 2024
- Full Text
- View/download PDF
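Record 22 compares markers by in silico PCR. The sketch below shows the basic idea under simplifying assumptions: a forward primer is matched on the template strand, the reverse primer is matched via its reverse complement, and the region between them (primers included) is reported as an amplicon. All function names, the Hamming-distance mismatch rule, and the `max_len` cutoff are hypothetical choices for illustration; this is not ecoPCR (listed in the record's subjects) and does not reproduce its options or handling of degenerate bases.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def mismatches(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(template, primer, max_mismatch):
    """Start positions where primer matches within max_mismatch."""
    n = len(primer)
    return [i for i in range(len(template) - n + 1)
            if mismatches(template[i:i + n], primer) <= max_mismatch]


def in_silico_pcr(template, fwd, rev, max_mismatch=0, max_len=500):
    """Return amplicons (primers included) for each valid primer pair."""
    rev_rc = reverse_complement(rev)  # reverse primer as seen on the template strand
    amplicons = []
    for start in find_sites(template, fwd, max_mismatch):
        for r in find_sites(template, rev_rc, max_mismatch):
            end = r + len(rev_rc)
            # The reverse site must lie downstream of the forward primer.
            if r >= start + len(fwd) and end - start <= max_len:
                amplicons.append(template[start:end])
    return amplicons


template = "TTTACGTAAAAAGGCATTT"
print(in_silico_pcr(template, fwd="ACGT", rev="TGCC"))
```

Running this over a reference database and tallying which taxa yield an amplicon, and how long it is, gives the kind of coverage and amplicon-size comparison the abstract describes.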
23. Partial Genome Characterization of Novel Parapoxvirus in Horse, Finland
- Author
-
Jenni Virtanen, Maria Hautaniemi, Lara Dutra, Ilya Plyusnin, Katja Hautala, Teemu Smura, Olli Vapalahti, Tarja Sironen, Ravi Kant, and Paula M. Kinnunen
- Subjects
parapoxvirus ,viruses ,zoonoses ,taxonomic classification ,high-throughput nucleotide sequencing ,horses ,Medicine ,Infectious and parasitic diseases ,RC109-216 - Abstract
We report a sequencing protocol and 121-kb poxvirus sequence from a clinical sample from a horse in Finland with dermatitis. Based on phylogenetic analyses, the virus is a novel parapoxvirus associated with a recent epidemic; previous data suggest zoonotic potential. Increased awareness of this virus and specific diagnostic protocols are needed.
- Published
- 2023
- Full Text
- View/download PDF
24. Species-level resolution for the vaginal microbiota with short amplicons
- Author
-
Wei Qing, Yiya Shi, Rongdan Chen, Yin'ai Zou, Cancan Qi, Yingxuan Zhang, Zuyi Zhou, Shanshan Li, Yi Hou, Hongwei Zhou, and Muxuan Chen
- Subjects
vaginal microbiota ,16S rRNA gene sequencing ,taxonomic classification ,species-level resolution ,Microbiology ,QR1-502 - Abstract
ABSTRACT Specific bacterial species have been found to play important roles in the human vagina. Achieving high species-level resolution is vital for analyzing vaginal microbiota data. However, contradictory conclusions were yielded from different methodological studies. More comprehensive evaluation is needed for determining an optimal pipeline for vaginal microbiota. Based on the sequences of vaginal bacterial species downloaded from NCBI, we conducted simulated amplification with various primer sets targeting different 16S regions as well as taxonomic classification on the amplicons applying different combinations of algorithms (BLAST+, VSEARCH, and Sklearn) and reference databases (Greengenes2, SILVA, and RDP). Vaginal swabs were collected from participants with different vaginal microecology to construct 16S full-length sequenced mock communities. Both computational and experimental amplifications were performed on the mock samples. Classification accuracy of each pipeline was determined. Microbial profiles were compared between the full-length and partial 16S sequencing samples. The optimal pipeline was further validated in a multicenter cohort against the PCR results of common STI pathogens. Pipeline V1–V3_Sklearn_Combined had the highest accuracy for classifying the amplicons generated from both the NCBI downloaded data (84.20% ± 2.39%) and the full-length sequencing data (95.65% ± 3.04%). Vaginal samples amplified and sequenced targeting the V1–V3 region but merely employing the forward reads (223 bp) and classified using the optimal pipeline, resembled the mock communities the most. The pipeline demonstrated high F1-scores for detecting STI pathogens within the validation cohort. We have determined an optimal pipeline to achieve high species-level resolution for vaginal microbiota with short amplicons, which will facilitate future studies. IMPORTANCE For vaginal microbiota studies, diverse 16S rRNA gene regions were applied for amplification and sequencing, which affect the comparability between different studies as well as the species-level resolution of taxonomic classification. We conducted comprehensive evaluation on the methods which influence the accuracy for the taxonomic classification and established an optimal pipeline to achieve high species-level resolution for vaginal microbiota with short amplicons, which will facilitate future studies.
- Published
- 2024
- Full Text
- View/download PDF
25. The impact of transitive annotation on the training of taxonomic classifiers.
- Author
-
Muralidharan, Harihara Subrahmaniam, Fox, Noam Y., and Mihai Pop
- Subjects
MACHINE learning ,BIOLOGICAL databases ,DATABASES ,ANNOTATIONS ,TASK analysis - Abstract
Introduction: A common task in the analysis of microbial communities involves assigning taxonomic labels to the sequences derived from organisms found in the communities. Frequently, such labels are assigned using machine learning algorithms that are trained to recognize individual taxonomic groups based on training data sets that comprise sequences with known taxonomic labels. Ideally, the training data should rely on labels that are experimentally verified--formal taxonomic labels require knowledge of physical and biochemical properties of organisms that cannot be directly inferred from sequence alone. However, the labels associated with sequences in biological databases are most commonly computational predictions which themselves may rely on computationally-generated data--a process commonly referred to as "transitive annotation." Methods: In this manuscript we explore the implications of training a machine learning classifier (the Ribosomal Database Project's Bayesian classifier in our case) on data that itself has been computationally generated. We generate new training examples based on 16S rRNA data from a metagenomic experiment, and evaluate the extent to which the taxonomic labels predicted by the classifier change after re-training. Results: We demonstrate that even a few computationally-generated training data points can significantly skew the output of the classifier to the point where entire regions of the taxonomic space can be disturbed. Discussion and conclusions: We conclude with a discussion of key factors that affect the resilience of classifiers to transitively-annotated training data, and propose best practices to avoid the artifacts described in our paper. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Pangenome databases improve host removal and mycobacteria classification from clinical metagenomic data.
- Author
-
Hall, Michael B and Coin, Lachlan J M
- Subjects
*RESOURCE-limited settings , *HUMAN DNA , *HUMAN genome , *PAN-genome , *DATABASES - Abstract
Background Culture-free real-time sequencing of clinical metagenomic samples promises both rapid pathogen detection and antimicrobial resistance profiling. However, this approach introduces the risk of patient DNA leakage. To mitigate this risk, we need near-comprehensive removal of human DNA sequences at the point of sequencing, typically involving the use of resource-constrained devices. Existing benchmarks have largely focused on the use of standardized databases and largely ignored the computational requirements of depletion pipelines as well as the impact of human genome diversity. Results We benchmarked host removal pipelines on simulated and artificial real Illumina and Nanopore metagenomic samples. We found that construction of a custom kraken database containing diverse human genomes results in the best balance of accuracy and computational resource usage. In addition, we benchmarked pipelines using kraken and minimap2 for taxonomic classification of Mycobacterium reads using standard and custom databases. With a database representative of the Mycobacterium genus, both tools obtained improved specificity and sensitivity, compared to the standard databases for classification of Mycobacterium tuberculosis. Computational efficiency of these custom databases was superior to most standard approaches, allowing them to be executed on a laptop device. Conclusions Customized pangenome databases provide the best balance of accuracy and computational efficiency when compared to standard databases for the task of human read removal and M. tuberculosis read classification from metagenomic samples. Such databases allow for execution on a laptop, without sacrificing accuracy, an especially important consideration in low-resource settings. We make all customized databases and pipelines freely available. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Classifying the bacterial taxonomy with its metagenomic data using the deep neural network model.
- Author
-
Raman, Ramakrishnan, Barve, Amit, Meenakshi, R., Jayaseelan, G.M., Ganeshan, P., Taqui, Syed Noeman, Almoallim, Hesham S., Alharbi, Sulaiman Ali, and Raghavan, S.S.
- Subjects
*DEEP learning , *CONVOLUTIONAL neural networks , *BACTERIA classification , *SHOTGUN sequencing , *METAGENOMICS , *FEATURE extraction - Abstract
Because the two sequencing methods, shotgun (SG) and amplicon (AMP), are used in different ways, we present a deep learning methodology for taxonomic classification of metagenomic data that can be used with either. To test the proposed pipeline, 1000 full-length 16S sequences were used to generate either SG or AMP short reads. A k-mer model was then used to map the sequences into a numerical space as matrices. Our analysis of the existing approaches revealed several drawbacks, including limited ability to handle complex hierarchical representations of data and suboptimal feature extraction from grid-like structures. To overcome these limitations, we introduce DBNs for feature learning and dimensionality reduction, and CNNs for efficient processing of grid-like metagenomic data. Finally, a trained model for every taxon was obtained by training two distinct deep learning architectures, a deep belief network (DBN) and a convolutional neural network (CNN). We tested the proposed methodology to determine the best parameter configuration and compared the results with the classification performance of the RDP classifier, a standard classifier for bacterial identification. Both architectures outperform the RDP classifier at every taxonomic level. At the genus level, for example, both CNN and DBN achieved 91.4% accuracy using AMP short reads, whereas the RDP classifier achieved 83.9% on the same data. This paper proposes a classification method for 16S short-read sequences based on k-mer representations and a deep learning architecture, in which a classification model is built for every taxon. The experimental findings validate the proposed pipeline as a realistic strategy for classifying bacterial samples; as a result, the technique could be included in the most commonly used tools for metagenomic research. According to the results, it can be used to classify either SG or AMP data effectively. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. Whole Genome Sequencing Based Taxonomic Classification, and Comparative Genomic Analysis of Potentially Human Pathogenic Enterobacter spp. Isolated from Chlorinated Wastewater in the North West Province, South Africa.
- Author
-
Maguvu, Tawanda and Bezuidenhout, Cornelius
- Subjects
Enterobacter ,comparative genomics ,pan-genome analysis ,taxonomic classification - Abstract
Comparative genomics, in particular, pan-genome analysis, provides an in-depth understanding of the genetic variability and dynamics of a bacterial species. Coupled with whole-genome-based taxonomic analysis, these approaches can help to provide comprehensive, detailed insights into a bacterial species. Here, we report whole-genome-based taxonomic classification and comparative genomic analysis of potential human pathogenic Enterobacter hormaechei subsp. hoffmannii isolated from chlorinated wastewater. Genome Blast Distance Phylogeny (GBDP), digital DNA-DNA hybridization (dDDH), and average nucleotide identity (ANI) confirmed the identity of the isolates. The algorithm PathogenFinder predicted the isolates to be human pathogens with a probability of greater than 0.78. The potential pathogenic nature of the isolates was supported by the presence of biosynthetic gene clusters (BGCs), aerobactin, and aryl polyenes (APEs), which are known to be associated with pathogenic/virulent strains. Moreover, analysis of the genome sequences of the isolates reflected the presence of an arsenal of virulence factors and antibiotic resistance genes that augment the predictions of the algorithm PathogenFinder. The study comprehensively elucidated the genomic features of pathogenic Enterobacter isolates from wastewaters, highlighting the role of wastewaters in the dissemination of pathogenic microbes, and the need for monitoring the effectiveness of the wastewater treatment process.
- Published
- 2021
29. CONSULT-II: Taxonomic Identification Using Locality Sensitive Hashing
- Author
-
Şapcı, Ali Osman Berk, Rachtman, Eleonora, Mirarab, Siavash, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Jahn, Katharina, editor, and Vinař, Tomáš, editor
- Published
- 2023
- Full Text
- View/download PDF
30. Food to Medicine: The Impact of Soil and Climatic Factors on the Phytochemical Property of Anahaw (Saribus rotundifolius (Lam.) Blume) Shoot
- Author
-
Bucao, Dionisio S., Bucao, Xenia Elika N., Ramamoorthy, Siva, editor, Buot Jr., Inocencio E, editor, and Rajasekaran, C, editor
- Published
- 2023
- Full Text
- View/download PDF
31. Assigning Taxonomy, Building Phylogenetic Tree
- Author
-
Xia, Yinglin, Sun, Jun, Xia, Yinglin, and Sun, Jun
- Published
- 2023
- Full Text
- View/download PDF
32. Multi-locus phylogeny of the catfish genus Ictalurus Rafinesque, 1820 (Actinopterygii, Siluriformes) and its systematic and evolutionary implications
- Author
-
Rodolfo Pérez-Rodríguez, Omar Domínguez-Domínguez, Carlos Pedraza-Lara, Rogelio Rosas-Valdez, Gerardo Pérez-Ponce de León, Ana Berenice García-Andrade, and Ignacio Doadrio
- Subjects
Evolution ,Freshwater fishes ,Ictaluridae ,North America ,Taxonomic classification ,Ecology ,QH540-549.5 ,QH359-425 - Abstract
Abstract Background Ictalurus is one of the most representative groups of North American freshwater fishes. Although this group has a well-studied fossil record and has been the subject of several morphological and molecular phylogenetic studies, incomplete taxonomic sampling and insufficient taxonomic studies have produced a rather complex classification, along with intricate patterns of evolutionary history in the genus that are considered unresolved and remain under debate. Results Based on four loci and the most comprehensive taxonomic sampling analyzed to date, including currently recognized species, previously synonymized species, undescribed taxa, and poorly studied populations, this study produced a resolved phylogenetic framework that provided plausible species delimitation and an evolutionary time framework for the genus Ictalurus. Conclusions Our phylogenetic hypothesis revealed that Ictalurus comprises at least 13 evolutionary units, partially corroborating the current classification and identifying populations that emerge as putative undescribed taxa. The divergence times of the species indicate that the diversification of Ictalurus dates to the early Oligocene, confirming its status as one of the oldest genera within the family Ictaluridae.
- Published
- 2023
- Full Text
- View/download PDF
33. Improving Bacterial Metagenomic Research through Long-Read Sequencing
- Author
-
Noah Greenman, Sayf Al-Deen Hassouneh, Latifa S. Abdelli, Catherine Johnston, and Taj Azarian
- Subjects
metagenomics ,shotgun metagenomic sequencing ,next-generation sequencing ,third-generation sequencing ,taxonomic classification ,metagenome-assembled genomes ,Biology (General) ,QH301-705.5 - Abstract
Metagenomic sequencing analysis is central to investigating microbial communities in clinical and environmental studies. Short-read sequencing remains the primary approach for metagenomic research; however, long-read sequencing may offer advantages of improved metagenomic assembly and resolved taxonomic identification. To compare the relative performance for metagenomic studies, we simulated short- and long-read datasets using increasingly complex metagenomes comprising 10, 20, and 50 microbial taxa. Additionally, we used an empirical dataset of paired short- and long-read data generated from mouse fecal pellets to assess real-world performance. We compared metagenomic assembly quality, taxonomic classification, and metagenome-assembled genome (MAG) recovery rates. We show that long-read sequencing data significantly improve taxonomic classification and assembly quality. Metagenomic assemblies using simulated long reads were more complete and more contiguous with higher rates of MAG recovery. This resulted in more precise taxonomic classifications. Principal component analysis of empirical data demonstrated that sequencing technology affects compositional results as samples clustered by sequence type, not sample type. Overall, we highlight strengths of long-read metagenomic sequencing for microbiome studies, including improving the accuracy of classification and relative abundance estimates. These results will aid researchers when considering which sequencing approaches to use for metagenomic projects.
- Published
- 2024
- Full Text
- View/download PDF
34. The impact of transitive annotation on the training of taxonomic classifiers
- Author
-
Harihara Subrahmaniam Muralidharan, Noam Y. Fox, and Mihai Pop
- Subjects
transitive annotation ,taxonomic classification ,naïve Bayes classifier ,RDP classifier ,data poisoning ,error percolation ,Microbiology ,QR1-502 - Abstract
Introduction: A common task in the analysis of microbial communities involves assigning taxonomic labels to the sequences derived from organisms found in the communities. Frequently, such labels are assigned using machine learning algorithms that are trained to recognize individual taxonomic groups based on training data sets that comprise sequences with known taxonomic labels. Ideally, the training data should rely on labels that are experimentally verified—formal taxonomic labels require knowledge of physical and biochemical properties of organisms that cannot be directly inferred from sequence alone. However, the labels associated with sequences in biological databases are most commonly computational predictions which themselves may rely on computationally-generated data—a process commonly referred to as “transitive annotation.” Methods: In this manuscript we explore the implications of training a machine learning classifier (the Ribosomal Database Project’s Bayesian classifier in our case) on data that itself has been computationally generated. We generate new training examples based on 16S rRNA data from a metagenomic experiment, and evaluate the extent to which the taxonomic labels predicted by the classifier change after re-training. Results: We demonstrate that even a few computationally-generated training data points can significantly skew the output of the classifier to the point where entire regions of the taxonomic space can be disturbed. Discussion and conclusions: We conclude with a discussion of key factors that affect the resilience of classifiers to transitively-annotated training data, and propose best practices to avoid the artifacts described in our paper.
- Published
- 2024
- Full Text
- View/download PDF
35. Genetic characteristics and integration specificity of Salmonella enterica temperate phages.
- Author
-
Siqi Sun and Xianglilan Zhang
- Subjects
SALMONELLA enterica ,BACTERIOPHAGES ,HORIZONTAL gene transfer ,BACTERIAL mutation ,BACTERIAL genomes ,BACTERIAL evolution - Abstract
Introduction: Temperate phages can engage in the horizontal transfer of functional genes to their bacterial hosts. Thus, their genetic material becomes an intimate part of bacterial genomes and plays essential roles in bacterial mutation and evolution. Specifically, temperate phages can naturally transmit genes by integrating their genomes into the bacterial host genomes via integrases. Our previous study showed that Salmonella enterica contains the largest number of temperate phages among all publicly available bacterial species. S. enterica is an important pathogen that can cause serious systemic infections and even fatalities. Methods: Initially, we extracted all S. enterica temperate phages from the extensively developed temperate phage database established in our previous study. Subsequently, we conducted an in-depth analysis of the genetic characteristics and integration specificity exhibited by these S. enterica temperate phages. Results: Here we identified 8,777 S. enterica temperate phages, all of which have integrases in their genomes. We found 491 non-redundant S. enterica temperate phage integrases (integrase entries). S. enterica temperate phage integrases were classified into three types: intA, intS, and phiRv2. Correlation analysis showed that the sequence lengths of S. enterica integrase and core regions of attB and attP were strongly correlated. Further phylogenetic analysis and taxonomic classification indicated that both the S. enterica temperate phage genomes and the integrase gene sequences were of high diversities. Discussion: Our work provides insight into the essential integration specificity and genetic diversity of S. enterica temperate phages. This study paves the way for a better understanding of the interactions between phages and S. enterica. By analyzing a large number of S. enterica temperate phages and their integrases, we provide valuable insights into the genetic diversity and prevalence of these elements. 
This knowledge has important implications for developing targeted therapeutic interventions, such as phage therapy, to combat S. enterica infections. By harnessing the lytic capabilities of temperate phages, they can be engineered or utilized in phage cocktails to specifically target and eradicate S. enterica strains, offering an alternative or complementary approach to traditional antibiotic treatments. Our study has implications for public health and holds potential significance in combating clinical infections caused by S. enterica. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Characterization and suitability assessment of soils in rain forest zone southwest Nigeria for cassava, maize and rice production using parametric method.
- Author
-
FAWOLE, Olakunle A., OJETADE, Julius O., and MUDA, Sikiru A.
- Subjects
RAIN forests ,CORN ,ARABLE land ,FLUVISOLS ,AGRICULTURAL productivity - Abstract
Copyright of Harran Journal of Agricultural & Food Science is the property of Harran University, Faculty of Agriculture and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
37. Latest Trends in Industrial Vinegar Production and the Role of Acetic Acid Bacteria: Classification, Metabolism, and Applications—A Comprehensive Review.
- Author
-
Román-Camacho, Juan J., García-García, Isidoro, Santos-Dueñas, Inés M., García-Martínez, Teresa, and Mauricio, Juan C.
- Subjects
ACETOBACTER ,BACTERIA classification ,VINEGAR ,ACETIC acid ,ETHANOL ,METABOLISM ,FERMENTED foods - Abstract
Vinegar is one of the most appreciated fermented foods in European and Asian countries. In industry, its elaboration depends on numerous factors, including the nature of starter culture and raw material, as well as the production system and operational conditions. Furthermore, vinegar is obtained by the action of acetic acid bacteria (AAB) on an alcoholic medium in which ethanol is transformed into acetic acid. Besides the highlighted oxidative metabolism of AAB, their versatility and metabolic adaptability make them a taxonomic group with several biotechnological uses. Due to new and rapid advances in this field, this review attempts to approach the current state of knowledge by firstly discussing fundamental aspects related to industrial vinegar production and then exploring aspects related to AAB: classification, metabolism, and applications. Emphasis has been placed on an exhaustive taxonomic review considering the progressive increase in the number of new AAB species and genera, especially those with recognized biotechnological potential. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. Sequencing, Fast and Slow: Profiling Microbiomes in Human Samples with Nanopore Sequencing.
- Author
-
Park, Yunseol, Lee, Jeesu, and Shim, Hyunjin
- Subjects
NANOPORES ,DIAGNOSIS of bacterial diseases ,MICROBIAL communities ,DRUG resistance in microorganisms ,COMPUTER software - Abstract
Rapid and accurate pathogen identification is crucial in effectively combating infectious diseases. However, the current diagnostic tools for bacterial infections predominantly rely on century-old culture-based methods. Furthermore, recent research highlights the significance of host–microbe interactions within the host microbiota in influencing the outcome of infection episodes. As our understanding of science and medicine advances, there is a pressing need for innovative diagnostic methods that can identify pathogens and also rapidly and accurately profile the microbiome landscape in human samples. In clinical settings, such diagnostic tools will become a powerful predictive instrument in directing the diagnosis and prognosis of infectious diseases by providing comprehensive insights into the patient's microbiota. Here, we explore the potential of long-read sequencing in profiling the microbiome landscape from various human samples in terms of speed and accuracy. Using nanopore sequencers, we generate native DNA sequences from saliva and stool samples rapidly, from which each long-read is basecalled in real-time to provide downstream analyses such as taxonomic classification and antimicrobial resistance through the built-in software (<12 h). Subsequently, we utilize the nanopore sequence data for in-depth analysis of each microbial species in terms of host–microbe interaction types and deep learning-based classification of unidentified reads. We find that the nanopore sequence data encompass complex information regarding the microbiome composition of the host and its microbial communities, and also shed light on the unexplored human mobilome including bacteriophages. 
In this study, we use two different systems of long-read sequencing to give insights into human microbiome samples in the 'slow' and 'fast' modes, which raises additional inquiries regarding the precision of this novel technology and the feasibility of extracting native DNA sequences from other human microbiomes. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. QUE ANIMAL É ESSE? UMA PROPOSTA TEÓRICO-METODOLÓGICA DE CLASSIFICAÇÃO TAXONÔMICA ARQUEOLÓGICA EM REGISTRO RUPESTRE DAS REPRESENTAÇÕES ZOOMÓRFICAS RECONHECÍVEIS.
- Author
-
Fonseca de Souza, Thiago, Mutzenberg, Demétrio, and Nogueira de Queiroz, Alberico
- Subjects
ARID regions ,CLUSTER analysis (Statistics) ,DATABASES ,ROCK paintings ,ACCESS to information ,CLASSIFICATION ,ZOOARCHAEOLOGY - Abstract
Copyright of Revista de Arqueologia is the property of Revista de Arqueologia and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
40. POSMM: an efficient alignment-free metagenomic profiler that complements alignment-based profiling
- Author
-
David J. Burks, Vaidehi Pusadkar, and Rajeev K. Azad
- Subjects
Metagenomes ,Microbiome ,Taxonomic classification ,Markov model ,Sequence alignment ,Environmental sciences ,GE1-350 ,Microbiology ,QR1-502 - Abstract
We present here POSMM (pronounced ‘Possum’), Python-Optimized Standard Markov Model classifier, which is a new incarnation of the Markov model approach to metagenomic sequence analysis. Built on the top of a rapid Markov model based classification algorithm SMM, POSMM reintroduces high sensitivity associated with alignment-free taxonomic classifiers to probe whole genome or metagenome datasets of increasingly prohibitive sizes. Logistic regression models generated and optimized using the Python sklearn library, transform Markov model probabilities to scores suitable for thresholding. Featuring a dynamic database-free approach, models are generated directly from genome fasta files per run, making POSMM a valuable accompaniment to many other programs. By combining POSMM with ultrafast classifiers such as Kraken2, their complementary strengths can be leveraged to produce higher overall accuracy in metagenomic sequence classification than by either as a standalone classifier. POSMM is a user-friendly and highly adaptable tool designed for broad use by the metagenome scientific community.
- Published
- 2023
41. Inferring taxonomic placement from DNA barcoding aiding in discovery of new taxa
- Author
-
Alessandro Zito, Tommaso Rigon, and David B. Dunson
- Subjects
Bayesian nonparametrics ,DNA barcoding ,species novelty ,species sampling models ,taxonomic classification ,Ecology ,QH540-549.5 ,Evolution ,QH359-425 - Abstract
Predicting the taxonomic affiliation of DNA sequences collected from biological samples is a fundamental step in biodiversity assessment. This task is performed by leveraging existing databases containing reference DNA sequences endowed with a taxonomic identification. However, environmental sequences can be from organisms that are either unknown to science or for which there are no reference sequences available. Thus, taxonomic novelty of a sequence needs to be accounted for when doing classification. We propose Bayesian nonparametric taxonomic classifiers, BayesANT, which use species sampling model priors to allow unobserved taxa to be discovered at each taxonomic rank. Using a simple product multinomial likelihood with conjugate Dirichlet priors at the lowest rank, a highly flexible supervised algorithm is developed to provide a probabilistic prediction of the taxa placement of each sequence at each rank. As an illustration, we run our algorithm on a carefully annotated library of Finnish arthropods (FinBOL). To assess the ability of BayesANT to recognize novelty and to predict known taxonomic affiliations correctly, we test it on two training‐test splitting scenarios, each with a different proportion of taxa unobserved in training. We show how our algorithm attains accurate predictions and reliably quantifies classification uncertainty, especially when many sequences in the test set are affiliated to taxa unknown in training. By enabling taxonomic predictions for DNA barcodes to identify unseen branches, we believe BayesANT will be of broad utility as a tool for DNA metabarcoding within bioinformatics pipelines.
- Published
- 2023
42. Genomic diversity and comprehensive taxonomical classification of 61 Bacillus subtilis group member infecting bacteriophages, and the identification of ortholog taxonomic signature genes
- Author
-
Haftom Baraki Abraha, Jae-Won Lee, Gayeong Kim, Mokhammad Khoiron Ferdiansyah, Rathnayaka Mudiyanselage Ramesha, and Kwang-Pyo Kim
- Subjects
Bacteriophages ,Taxonomic classification ,Intergenomic similarities ,Signature genes ,Orthologs ,Bacillus phages ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Background: Despite the applications of Bacillus subtilis group species in various sectors, limited information is available regarding their phages. Here, 61 B. subtilis group species-infecting phages (BSPs) were studied for their taxonomic classification considering the genome-size, genomic diversity, and the host, followed by the identification of orthologs taxonomic signature genes. Results: BSPs have widely ranging genome sizes that can be bunched into groups to demonstrate correlations to family and subfamily classifications. Comparative analysis re-confirmed the existing, BSPs-containing 14 genera and 21 species and displayed inter-genera similarities within existing subfamilies. Importantly, it also revealed the need for the creation of new taxonomic classifications, including 28 species, nine genera, and two subfamilies (New subfamily1 and New subfamily2) to accommodate inter-genera relatedness. Following pangenome analysis, no ortholog shared by all BSPs was identified, while orthologs, namely, the tail fibers/spike proteins and poly-gamma-glutamate hydrolase, that are shared by more than two-thirds of the BSPs were identified. More importantly, major capsid protein (MCP) type I, MCP type II, MCP type III and peptidoglycan binding proteins that are distinctive orthologs for Herelleviridae, Salasmaviridae, New subfamily1, and New subfamily2, respectively, were identified and analyzed which could serve as signatures to distinguish BSP members of the respective taxon. Conclusions: In this study, we show the genomic diversity and propose a comprehensive classification of 61 BSPs, including the proposition for the creation of two new subfamilies, followed by the identification of orthologs taxonomic signature genes, potentially contributing to phage taxonomy.
- Published
- 2022
43. Phenanthrene Degradation by Photosynthetic Bacterial Consortium Dominated by Fischerella sp.
- Author
-
Márquez-Villa, José Martín, Rodríguez-Sierra, Juan Carlos, Amtanus Chequer, Nayem, Cob-Calan, Nubia Noemí, García-Maldonado, José Quinatzín, Cadena, Santiago, and Hernández-Núñez, Emanuel
- Subjects
- *
PHENANTHRENE , *TECHNOLOGICAL innovations , *AROMATIC compounds , *BIODEGRADATION , *MICROBIAL diversity , *AEROBIC bacteria - Abstract
Microbial degradation of aromatic hydrocarbons is an emerging technology, and it is well recognized for its economic methods, efficiency, and safety; however, its exploration is still scarce and greater emphasis on cyanobacteria–bacterial mutualistic interactions is needed. We evaluated and characterized the phenanthrene biodegradation capacity of consortium dominated by Fischerella sp. under holoxenic conditions with aerobic heterotrophic bacteria and their molecular identification through 16S rRNA Illumina sequencing. Results indicated that our microbial consortium can degrade up to 92% of phenanthrene in five days. Bioinformatic analyses revealed that consortium was dominated by Fischerella sp., however different members of Nostocaceae and Weeksellaceae, as well as several other bacteria, such as Chryseobacterium, and Porphyrobacter, were found to be putatively involved in the biological degradation of phenanthrene. This work contributes to a better understanding of biodegradation of phenanthrene by cyanobacteria and identify the microbial diversity related. [ABSTRACT FROM AUTHOR]
- Published
- 2023
44. Characterization and genomic analysis of the vibrio phage R01 lytic to Vibrio parahaemolyticus
- Author
-
Zhen Li, Yuan Ren, Zhenhui Wang, Zhitao Qi, Bilal Murtaza, and Hongyu Ren
- Subjects
Bacteriophage/Phage ,Vibrio parahaemolyticus ,Bacterial inhibition ,Genomic sequencing ,Taxonomic classification ,Aquaculture. Fisheries. Angling ,SH1-691 - Abstract
Vibrio parahaemolyticus is a significant zoonotic pathogen that is capable of causing infections in marine animals and contaminating seafood. In this study, we characterized a bacteriophage, vibrio phage R01, which exhibited lytic activity against V. parahaemolyticus VP-ABTNL, a strain known to cause skin ulceration in the juvenile sea cucumber. Morphological analysis revealed that vibrio phage R01 belongs to the Siphoviridae family of the Caudovirales order. A one-step growth curve analysis showed that the latent period and burst period of R01 were approximately 20 min, with a burst size of 316 PFU (plaque-forming unit) per infected cell. The stability determination assay showed that the lytic activity of phage R01 was optimal at 4–28 °C and a pH of 7.0, while it ceased at temperatures higher than 60 °C. The inhibition ability of the vibrio phage R01 against the VP-ABTNL was tested in vitro using the phage with three MOI (multiplicity of infection) values of 1, 10 and 100. All treated cultures exhibited a significant (P 0.05). Subsequently, the whole genome of the vibrio phage R01 was sequenced and analyzed. The genome of phage R01 consists of 75,514 bp with a G+C content of 49.42%. Twenty-four of the 75 putative proteins encoded by this phage have known functions, and no rRNA and tRNA genes were identified. Phylogenetic analysis based on DNA polymerase and terminase large subunit revealed that phage R01 is a new species within the genus of Mardecavirus. These findings indicate that the vibrio phage R01 may serve as a viable alternative agent for preventing V. parahaemolyticus infections.
- Published
- 2023
45. POSMM: an efficient alignment-free metagenomic profiler that complements alignment-based profiling.
- Author
-
Burks, David J., Pusadkar, Vaidehi, and Azad, Rajeev K.
- Subjects
METAGENOMICS ,MARKOV processes ,PYTHON programming language ,CLASSIFICATION algorithms ,LOGISTIC regression analysis ,SEQUENCE analysis ,SCIENTIFIC community - Abstract
We present here POSMM (pronounced 'Possum'), Python-Optimized Standard Markov Model classifier, which is a new incarnation of the Markov model approach to metagenomic sequence analysis. Built on the top of a rapid Markov model based classification algorithm SMM, POSMM reintroduces high sensitivity associated with alignment-free taxonomic classifiers to probe whole genome or metagenome datasets of increasingly prohibitive sizes. Logistic regression models generated and optimized using the Python sklearn library, transform Markov model probabilities to scores suitable for thresholding. Featuring a dynamic database-free approach, models are generated directly from genome fasta files per run, making POSMM a valuable accompaniment to many other programs. By combining POSMM with ultrafast classifiers such as Kraken2, their complementary strengths can be leveraged to produce higher overall accuracy in metagenomic sequence classification than by either as a standalone classifier. POSMM is a user-friendly and highly adaptable tool designed for broad use by the metagenome scientific community. [ABSTRACT FROM AUTHOR]
- Published
- 2023
46. Classifying and discovering genomic sequences in metagenomic repositories.
- Author
-
Silva, Jorge Miguel, Almeida, João Rafael, and Oliveira, José Luís
- Subjects
METAGENOMICS ,DATA visualization ,WEB databases ,INSTITUTIONAL repositories ,AGRICULTURE - Abstract
The taxonomic and functional composition of microbial communities from environmental, agricultural, and therapeutic settings is increasingly being studied using metagenomic methodologies in large-scale genomic applications. This has led to exponential growth in the field and has impacted on healthcare, pharmacology and biotechnology. However, with the current methodologies, it is sometimes difficult to obtain conclusive identification of an organism. In addition, the growth of the metagenomic field has led to the creation of large amounts of data held by different hosts, which characterize data differently and make analysis difficult. Therefore, correct data aggregation and classification improve and facilitate the discovery of repositories of interest. This paper tackles these issues by proposing a methodology for organism identification, data aggregation and content characterization, visualization and selection. We propose a three-step pipeline for organism identification that uses compression-based metrics, an aggregation mechanism for content characterization, and a web database catalogue for data exposition and visualization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
47. Inferring taxonomic placement from DNA barcoding aiding in discovery of new taxa.
- Author
-
Zito, Alessandro, Rigon, Tommaso, and Dunson, David B.
- Subjects
GENETIC barcoding ,DNA sequencing ,SPECIES - Abstract
Predicting the taxonomic affiliation of DNA sequences collected from biological samples is a fundamental step in biodiversity assessment. This task is performed by leveraging existing databases containing reference DNA sequences endowed with a taxonomic identification. However, environmental sequences can be from organisms that are either unknown to science or for which there are no reference sequences available. Thus, taxonomic novelty of a sequence needs to be accounted for when doing classification. We propose Bayesian nonparametric taxonomic classifiers, BayesANT, which use species sampling model priors to allow unobserved taxa to be discovered at each taxonomic rank. Using a simple product multinomial likelihood with conjugate Dirichlet priors at the lowest rank, a highly flexible supervised algorithm is developed to provide a probabilistic prediction of the taxa placement of each sequence at each rank. As an illustration, we run our algorithm on a carefully annotated library of Finnish arthropods (FinBOL). To assess the ability of BayesANT to recognize novelty and to predict known taxonomic affiliations correctly, we test it on two training‐test splitting scenarios, each with a different proportion of taxa unobserved in training. We show how our algorithm attains accurate predictions and reliably quantifies classification uncertainty, especially when many sequences in the test set are affiliated to taxa unknown in training. By enabling taxonomic predictions for DNA barcodes to identify unseen branches, we believe BayesANT will be of broad utility as a tool for DNA metabarcoding within bioinformatics pipelines. [ABSTRACT FROM AUTHOR]
- Published
- 2023
48. SprayNPray: user-friendly taxonomic profiling of genome and metagenome contigs
- Author
-
Arkadiy I. Garber, Catherine R. Armbruster, Stella E. Lee, Vaughn S. Cooper, Jennifer M. Bomberger, and Sean M. McAllister
- Subjects
Symbiont ,Taxonomic classification ,Binning ,Bioinformatics ,Contaminant identification ,HGT ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Background: Shotgun sequencing of cultured microbial isolates/individual eukaryotes (whole-genome sequencing) and microbial communities (metagenomics) has become commonplace in biology. Very often, sequenced samples encompass organisms spanning multiple domains of life, necessitating increasingly elaborate software for accurate taxonomic classification of assembled sequences. Results: While many software tools for taxonomic classification exist, SprayNPray offers a quick and user-friendly, semi-automated approach, allowing users to separate contigs by taxonomy (and other metrics) of interest. Easy installation, usage, and intuitive output, which is amenable to visual inspection and/or further computational parsing, will reduce barriers for biologists beginning to analyze genomes and metagenomes. This approach can be used for broad-level overviews, preliminary analyses, or as a supplement to other taxonomic classification or binning software. SprayNPray profiles contigs using multiple metrics, including closest homologs from a user-specified reference database, gene density, read coverage, GC content, tetranucleotide frequency, and codon-usage bias. Conclusions: The output from this software is designed to allow users to spot-check metagenome-assembled genomes, identify, and remove contigs from putative contaminants in isolate assemblies, identify bacteria in eukaryotic assemblies (and vice-versa), and identify possible horizontal gene transfer events.
- Published
- 2022
49. Corrigendum: Variation in accessory genes within the Klebsiella oxytoca species complex delineates monophyletic members and simplifies coherent genotyping
- Author
-
Amar Cosic, Eva Leitner, Christian Petternel, Herbert Galler, Franz F. Reinthaler, Kathrin A. Herzog-Obereder, Elisabeth Tatscher, Sandra Raffl, Gebhard Feierl, Christoph Högenauer, Ellen L. Zechner, and Sabine Kienesberger
- Subjects
bacterial phylogeny ,Klebsiella oxytoca species complex ,taxonomic classification ,necrotizing enterocolitis ,bacterial cytotoxicity ,intestinal disease ,Microbiology ,QR1-502 - Published
- 2023
50. A Customized Monkeypox Virus Genomic Database (MPXV DB v1.0) for Rapid Sequence Analysis and Phylogenomic Discoveries in CLC Microbial Genomics.
- Author
-
Shen-Gunther, Jane, Cai, Hong, and Wang, Yufeng
- Subjects
- *
MONKEYPOX , *MICROBIAL genomics , *DATABASES , *SEQUENCE analysis , *FECAL contamination , *NUCLEOTIDE sequencing - Abstract
Monkeypox has been a neglected, zoonotic tropical disease for over 50 years. Since the 2022 global outbreak, hundreds of human clinical samples have been subjected to next-generation sequencing (NGS) worldwide with raw data deposited in public repositories. However, sequence analysis for in-depth investigation of viral evolution remains hindered by the lack of a curated, whole genome Monkeypox virus (MPXV) database (DB) and efficient bioinformatics pipelines. To address this, we developed a customized MPXV DB for integration with "ready-to-use" workflows in the CLC Microbial Genomics Module for whole genomic and metagenomic analysis. After database construction (218 MPXV genomes), whole genome alignment, pairwise comparison, and evolutionary analysis of all genomes were analyzed to autogenerate tabular outputs and visual displays (collective runtime: 16 min). The clinical utility of the MPXV DB was demonstrated by using a Chimpanzee fecal, hybrid-capture NGS dataset (publicly available) for metagenomic, phylogenomic, and viral/host integration analysis. The clinically relevant MPXV DB embedded in CLC workflows proved to be a rapid method of sequence analysis useful for phylogenomic exploration and a wide range of applications in translational science. [ABSTRACT FROM AUTHOR]
- Published
- 2023