Back to Search Start Over

Highly accurate classification and discovery of microbial protein-coding gene functions using FunGeneTyper: an extensible deep learning framework.

Authors :
Zhang, Guoqing
Wang, Hui
Zhang, Zhiguo
Zhang, Lu
Guo, Guibing
Yang, Jian
Yuan, Fajie
Ju, Feng
Source :
Briefings in Bioinformatics; Jul2024, Vol. 25 Issue 4, p1-14, 14p
Publication Year :
2024

Abstract

High-throughput DNA sequencing technologies decode tremendous amounts of microbial protein-coding gene sequences. However, accurately assigning protein functions to novel gene sequences remain a challenge. To this end, we developed FunGeneTyper, an extensible framework with two new deep learning models (i.e. FunTrans and FunRep), structured databases, and supporting resources for achieving highly accurate (Accuracy > 0.99, F1-score > 0.97) and fine-grained classification of antibiotic resistance genes (ARGs) and virulence factor genes. Using an experimentally confirmed dataset of ARGs comprising remote homologous sequences as the test set, our framework achieves by-far-the-best performance in the discovery of new ARGs from human gut (F1-score: 0.6948), wastewater (0.6072), and soil (0.5445) microbiomes, beating the state-of-the-art bioinformatics tools and sequence alignment-based (F1-score: 0.0556–0.5065) and domain-based (F1-score: 0.2630–0.5224) annotation approaches. Furthermore, our framework is implemented as a lightweight, privacy-preserving, and plug-and-play neural network module, facilitating its versatility and accessibility to developers and users worldwide. We anticipate widespread utilization of FunGeneTyper (https://github.com/emblab-westlake/FunGeneTyper) for precise classification of protein-coding gene functions and the discovery of numerous valuable enzymes. This advancement will have a significant impact on various fields, including microbiome research, biotechnology, metagenomics, and bioinformatics. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
14675463
Volume :
25
Issue :
4
Database :
Complementary Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
178650395
Full Text :
https://doi.org/10.1093/bib/bbae319