1. 'PANNZER – a practical tool for protein function prediction'
- Author
- Liisa Holm, Petri Törönen, Institute of Biotechnology, Computational genomics, Organismal and Evolutionary Biology Research Programme, Genetics, and Bioinformatics
- Subjects
Web server, Source code, web server, Computer science, DATABASE, media_common.quotation_subject, Inference, computer.software_genre, Biochemistry, ANNOTATION, 03 medical and health sciences, 0302 clinical medicine, Protein Annotation, SEARCH, Protein function prediction, Function (engineering), Databases, Protein, Molecular Biology, 030304 developmental biology, media_common, 0303 health sciences, ENOLASE SUPERFAMILY, Information retrieval, evaluation, Tools for Protein Science, Gene ontology, GENE ONTOLOGY, Computational Biology, Proteins, Molecular Sequence Annotation, protein function, DEHYDRATASE, Data quality, 1182 Biochemistry, cell and molecular biology, computer, 030217 neurology & neurosurgery, Algorithms, Software
- Abstract
The facility of next-generation sequencing has led to an explosion of gene catalogs for novel genomes, transcriptomes and metagenomes, which are functionally uncharacterized. Computational inference has emerged as a necessary substitute for first-hand experimental evidence. PANNZER (Protein ANNotation with Z-scoRE) is a high-throughput functional annotation web server that stands out among similar publicly accessible web servers in supporting submission of up to 100,000 protein sequences at once and providing both Gene Ontology (GO) annotations and free-text description predictions. Here, we demonstrate the use of PANNZER and discuss future plans and challenges. We present two case studies to illustrate problems related to data quality and method evaluation. Some commonly used evaluation metrics and datasets promote methods that favor unspecific and broad classes over more informative and specific classes. We argue that this can bias the development of automated function prediction methods. The PANNZER web server and source code are available at http://ekhidna2.biocenter.helsinki.fi/sanspanz/.
- Published
- 2022