DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier
- Source :
- Bioinformatics
- Publication Year :
- 2017
- Publisher :
- Oxford University Press (OUP), 2017.
Abstract
- Motivation: A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often done rigorously only for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem.
- Results: We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as from a cross-species protein–protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations.
- Availability and implementation: Web server: http://deepgo.bio2vec.net; source code: https://github.com/bio-ontology-research-group/deepgo
- Supplementary information: Supplementary data are available at Bioinformatics online.
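The abstract states that the method uses the dependencies between GO classes as background information. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way to make per-class scores consistent with an ontology hierarchy: each class receives at least the highest score of any of its descendants, so that a protein predicted for a specific class is also predicted for all of its ancestors. The toy ontology, the term identifiers, and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each term maps to its direct parents.
# Real GO identifiers and relations are NOT used here.
TOY_PARENTS: Dict[str, List[str]] = {
    "GO:root": [],
    "GO:A": ["GO:root"],
    "GO:B": ["GO:root"],
    "GO:C": ["GO:A", "GO:B"],  # a term may have more than one parent
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every ancestor of `term` (excluding the term itself)."""
    seen: Set[str] = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate_scores(
    scores: Dict[str, float], parents: Dict[str, List[str]]
) -> Dict[str, float]:
    """Raise each ancestor's score to at least the score of its descendants."""
    # Start with 0.0 for every known term, then overlay the raw scores.
    out: Dict[str, float] = {term: 0.0 for term in parents}
    out.update(scores)
    for term, score in scores.items():
        for anc in ancestors(term, parents):
            if score > out.get(anc, 0.0):
                out[anc] = score
    return out


if __name__ == "__main__":
    raw = {"GO:C": 0.9, "GO:A": 0.2}  # made-up classifier outputs
    print(propagate_scores(raw, TOY_PARENTS))
```

Taking the maximum over descendants is only one possible design choice; a trained model could instead build the hierarchy into its architecture, as the abstract suggests the authors do.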
- Subjects :
- FOS: Computer and information sciences
0301 basic medicine
Computer science
Microbial metabolism
Quantitative Biology - Quantitative Methods
Biochemistry
Machine Learning (cs.LG)
0302 clinical medicine
Sequence Analysis, Protein
Protein methods
Protein Interaction Maps
Quantitative Methods (q-bio.QM)
Protein function
Eukaryota
Original Papers
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
Supervised Machine Learning
Statistics and Probability
Sequence analysis
Databases and Ontologies
Machine learning
03 medical and health sciences
Interaction network
Animals
Humans
Quantitative Biology - Genomics
Model organism
Molecular Biology
Genomics (q-bio.GN)
Bacteria
Deep learning
Computational Biology
Proteins
Computer Science - Learning
Gene Ontology
ComputingMethodologies_PATTERNRECOGNITION
030104 developmental biology
FOS: Biological sciences
Proteins metabolism
Artificial intelligence
Classifier (UML)
Software
030217 neurology & neurosurgery
Details
- ISSN :
- 1367-4811 and 1367-4803
- Volume :
- 34
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....8a8610fbe7bdedbcfd206c289f4b36b6
- Full Text :
- https://doi.org/10.1093/bioinformatics/btx624