Back to Search Start Over

SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning.

Authors :
Singh, Jaspreet
Litfin, Thomas
Paliwal, Kuldip
Singh, Jaswinder
Hanumanthappa, Anil Kumar
Zhou, Yaoqi
Source :
Bioinformatics. 10/15/2021, Vol. 37 Issue 20, p3464-3472. 9p.
Publication Year :
2021

Abstract

Motivation Knowing protein secondary and other one-dimensional structural properties are essential for accurate protein structure and function prediction. As a result, many methods have been developed for predicting these one-dimensional structural properties. However, most methods relied on evolutionary information that may not exist for many proteins due to a lack of sequence homologs. Moreover, it is computationally intensive for obtaining evolutionary information as the library of protein sequences continues to expand exponentially. Here, we developed a new single-sequence method called SPOT-1D-Single based on a large training dataset of 39 120 proteins deposited prior to 2016 and an ensemble of hybrid long-short-term-memory bidirectional neural network and convolutional neural network. Results We showed that SPOT-1D-Single consistently improves over SPIDER3-Single and ProteinUnet for secondary structure, solvent accessibility, contact number and backbone angles prediction for all seven independent test sets (TEST2018, SPOT-2016, SPOT-2016-HQ, SPOT-2018, SPOT-2018-HQ, CASP12 and CASP13 free-modeling targets). For example, the predicted three-state secondary structure's accuracy ranges from 72.12% to 74.28% by SPOT-1D-Single, compared to 69.1–72.6% by SPIDER3-Single and 70.6–73% by ProteinUnet. SPOT-1D-Single also predicts SS3 and SS8 with 6.24% and 6.98% better accuracy than SPOT-1D on SPOT-2018 proteins with no homologs (Neff = 1), respectively. The new method's improvement over existing techniques is due to a larger training set combined with ensembled learning. Availability and implementation Standalone-version of SPOT-1D-Single is available at https://github.com/jas-preet/SPOT-1D-Single. Direct prediction can also be made at https://sparks-lab.org/server/spot-1d-single. The datasets used in this research can also be downloaded from GitHub. Supplementary information Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
13674803
Volume :
37
Issue :
20
Database :
Academic Search Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
153299243
Full Text :
https://doi.org/10.1093/bioinformatics/btab316