1. RECOORD: a recalculated coordinate database of 500+ proteins from the PDB using restraints from the BioMagResBank
- Author
Nederveen, A.J., Doreleijers, J.F., Vranken, W., Miller, Z., Spronk, C.A.E.M., Nabuurs, S.B., Guntert, P., Livny, M., Markley, J.L., Nilges, M., Ulrich, E.L., Kaptein, R., Bonvin, A.M.J.J.; NMR-spectroscopie, NMR Spectroscopy 1, Dep Scheikunde, Other departments, Department of Bio-engineering Sciences
- Subjects
Bioinformatics; Computer science; Protein Conformation; Protein Data Bank (RCSB PDB); General chemistry; Natural sciences; Biochemistry; Medical and health sciences; Structural Biology; Taverne; Databases, Protein; Molecular Biology; Developmental biology; Health sciences; Database; Proteins; Reproducibility of Results; Structure validation; Cyana; Protein Data Bank; Solution structure; Chemical sciences; Weak correlation; International; Reference database; Data mining; Stress, Mechanical; Cellular energy metabolism [UMCN 5.3]; Ramachandran plot
- Abstract
State-of-the-art methods based on CNS and CYANA were used to recalculate the nuclear magnetic resonance (NMR) solution structures of 500 proteins for which coordinates and NMR restraints are available from the Protein Data Bank. Curated restraints were obtained from the BioMagResBank FRED database. Although the original NMR structures were determined by various methods, they all were recalculated by CNS and CYANA and refined subsequently by restrained molecular dynamics (CNS) in a hydrated environment. We present an extensive analysis of the results, in terms of various quality indicators generated by PROCHECK and WHAT_CHECK. On average, the quality indicators for packing and Ramachandran appearance moved one standard deviation closer to the mean of the reference database. The structural quality of the recalculated structures is discussed in relation to various parameters, including number of restraints per residue, NOE completeness and positional root mean square deviation (RMSD). Correlations between pairs of these quality indicators were generally low; for example, there is a weak correlation between the number of restraints per residue and the Ramachandran appearance according to WHAT_CHECK (r = 0.31). The set of recalculated coordinates constitutes a unified database of protein structures in which potential user- and software-dependent biases have been kept as small as possible. The database can be used by the structural biology community for further development of calculation protocols, validation tools, structure-based statistical approaches and modeling. The RECOORD database of recalculated structures is publicly available from http://www.ebi.ac.uk/msd/reco
- Published
- 2005