1. Deep learning enables the atomic structure determination of the Fanconi Anemia core complex from cryoEM
- Author
- Shabih Shakeel, Frank DiMaio, Daniel P. Farrell, David Baker, Anna Lauko, Ivan Anishchenko, and Lori A. Passmore
- Subjects
Computer science, Protein subunit, fanconi anemia core complex, Interstrand crosslink, Computational biology, Biochemistry, Convolutional neural network, 03 medical and health sciences, 0302 clinical medicine, Fanconi anemia, medicine, Atomic model, General Materials Science, Protein secondary structure, 030304 developmental biology, Physics, 0303 health sciences, Crystallography, biology, business.industry, Deep learning, deep learning, General Chemistry, Condensed Matter Physics, medicine.disease, Research Papers, Ubiquitin ligase, cryoem, QD901-999, biology.protein, Protein folding, Artificial intelligence, distance predictions, business, Model building, 030217 neurology & neurosurgery
- Abstract
This paper describes a method for determining an atomic model of a protein complex using moderate-resolution cryoEM data and distance predictions from deep learning. Cryo-electron microscopy of protein complexes often leads to moderate-resolution maps (4–8 Å), with visible secondary-structure elements but poorly resolved loops, making model building challenging. In the absence of high-resolution structures of homologues, only coarse-grained structural features are typically inferred from these maps, and it is often impossible to assign specific regions of density to individual protein subunits. This paper describes a new method for overcoming these difficulties that integrates predicted residue distance distributions from a deep-learned convolutional neural network, computational protein folding using Rosetta, and automated EM-map-guided complex assembly. We apply this method to a 4.6 Å resolution cryoEM map of the Fanconi Anemia core complex (FAcc), an E3 ubiquitin ligase required for DNA interstrand crosslink repair, which was previously challenging to interpret as it comprises 6557 residues, only 1897 of which are covered by homology models. In the published model built from this map, only 387 residues could be assigned to specific subunits with confidence. By building and placing into density 42 deep-learning-guided models containing 4795 residues not included in the previously published structure, we are able to determine an almost-complete atomic model of FAcc, in which 5182 of the 6557 residues were placed. The resulting model is consistent with previously published biochemical data and facilitates interpretation of disease-related mutational data. We anticipate that our approach will be broadly useful for cryoEM structure determination of large complexes containing many subunits for which there are no homologues of known structure.
- Published
- 2020