1
Si D, Moritz SA, Pfab J, Hou J, Cao R, Wang L, Wu T, Cheng J. Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps. Sci Rep 2020; 10:4282. [PMID: 32152330 PMCID: PMC7063051 DOI: 10.1038/s41598-020-60598-y] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 11/25/2019] [Accepted: 02/10/2020] [Indexed: 11/29/2022] Open
Abstract
Cryo-electron microscopy (cryo-EM) has become a leading technology for determining protein structures. Recent advances in this field have allowed for atomic resolution. However, predicting the backbone trace of a protein has remained a challenge on all but the most pristine density maps (<2.5 Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein's backbone structure. The cascaded-CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein's structure. This model predicts secondary structure elements (SSEs), backbone structure, and Cα atoms, combining the results of each to produce a complete prediction map. The cascaded-CNN is a semantic segmentation image classifier and was trained using thousands of simulated density maps. This method is largely automatic and only requires a recommended threshold value for each protein density map. A specialized tabu-search path walking algorithm was used to produce an initial backbone trace with Cα placements. A helix-refinement algorithm made further improvements to the α-helix SSEs of the backbone trace. Finally, a novel quality assessment-based combinatorial algorithm was used to effectively map protein sequences onto Cα traces to obtain full-atom protein structures. This method was tested on 50 experimental maps between 2.6 Å and 4.4 Å resolution. It outperformed several state-of-the-art prediction methods, including Rosetta de-novo, MAINMAST, and a Phenix-based method, by producing the most complete predicted protein structures, as measured by percentage of found Cα atoms. This method accurately predicted 88.9% (mean) of the Cα atoms within 3 Å of a protein's backbone structure, surpassing the 66.8% mark achieved by the leading alternate method (Phenix-based fully automatic method) on the same set of density maps. The C-CNN also achieved an average root-mean-square deviation (RMSD) of 1.24 Å on a set of 50 experimental density maps that were also tested with the Phenix-based fully automatic method. The source code and demo of this research have been published at https://github.com/DrDongSi/Ca-Backbone-Prediction.
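The two figures of merit quoted in this abstract (percentage of Cα atoms found within 3 Å, and RMSD) can be illustrated with a minimal sketch. The exact matching procedure used by the authors is not given here, so the nearest-neighbour matching below, the function name `ca_metrics`, and the toy coordinates are all assumptions made for illustration only.

```python
import math

def ca_metrics(native, predicted, cutoff=3.0):
    """Return (coverage_percent, rmsd) for a predicted C-alpha trace.

    Assumption: each native C-alpha is matched to its nearest predicted
    C-alpha, and it counts as "found" if that distance is <= cutoff (in Å).
    RMSD is computed over the found atoms only.
    """
    # Distance from every native atom to its closest predicted atom.
    dists = [min(math.dist(n, p) for p in predicted) for n in native]
    found = [d for d in dists if d <= cutoff]
    coverage = 100.0 * len(found) / len(native)
    rmsd = math.sqrt(sum(d * d for d in found) / len(found)) if found else float("nan")
    return coverage, rmsd

# Hypothetical toy data: four native C-alpha atoms spaced 3.8 Å apart,
# three predicted atoms (the last native atom has no nearby prediction).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (3.8, 1.0, 0.0), (7.6, 0.0, 0.0)]
coverage, rmsd = ca_metrics(native, predicted)
```

A one-to-one assignment (so that one predicted atom cannot satisfy two native atoms) would be a stricter variant of the same idea.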
Affiliation(s)
- Dong Si
  - Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Spencer A Moritz
  - Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Jonas Pfab
  - Division of Computing & Software Systems, University of Washington, Bothell, WA, 98011, USA
- Jie Hou
  - Department of Computer Science, Saint Louis University, Saint Louis, MO, 63103, USA
  - Program in Bioinformatics & Computational Biology, Saint Louis University, Saint Louis, MO, 63103, USA
- Renzhi Cao
  - Department of Computer Science, Pacific Lutheran University, Tacoma, WA, 98447, USA
- Liguo Wang
  - Department of Biological Structure, University of Washington, Seattle, WA, 98185, USA
- Tianqi Wu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
2
Terwilliger TC, Adams PD, Afonine PV, Sobolev OV. Cryo-EM map interpretation and protein model-building using iterative map segmentation. Protein Sci 2020; 29:87-99. [PMID: 31599033 PMCID: PMC6933853 DOI: 10.1002/pro.3740] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 06/27/2019] [Revised: 09/30/2019] [Accepted: 10/01/2019] [Indexed: 11/17/2022]
Abstract
A procedure for building protein chains into maps produced by single-particle electron cryo-microscopy (cryo-EM) is described. The procedure is similar to the way an experienced structural biologist might analyze a map, focusing first on secondary structure elements such as helices and sheets, then varying the contour level to identify connections between these elements. Since the high density in a map typically follows the main-chain of the protein, the main-chain connection between secondary structure elements can often be identified as the unbranched path between them with the highest minimum value along the path. This chain-tracing procedure is then combined with finding side-chain positions based on the presence of density extending away from the main path of the chain, allowing generation of a Cα model. The Cα model is converted to an all-atom model and is refined against the map. We show that this procedure is as effective as other existing methods for interpretation of cryo-EM maps and that it is considerably faster and produces models with fewer chain breaks than our previous methods that were based on approaches developed for crystallographic maps.
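The chain-tracing criterion described in this abstract, the path "with the highest minimum value along the path", corresponds to the classic widest-path (maximin) problem. The following is a minimal sketch of that criterion only, not the authors' implementation: it uses a hypothetical 2D toy grid with 4-connectivity instead of a 3D map, and the function name `widest_path` is an assumption.

```python
import heapq

def widest_path(density, start, goal):
    """Return (bottleneck, path) for the start-to-goal path whose minimum
    density value is as large as possible (maximin path).

    Simplifying assumptions: 2D grid, 4-connected neighbours.
    """
    rows, cols = len(density), len(density[0])
    best = {start: density[start[0]][start[1]]}  # best bottleneck found per cell
    parent = {start: None}
    heap = [(-best[start], start)]  # max-heap via negated bottleneck
    while heap:
        neg_b, node = heapq.heappop(heap)
        b = -neg_b
        if b < best[node]:
            continue  # stale heap entry
        if node == goal:
            break
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                # The bottleneck of an extended path is the smaller of the
                # current bottleneck and the new cell's density.
                nb = min(b, density[nr][nc])
                if nb > best.get((nr, nc), float("-inf")):
                    best[(nr, nc)] = nb
                    parent[(nr, nc)] = node
                    heapq.heappush(heap, (-nb, (nr, nc)))
    # Walk parent links back from the goal to recover the path.
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[goal], path[::-1]

# Hypothetical toy map: the direct route crosses a low-density cell (2),
# so the best connection detours around the bottom of the grid.
grid = [
    [9, 2, 9],
    [8, 1, 7],
    [9, 6, 9],
]
bottleneck, path = widest_path(grid, (0, 0), (0, 2))
```

This is Dijkstra's algorithm with the path cost replaced by the minimum value along the path; the side-chain detection and contour-level variation mentioned in the abstract are not modelled.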
Affiliation(s)
- Thomas C. Terwilliger
  - Los Alamos National Laboratory, Los Alamos, New Mexico
  - New Mexico Consortium, Los Alamos, New Mexico
- Paul D. Adams
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California
  - Department of Bioengineering, University of California Berkeley, Berkeley, California
- Pavel V. Afonine
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California
- Oleg V. Sobolev
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, California
3
Hinshaw SM, Harrison SC. The structure of the Ctf19c/CCAN from budding yeast. eLife 2019; 8:44239. [PMID: 30762520 PMCID: PMC6407923 DOI: 10.7554/elife.44239] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 12/08/2018] [Accepted: 02/13/2019] [Indexed: 12/29/2022] Open
Abstract
Eukaryotic kinetochores connect spindle microtubules to chromosomal centromeres. A group of proteins called the Ctf19 complex (Ctf19c) in yeast and the constitutive centromere associated network (CCAN) in other organisms creates the foundation of a kinetochore. The Ctf19c/CCAN influences the timing of kinetochore assembly, sets its location by associating with a specialized nucleosome containing the histone H3 variant Cse4/CENP-A, and determines the organization of the microtubule attachment apparatus. We present here the structure of a reconstituted 13-subunit Ctf19c determined by cryo-electron microscopy at ~4 Å resolution. The structure accounts for known and inferred contacts with the Cse4 nucleosome and for an observed assembly hierarchy. We describe its implications for establishment of kinetochores and for their regulation by kinases throughout the cell cycle.
Affiliation(s)
- Stephen M Hinshaw
  - Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Howard Hughes Medical Institute, Boston, United States
- Stephen C Harrison
  - Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Howard Hughes Medical Institute, Boston, United States
4
Terwilliger TC, Adams PD, Afonine PV, Sobolev OV. A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nat Methods 2018; 15:905-908. [PMID: 30377346 PMCID: PMC6214191 DOI: 10.1038/s41592-018-0173-1] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 02/02/2018] [Accepted: 08/06/2018] [Indexed: 01/31/2023]
Abstract
We report a fully automated procedure for the optimization and interpretation of reconstructions from cryo-electron microscopy (cryo-EM) data, available in Phenix as phenix.map_to_model. We applied our approach to 476 datasets with resolution of 4.5 Å or better, including reconstructions of 47 ribosomes and 32 other protein-RNA complexes. The median fraction of residues in the deposited structures reproduced automatically was 71% for reconstructions determined at resolutions of 3 Å or better and 47% for those at resolutions worse than 3 Å.
Affiliation(s)
- Thomas C Terwilliger
  - Los Alamos National Laboratory, Los Alamos, NM, USA
  - New Mexico Consortium, Los Alamos, NM, USA
- Paul D Adams
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Pavel V Afonine
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai, China
- Oleg V Sobolev
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
5
Terwilliger TC, Adams PD, Afonine PV, Sobolev OV. Map segmentation, automated model-building and their application to the Cryo-EM Model Challenge. J Struct Biol 2018; 204:338-343. [PMID: 30063987 PMCID: PMC6163059 DOI: 10.1016/j.jsb.2018.07.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/04/2018] [Revised: 07/11/2018] [Accepted: 07/27/2018] [Indexed: 11/27/2022]
Abstract
A recently-developed method for identifying a compact, contiguous region representing the unique part of a density map was applied to 218 Cryo-EM maps with resolutions of 4.5 Å or better. The key elements of the segmentation procedure are (1) identification of all regions of density above a threshold and (2) choice of a unique set of these regions, taking symmetry into consideration, that maximize connectivity and compactness. This segmentation approach was then combined with tools for automated map sharpening and model-building to generate models for the 12 maps in the 2016 Cryo-EM Model Challenge in a fully automated manner. The resulting models have completeness from 24% to 82% and RMS distances from reference interpretations of 0.6 Å-2.1 Å.
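Step (1) of the segmentation procedure described in this abstract, identifying all regions of density above a threshold, can be sketched as a connected-component search. This is an illustrative sketch only: the 6-connected neighbourhood, the nested-list map representation, and the function name `regions_above_threshold` are assumptions, and step (2) (symmetry-aware selection for connectivity and compactness) is not covered.

```python
from collections import deque

# Face-sharing (6-connected) neighbour offsets in a 3D grid; an assumption.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def regions_above_threshold(density, threshold):
    """Return a list of regions; each region is a list of (z, y, x) voxels
    whose density exceeds the threshold and that are mutually connected."""
    nz, ny, nx = len(density), len(density[0]), len(density[0][0])
    seen = set()
    regions = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if density[z][y][x] <= threshold or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill from this unvisited voxel.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                region = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    region.append((cz, cy, cx))
                    for dz, dy, dx in NEIGHBOURS:
                        n = (cz + dz, cy + dy, cx + dx)
                        if (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                                and n not in seen
                                and density[n[0]][n[1]][n[2]] > threshold):
                            seen.add(n)
                            queue.append(n)
                regions.append(region)
    return regions

# Hypothetical toy map (1 x 2 x 4 voxels) containing two separate blobs.
toy_map = [[[5, 5, 0, 5],
            [0, 0, 0, 5]]]
regions = regions_above_threshold(toy_map, threshold=1)
```

For real maps an array library would normally replace the nested loops; the pure-Python version is kept here so the sketch stays self-contained.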
Affiliation(s)
- Thomas C Terwilliger
  - Los Alamos National Laboratory, Los Alamos, NM 87545, USA
  - New Mexico Consortium, Los Alamos, NM 87544, USA
- Paul D Adams
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA
  - Department of Bioengineering, University of California Berkeley, Berkeley, CA, USA
- Pavel V Afonine
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA
  - Department of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai 200444, People's Republic of China
- Oleg V Sobolev
  - Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720-8235, USA