1. Bermejo GA, Tjandra N, Clore GM, Schwieters CD. Xplor-NIH: Better parameters and protocols for NMR protein structure determination. Protein Sci 2024; 33:e4922. [PMID: 38501482 PMCID: PMC10962493 DOI: 10.1002/pro.4922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 01/26/2024] [Accepted: 01/28/2024] [Indexed: 03/20/2024]
Abstract
The present work describes an update to the protein covalent geometry and atomic radii parameters in the Xplor-NIH biomolecular structure determination package. In combination with an improved treatment of selected non-bonded interactions between atoms three bonds apart, such as those involving methyl hydrogens, and a previously developed term that affects the system's gyration volume, the new parameters are tested using structure calculations on 30 proteins with restraints derived from nuclear magnetic resonance data. Using modern structure validation criteria, including several formally adopted by the Protein Data Bank, and a clear measure of structural accuracy, the results show superior performance relative to previous Xplor-NIH implementations. Additionally, the Xplor-NIH structures compare favorably against originally determined NMR models.
Affiliation(s)
- Guillermo A. Bermejo
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
- Nico Tjandra
  - Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- G. Marius Clore
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
- Charles D. Schwieters
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
2. Banayan NE, Loughlin BJ, Singh S, Forouhar F, Lu G, Wong K, Neky M, Hunt HS, Bateman LB, Tamez A, Handelman SK, Price WN, Hunt JF. Systematic enhancement of protein crystallization efficiency by bulk lysine-to-arginine (KR) substitution. Protein Sci 2024; 33:e4898. [PMID: 38358135 PMCID: PMC10868448 DOI: 10.1002/pro.4898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 01/01/2024] [Accepted: 01/02/2024] [Indexed: 02/16/2024]
Abstract
Structural genomics consortia established that protein crystallization is the primary obstacle to structure determination using X-ray crystallography. We previously demonstrated that crystallization propensity is systematically related to primary sequence, and we subsequently performed computational analyses showing that arginine is the most overrepresented amino acid in crystal-packing interfaces in the Protein Data Bank. Given the similar physicochemical characteristics of arginine and lysine, we hypothesized that multiple lysine-to-arginine (KR) substitutions should improve crystallization. To test this hypothesis, we developed software that ranks lysine sites in a target protein based on the redundancy-corrected KR substitution frequency in homologs. This software can be run interactively on the World Wide Web at https://www.pxengineering.org/. We demonstrate that three unrelated single-domain proteins can tolerate 5-11 KR substitutions with at most minor destabilization, and, for two of these three proteins, the construct with the largest number of KR substitutions exhibits significantly enhanced crystallization propensity. This approach rapidly produced a 1.9 Å crystal structure of a human protein domain refractory to crystallization with its native sequence. Structures from Bulk KR-substituted domains show that the engineered arginine residues frequently make hydrogen bonds across crystal-packing interfaces. We thus demonstrate that Bulk KR substitution represents a rational and efficient method for probabilistic engineering of protein surface properties to improve crystallization.
Affiliation(s)
- Nooriel E. Banayan
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
- Blaine J. Loughlin
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
- Shikha Singh
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
- Farhad Forouhar
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
- Guanqi Lu
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
- Kam-Ho Wong
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
  - Present address: Vaccine Research and Development, Pfizer Inc., Pearl River, New York, USA
- Matthew Neky
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
  - Present address: Columbia University, New York, New York, USA
- Henry S. Hunt
  - Department of Physics, Stanford University, Stanford, California, USA
- Samuel K. Handelman
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
  - Present address: Department of Pain & Neuronal Health, Eli Lilly & Co., 893 Delaware St, Indianapolis, Indiana, USA
- W. Nicholson Price
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
  - Present address: University of Michigan Law School, Ann Arbor, Michigan, USA
- John F. Hunt
  - Department of Biological Sciences, 702A Sherman Fairchild Center, MC2434, Columbia University, New York, New York, USA
3. Liu T, Huang S, Zhang Q, Xia Y, Zhang M, Sun B. Reconciling ASPP-p53 binding mode discrepancies through an ensemble binding framework that bridges crystallography and NMR data. PLoS Comput Biol 2024; 20:e1011519. [PMID: 38324587 PMCID: PMC10878502 DOI: 10.1371/journal.pcbi.1011519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 02/20/2024] [Accepted: 01/24/2024] [Indexed: 02/09/2024] Open
Abstract
ASPP2 and iASPP bind to p53 through their conserved ANK-SH3 domains to respectively promote and inhibit p53-dependent cell apoptosis. While crystallography has indicated that these two proteins employ distinct surfaces of their ANK-SH3 domains to bind to p53, solution NMR data have suggested similar surfaces. In this study, we employed multi-scale molecular dynamics (MD) simulations combined with free energy calculations to reconcile the discrepancy in the binding modes. We demonstrated that the binding mode based solely on a single crystal structure does not enable iASPP's RT loop to engage with p53's C-terminal linker, a verified interaction. Instead, an ensemble of simulated iASPP-p53 complexes facilitates this interaction. We showed that the ensemble-average inter-protein contacting residues and NMR-detected interfacial residues qualitatively overlap on ASPP proteins, and that the ensemble-average binding free energies match experimental KD values better than a single crystallography-determined binding mode does. For iASPP, the sampled ensemble complexes can be grouped into two classes, resembling the binding modes determined by crystallography and solution NMR. We thus propose that crystal packing shifts the equilibrium of binding modes towards the crystallography-determined one. Lastly, we showed that the ensemble binding complexes are sensitive to p53's intrinsically disordered regions (IDRs), attesting to experimental observations that these IDRs contribute to biological functions. Our results provide a dynamic and ensemble perspective for scrutinizing these important cancer-related protein-protein interactions (PPIs).
Affiliation(s)
- Te Liu
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
- Sichao Huang
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
- Qian Zhang
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
- Yu Xia
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
- Manjie Zhang
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
- Bin Sun
  - Research Center for Pharmacoinformatics, College of Pharmacy, Harbin Medical University, Harbin, China
4. Klukowski P, Damberger FF, Allain FHT, Iwai H, Kadavath H, Ramelot TA, Montelione GT, Riek R, Güntert P. The 100-protein NMR spectra dataset: A resource for biomolecular NMR data analysis. Sci Data 2024; 11:30. [PMID: 38177162 PMCID: PMC10767026 DOI: 10.1038/s41597-023-02879-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 12/22/2023] [Indexed: 01/06/2024] Open
Abstract
Multidimensional NMR spectra are the basis for studying proteins by NMR spectroscopy and crucial for the development and evaluation of methods for biomolecular NMR data analysis. Nevertheless, in contrast to derived data such as chemical shift assignments in the BMRB and protein structures in the PDB databases, this primary data is in general not publicly archived. To change this unsatisfactory situation, we present a standardized set of solution NMR data comprising 1329 2-4-dimensional NMR spectra and associated reference (chemical shift assignments, structures) and derived (peak lists, restraints for structure calculation, etc.) annotations. With the 100-protein NMR spectra dataset that was originally compiled for the development of the ARTINA deep learning-based spectra analysis method, 100 protein structures can be reproduced from their original experimental data. The 100-protein NMR spectra dataset is expected to help the development of computational methods for NMR spectroscopy, in particular machine learning approaches, and enable consistent and objective comparisons of these methods.
Affiliation(s)
- Piotr Klukowski
  - Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland
- Fred F Damberger
  - Institute of Biochemistry, ETH Zurich, 8093, Zurich, Switzerland
- Hideo Iwai
  - Institute of Biotechnology, University of Helsinki, 00100, Helsinki, Finland
- Theresa A Ramelot
  - Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
- Gaetano T Montelione
  - Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
- Roland Riek
  - Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland
- Peter Güntert
  - Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland
  - Institute of Biophysical Chemistry, Goethe University, 60438, Frankfurt am Main, Germany
  - Department of Chemistry, Tokyo Metropolitan University, Hachioji, 192-0397, Tokyo, Japan
5. McBride JM, Polev K, Abdirasulov A, Reinharz V, Grzybowski BA, Tlusty T. AlphaFold2 Can Predict Single-Mutation Effects. Phys Rev Lett 2023; 131:218401. [PMID: 38072605 DOI: 10.1103/physrevlett.131.218401] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Accepted: 09/26/2023] [Indexed: 12/18/2023]
Abstract
AlphaFold2 (AF) is a promising tool, but is it accurate enough to predict single mutation effects? Here, we report that the localized structural deformation between protein pairs differing by only 1-3 mutations, as measured by the effective strain, is correlated across 3901 experimental and AF-predicted structures. Furthermore, analysis of ∼11 000 proteins shows that the local structural change correlates with various phenotypic changes. These findings suggest that AF can predict the range and magnitude of single-mutation effects on average, and we propose a method to improve precision of AF predictions and to indicate when predictions are unreliable.
Affiliation(s)
- John M McBride
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Konstantin Polev
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
  - Department of Biomedical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
- Amirbek Abdirasulov
  - Department of Computer Science and Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
- Bartosz A Grzybowski
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
  - Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
- Tsvi Tlusty
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
  - Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
6. Lubecka EA, Liwo A. A coarse-grained approach to NMR-data-assisted modeling of protein structures. J Comput Chem 2022; 43:2047-2059. [PMID: 36134668 DOI: 10.1002/jcc.27003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Revised: 08/03/2022] [Accepted: 09/05/2022] [Indexed: 11/06/2022]
Abstract
The ESCASA algorithm for analytical estimation of proton positions from coarse-grained geometry developed in our recent work has been implemented in modeling protein structures with the highly coarse-grained UNRES model of polypeptide chains (two sites per residue) and nuclear magnetic resonance (NMR) data. A penalty function with the shape of intersecting gorges was applied to treat ambiguous distance restraints, which automatically selects consistent restraints. Hamiltonian replica exchange molecular dynamics was used to carry out the conformational search. The method was tested with both unambiguous and ambiguous restraints producing good-quality models with GDT_TS from 7.4 units higher to 14.4 units lower than those obtained with the CYANA or MELD software for protein-structure determination from NMR data at the all-atom resolution. The method can thus be applied in modeling the structures of flexible proteins, for which extensive conformational search enabled by coarse-graining is more important than high modeling accuracy.
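The GDT_TS score quoted in the abstract above can be illustrated with a minimal sketch. It assumes the model and reference are already superimposed and that per-residue Cα deviations (in Å) are available. The full GDT procedure searches over many superpositions to maximize each fraction, which this simplified version does not do. The function name `gdt_ts` and the example values are hypothetical.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Return a GDT_TS-style score (0-100) from per-residue C-alpha deviations.

    Simplification (assumption): a single, fixed superposition is used, whereas
    the full GDT algorithm maximizes each fraction over many superpositions.
    """
    if not distances:
        raise ValueError("need at least one per-residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff (in angstroms).
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    # GDT_TS is the mean of the four fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical deviations for a five-residue toy model.
example_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
example_score = gdt_ts(example_distances)
```

Because the score is a percentage, a difference of "7.4 units" between two methods corresponds to 7.4 percentage points of this average.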
Affiliation(s)
- Emilia A Lubecka
  - Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
- Adam Liwo
  - Faculty of Chemistry, University of Gdańsk, Gdańsk, Poland
7. Grigas AT, Liu Z, Regan L, O'Hern CS. Core packing of well-defined X-ray and NMR structures is the same. Protein Sci 2022; 31:e4373. [PMID: 35900019 PMCID: PMC9277709 DOI: 10.1002/pro.4373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2022] [Revised: 05/06/2022] [Accepted: 06/02/2022] [Indexed: 11/10/2022]
Abstract
Numerous studies have investigated the differences and similarities between protein structures determined by solution NMR spectroscopy and those determined by X-ray crystallography. A fundamental question is whether any observed differences are due to differing methodologies or to differences in the behavior of proteins in solution versus in the crystalline state. Here, we compare the properties of the hydrophobic cores of high-resolution protein crystal structures and those in NMR structures, determined using increasing numbers and types of restraints. Prior studies have reported that many NMR structures have denser cores compared with those of high-resolution X-ray crystal structures. Our current work investigates this result in more detail and finds that these NMR structures tend to violate basic features of protein stereochemistry, such as small non-bonded atomic overlaps and few Ramachandran and sidechain dihedral angle outliers. We find that NMR structures solved with more restraints, and which do not significantly violate stereochemistry, have hydrophobic cores that have a similar size and packing fraction as their counterparts determined by X-ray crystallography at high resolution. These results lead us to conclude that, at least regarding the core packing properties, high-quality structures determined by NMR and X-ray crystallography are the same, and the differences reported earlier are most likely a consequence of methodology, rather than fundamental differences between the protein in the two different environments.
Affiliation(s)
- Alex T. Grigas
  - Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
- Zhuoyi Liu
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, USA
- Lynne Regan
  - Institute of Quantitative Biology, Biochemistry and Biotechnology, Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Corey S. O'Hern
  - Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
  - Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, USA
  - Department of Physics, Yale University, New Haven, Connecticut, USA
  - Department of Applied Physics, Yale University, New Haven, Connecticut, USA
8. Tejero R, Huang YJ, Ramelot TA, Montelione GT. AlphaFold Models of Small Proteins Rival the Accuracy of Solution NMR Structures. Front Mol Biosci 2022; 9:877000. [PMID: 35769913 PMCID: PMC9234698 DOI: 10.3389/fmolb.2022.877000] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/16/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Recent advances in molecular modeling using deep learning have the potential to revolutionize the field of structural biology. In particular, AlphaFold has been observed to provide models of protein structures with accuracies rivaling medium-resolution X-ray crystal structures, and with excellent atomic coordinate matches to experimental protein NMR and cryo-electron microscopy structures. Here we assess the hypothesis that AlphaFold models of small, relatively rigid proteins have accuracies (based on comparison against experimental data) similar to experimental solution NMR structures. We selected six representative small proteins with structures determined by both NMR and X-ray crystallography, and modeled each of them using AlphaFold. Using several structure validation tools integrated under the Protein Structure Validation Software suite (PSVS), we then assessed how well these models fit to experimental NMR data, including NOESY peak lists (RPF-DP scores), comparisons between predicted rigidity and chemical shift data (ANSURR scores), and 15N-1H residual dipolar coupling data (RDC Q factors) analyzed by software tools integrated in the PSVS suite. Remarkably, the fits to NMR data for the protein structure models predicted with AlphaFold are generally similar, or better, than for the corresponding experimental NMR or X-ray crystal structures. Similar conclusions were reached in comparing AlphaFold2 predictions and NMR structures for three targets from the Critical Assessment of Protein Structure Prediction (CASP). These results contradict the widely held misperception that AlphaFold cannot accurately model solution NMR structures. They also document the value of PSVS for model vs. data assessment of protein NMR structures, and the potential for using AlphaFold models for guiding analysis of experimental NMR data and more generally in structural biology.
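As an illustration of the RDC Q factor mentioned in the abstract above, the sketch below uses one commonly cited definition: the root-mean-square difference between observed and back-calculated couplings, divided by the root-mean-square of the observed couplings. This is an assumption for illustration only. Other normalizations exist, and the abstract does not state which one PSVS applies. The function name `rdc_q_factor` is hypothetical.

```python
import math


def rdc_q_factor(observed, calculated):
    """Q = rms(D_obs - D_calc) / rms(D_obs) for paired RDC values in Hz.

    Assumption: this is one common normalization; alternatives (for example,
    normalizing by alignment-tensor parameters) are not covered here.
    """
    if len(observed) != len(calculated) or not observed:
        raise ValueError("observed and calculated must be equal-length, non-empty")
    numerator = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    denominator = sum(o * o for o in observed)
    if denominator == 0:
        raise ValueError("observed couplings are all zero")
    # The 1/N factors of the two rms values cancel, leaving a ratio of sums.
    return math.sqrt(numerator / denominator)


# Hypothetical 15N-1H couplings: a perfect fit gives Q = 0.
perfect_fit_q = rdc_q_factor([10.0, -5.0, 3.0], [10.0, -5.0, 3.0])
```

Lower Q values indicate closer agreement between the model and the measured couplings.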
Affiliation(s)
- Roberto Tejero
  - Departamento de Química Física, Universidad de Valencia, Valencia, Spain
- Yuanpeng Janet Huang
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- Theresa A. Ramelot
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- Gaetano T. Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- *Correspondence: Roberto Tejero; Gaetano T. Montelione
9. The accuracy of protein structures in solution determined by AlphaFold and NMR. Structure 2022; 30:925-933.e2. [DOI: 10.1016/j.str.2022.04.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/01/2022] [Revised: 03/18/2022] [Accepted: 04/13/2022] [Indexed: 02/05/2023]
10. Lubecka EA, Liwo A. ESCASA: Analytical estimation of atomic coordinates from coarse-grained geometry for nuclear-magnetic-resonance-assisted protein structure modeling. I. Backbone and Hβ protons. J Comput Chem 2021; 42:1579-1589. [PMID: 34048074 DOI: 10.1002/jcc.26695] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 05/06/2021] [Accepted: 05/11/2021] [Indexed: 12/13/2022]
Abstract
A method for the estimation of coordinates of atoms in proteins from coarse-grained geometry by simple analytical formulas (ESCASA), for use in nuclear-magnetic-resonance (NMR) data-assisted coarse-grained simulations of proteins, is proposed. In this paper, the formulas for the backbone Hα and amide (HN) protons, and the side-chain Hβ protons, given the Cα-trace, have been derived and parameterized by using the interproton distances calculated from a set of 140 high-resolution non-homologous protein structures. The mean standard deviation over all types of proton pairs in the set was 0.44 Å after fitting. Validation against a set of 41 proteins with NMR-determined structures, which were not considered in parameterization, resulted in average standard deviation from average proton-proton distances of the NMR-determined structures of 0.25 Å, compared to 0.21 Å obtained with the PULCHRA all-atom-chain reconstruction algorithm and to the 0.12 Å standard deviation of the average-structure proton-proton distance of NMR-determined ensembles. The formulas provide analytical forces and can, therefore, be used in coarse-grained molecular dynamics.
Affiliation(s)
- Emilia A Lubecka
  - Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
- Adam Liwo
  - Faculty of Chemistry, University of Gdańsk, Gdańsk, Poland
11. Prestegard JH. A perspective on the PDB's impact on the field of glycobiology. J Biol Chem 2021; 296:100556. [PMID: 33744289 PMCID: PMC8058564 DOI: 10.1016/j.jbc.2021.100556] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/28/2020] [Revised: 03/07/2021] [Accepted: 03/16/2021] [Indexed: 12/12/2022] Open
Abstract
Structures deposited in the Protein Data Bank (PDB) facilitate our understanding of many biological processes including those that fall under the general category of glycobiology. However, structure-based studies of how glycans affect protein structure, how they are synthesized, and how they regulate other biological processes remain challenging. Despite the abundant presence of glycans on proteins and the dense layers of glycans that surround most of our cells, structures containing glycans are underrepresented in the PDB. There are sound reasons for this, including difficulties in producing proteins with well-defined glycosylation and the tendency of mobile and heterogeneous glycans to inhibit crystallization. Nevertheless, the structures we do find in the PDB, even some of the earliest deposited structures, have had an impact on our understanding of function. I highlight a few examples in this review and point to some promises for the future. Promises include new structures from methodologies, such as cryo-EM, that are less affected by the presence of glycans and experiment-aided computational methods that build on existing structures to provide insight into the many ways glycans affect biological function.
Affiliation(s)
- James H Prestegard
  - Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia, USA
12. Reinknecht C, Riga A, Rivera J, Snyder DA. Patterns in Protein Flexibility: A Comparison of NMR "Ensembles", MD Trajectories, and Crystallographic B-Factors. Molecules 2021; 26:molecules26051484. [PMID: 33803249 PMCID: PMC7967184 DOI: 10.3390/molecules26051484] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/31/2020] [Revised: 02/18/2021] [Accepted: 02/28/2021] [Indexed: 11/16/2022] Open
Abstract
Proteins are molecular machines requiring flexibility to function. Crystallographic B-factors and Molecular Dynamics (MD) simulations both provide insights into protein flexibility on an atomic scale. Nuclear Magnetic Resonance (NMR) lacks a universally accepted analog of the B-factor. However, a lack of convergence in atomic coordinates in an NMR-based structure calculation also suggests atomic mobility. This paper describes a pattern in the coordinate uncertainties of backbone heavy atoms in NMR-derived structural “ensembles” first noted in the development of FindCore2 (previously called Expanded FindCore: DA Snyder, J Grullon, YJ Huang, R Tejero, GT Montelione, Proteins: Structure, Function, and Bioinformatics 82 (S2), 219–230) and demonstrates that this pattern exists in coordinate variances across MD trajectories but not in crystallographic B-factors. This either suggests that MD trajectories and NMR “ensembles” capture motional behavior of peptide bond units not captured by B-factors or indicates a deficiency common to force fields used in both NMR and MD calculations.
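The notion of coordinate variance across an NMR "ensemble" or an MD trajectory, referred to in the abstract above, can be sketched as a per-atom positional variance. This is a minimal illustration, not the FindCore2 procedure. It assumes the models are already superimposed on a common frame and that each model lists the same atoms in the same order. The function name `per_atom_variance` and the toy coordinates are hypothetical.

```python
def per_atom_variance(models):
    """Return the positional variance (in squared length units) of each atom.

    `models` is a list of models; each model is a list of (x, y, z) tuples.
    Assumption: models are pre-superimposed and share atom ordering.
    """
    if not models or not models[0]:
        raise ValueError("need at least one model with at least one atom")
    n_models = len(models)
    n_atoms = len(models[0])
    if any(len(m) != n_atoms for m in models):
        raise ValueError("all models must contain the same number of atoms")
    variances = []
    for i in range(n_atoms):
        # Mean position of atom i over all models.
        mean = [sum(m[i][k] for m in models) / n_models for k in range(3)]
        # Mean squared displacement of atom i from its mean position.
        msd = sum(
            sum((m[i][k] - mean[k]) ** 2 for k in range(3)) for m in models
        ) / n_models
        variances.append(msd)
    return variances


# Two toy models: atom 0 is fixed, atom 1 moves by 2 units along x.
toy_models = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
toy_variances = per_atom_variance(toy_models)
```

Atoms with larger variance are less well converged across the ensemble, which is the kind of quantity the abstract compares with crystallographic B-factors.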
13. Mei Z, Treado JD, Grigas AT, Levine ZA, Regan L, O'Hern CS. Analyses of protein cores reveal fundamental differences between solution and crystal structures. Proteins 2020; 88:1154-1161. [PMID: 32105366 PMCID: PMC7415476 DOI: 10.1002/prot.25884] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/27/2019] [Revised: 02/05/2020] [Accepted: 02/23/2020] [Indexed: 12/20/2022]
Abstract
There have been several studies suggesting that protein structures solved by NMR spectroscopy and X-ray crystallography show significant differences. To understand the origin of these differences, we assembled a database of high-quality protein structures solved by both methods. We also find significant differences between NMR and crystal structures in the root-mean-square deviations of the Cα atomic positions, identities of core amino acids, backbone and side-chain dihedral angles, and packing fraction of core residues. In contrast to prior studies, we identify the physical basis for these differences by modeling protein cores as jammed packings of amino acid-shaped particles. We find that we can tune the jammed packing fraction by varying the degree of thermalization used to generate the packings. For an athermal protocol, we find that the average jammed packing fraction is identical to that observed in the cores of protein structures solved by X-ray crystallography. In contrast, highly thermalized packing-generation protocols yield jammed packing fractions that are even higher than those observed in NMR structures. These results indicate that thermalized systems can pack more densely than athermal systems, which suggests a physical basis for the structural differences between protein structures solved by NMR and X-ray crystallography.
Affiliation(s)
- Zhe Mei
  - Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
  - Department of Chemistry, Yale University, New Haven, Connecticut
- John D Treado
  - Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
  - Department of Mechanical Engineering & Materials Science, Yale University, New Haven, Connecticut
- Alex T Grigas
  - Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
  - Graduate Program in Computational Biology & Bioinformatics, Yale University, New Haven, Connecticut
- Zachary A Levine
  - Department of Pathology, Yale University, New Haven, Connecticut
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
- Lynne Regan
  - Institute of Quantitative Biology, Biochemistry and Biotechnology, Center for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Corey S O'Hern
  - Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
  - Department of Mechanical Engineering & Materials Science, Yale University, New Haven, Connecticut
  - Department of Physics, Yale University, New Haven, Connecticut
  - Department of Applied Physics, Yale University, New Haven, Connecticut
14. Wei X, Li ZC, Li SJ, Peng XB, Zhao Q. Protein structure determination using a Riemannian approach. FEBS Lett 2019; 594:1036-1051. [PMID: 31769509 DOI: 10.1002/1873-3468.13688] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/23/2019] [Revised: 10/31/2019] [Accepted: 11/14/2019] [Indexed: 11/05/2022]
Abstract
Protein NMR structure determination is one of the most extensively studied problems. Here, we adopt a novel method based on a matrix completion technique, the Riemannian approach, to rebuild the protein structure from the nuclear Overhauser effect distance restraints and the dihedral angle restraints. In comparison with the CYANA method, the results generated via the Riemannian approach are more similar to the standard X-ray crystallographic structures as a result of the simple but powerful internal calculation processing function. In addition, our results demonstrate that the Riemannian approach has a comparable or even better performance than the CYANA method on other structural assessment metrics, including the stereochemical quality and restraint violations. The Riemannian approach software is available at: https://github.com/xubiaopeng/Protein_Recon_MCRiemman.
Affiliation(s)
- Xian Wei
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China; Department of Science, Taiyuan Institute of Technology, China
| | - Zhi-Cheng Li
- Department of Physics, Taiyuan Normal University, China
| | - Shi-Jian Li
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
| | - Xu-Biao Peng
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
| | - Qing Zhao
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
| |
|
15
|
Huang YJ, Brock KP, Sander C, Marks DS, Montelione GT. A Hybrid Approach for Protein Structure Determination Combining Sparse NMR with Evolutionary Coupling Sequence Data. Adv Exp Med Biol 2018; 1105:153-169. [PMID: 30617828 DOI: 10.1007/978-981-13-2200-6_10] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/04/2022]
Abstract
While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Kelly P Brock
- cBio Center, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| |
|
16
|
Koehler Leman J, D'Avino AR, Bhatnagar Y, Gray JJ. Comparison of NMR and crystal structures of membrane proteins and computational refinement to improve model quality. Proteins 2017; 86:57-74. [PMID: 29044728 DOI: 10.1002/prot.25402] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/13/2017] [Revised: 09/27/2017] [Accepted: 10/11/2017] [Indexed: 12/29/2022]
Abstract
Membrane proteins are challenging to study and restraints for structure determination are typically sparse or of low resolution because the membrane environment that surrounds them leads to a variety of experimental challenges. When membrane protein structures are determined by different techniques in different environments, a natural question is "which structure is most biologically relevant?" Towards answering this question, we compiled a dataset of membrane proteins with known structures determined by both solution NMR and X-ray crystallography. By investigating differences between the structures, we found that RMSDs between crystal and NMR structures are below 5 Å in the membrane region, NMR ensembles have a higher convergence in the membrane region, crystal structures typically have a straighter transmembrane region, have higher stereo-chemical correctness, and are more tightly packed. After quantifying these differences, we used high-resolution refinement of the NMR structures to mitigate them, which paves the way for identifying and improving the structural quality of membrane proteins.
Affiliation(s)
- Julia Koehler Leman
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland; Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, New York
| | - Andrew R D'Avino
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland; Department of Biology, Johns Hopkins University, Baltimore, Maryland
| | - Yash Bhatnagar
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| |
|
17
|
Faraggi E, Dunker AK, Sussman JL, Kloczkowski A. Comparing NMR and X-ray protein structure: Lindemann-like parameters and NMR disorder. J Biomol Struct Dyn 2017; 36:2331-2341. [PMID: 28714803 DOI: 10.1080/07391102.2017.1352539] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/22/2023]
Abstract
Disordered protein chains and segments are fast becoming a major pathway for our understanding of biological function, especially in more evolved species. However, the standard definition of disordered residues, the inability to constrain them in X-ray derived structures, is not easily applied to NMR derived structures. We carry out a statistical comparison between proteins whose structure was resolved using NMR and using X-ray protocols. We start by establishing a connection between these two protocols for obtaining protein structure. We find a close statistical correspondence between NMR and X-ray structures if fluctuations inherent to the NMR protocol are taken into account. Intuitively this tends to lend support to the validity of both NMR and X-ray protocols in deriving biomolecular models that correspond to in vivo conditions. We then establish Lindemann-like parameters for NMR derived structures and examine what order/disorder cutoffs for these parameters are most consistent with X-ray data and how consistent they are. Finally, we find a critical value of [Formula: see text] for the best correspondence between X-ray and NMR derived order/disorder assignment, judged by maximizing the Matthews correlation, and a critical value of [Formula: see text] if a balance between false positive and false negative prediction is sought. We examine a few non-conforming cases, and examine the origin of the structure derived in X-ray. This study could help in assigning meaningful disorder from NMR experiments.
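As a rough illustration of the cutoff selection mentioned in this abstract, the following sketch picks the threshold that maximizes the Matthews correlation between a per-residue score and a reference order/disorder labeling. It is not the authors' code: the function names, the generic `scores` input (standing in for a Lindemann-like parameter), and the candidate cutoffs are all assumptions made for illustration.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_cutoff(scores, is_disordered, cutoffs):
    """Return (cutoff, mcc) for the candidate cutoff with the highest MCC.

    Hypothetical convention: a residue is predicted disordered when its
    score exceeds the cutoff; `is_disordered` holds the reference labels
    (for example, labels derived from an X-ray structure).
    """
    best = None
    for cutoff in cutoffs:
        tp = tn = fp = fn = 0
        for score, truth in zip(scores, is_disordered):
            predicted = score > cutoff
            if predicted and truth:
                tp += 1
            elif predicted and not truth:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
        mcc = matthews_corrcoef(tp, tn, fp, fn)
        if best is None or mcc > best[1]:
            best = (cutoff, mcc)
    return best
```

Seeking a balance between false positives and false negatives instead, as the abstract also mentions, would amount to replacing the MCC criterion with one that minimizes the difference between `fp` and `fn`.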
Affiliation(s)
- Eshel Faraggi
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA; Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA; Research and Information Systems, LLC, Carmel, IN 46032, USA
| | - A Keith Dunker
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Joel L Sussman
- Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Andrzej Kloczkowski
- Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA; Department of Pediatrics, The Ohio State University, Columbus, OH 43215, USA; Kavli Institute for Theoretical Physics China, Chinese Academy of Sciences, Beijing 100190, China
| |
|
18
|
Gao Q, Chalmers GR, Moremen KW, Prestegard JH. NMR assignments of sparsely labeled proteins using a genetic algorithm. J Biomol NMR 2017; 67:283-294. [PMID: 28289927 PMCID: PMC5434516 DOI: 10.1007/s10858-017-0101-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/30/2016] [Accepted: 02/22/2017] [Indexed: 05/16/2023]
Abstract
Sparse isotopic labeling of proteins for NMR studies using single types of amino acid (15N or 13C enriched) has several advantages. Resolution is enhanced by reducing numbers of resonances for large proteins, and isotopic labeling becomes economically feasible for glycoproteins that must be expressed in mammalian cells. However, without access to the traditional triple resonance strategies that require uniform isotopic labeling, NMR assignment of crosspeaks in heteronuclear single quantum coherence (HSQC) spectra is challenging. We present an alternative strategy which combines readily accessible NMR data with known protein domain structures. Based on the structures, chemical shifts are predicted, NOE cross-peak lists are generated, and residual dipolar couplings (RDCs) are calculated for each labeled site. Simulated data are then compared to measured values for a trial set of assignments and scored. A genetic algorithm uses the scores to search for an optimal pairing of HSQC crosspeaks with labeled sites. While none of the individual data types can give a definitive assignment for a particular site, their combination can in most cases. Four test proteins previously assigned using triple resonance methods and a sparsely labeled glycosylated protein, Robo1, previously assigned by manual analysis, are used to validate the method and develop a criterion for identifying sites assigned with high confidence.
Affiliation(s)
- Qi Gao
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, 30602, USA
| | - Gordon R Chalmers
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, 30602, USA
- Department of Computer Science and Complex Carbohydrate Research Center, University of Georgia, Athens, GA, 30602, USA
| | - Kelley W Moremen
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, 30602, USA
| | - James H Prestegard
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, 30602, USA
| |
|
19
|
The impact of structural genomics: the first quindecennial. J Struct Funct Genomics 2016; 17:1-16. [PMID: 26935210 DOI: 10.1007/s10969-016-9201-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 12/12/2015] [Accepted: 02/17/2016] [Indexed: 12/21/2022]
Abstract
The period 2000-2015 brought the advent of high-throughput approaches to protein structure determination. With overall funding on the order of $2 billion (in 2010 dollars), the structural genomics (SG) consortia established worldwide have developed pipelines for target selection, protein production, sample preparation, crystallization, and structure determination by X-ray crystallography and NMR. These efforts resulted in the determination of over 13,500 protein structures, mostly from unique protein families, and increased the structural coverage of the expanding protein universe. SG programs contributed over 4400 publications to the scientific literature. The NIH-funded Protein Structure Initiatives alone have produced over 2000 scientific publications, which to date have attracted more than 93,000 citations. Software and database developments that were necessary to handle high-throughput structure determination workflows have led to structures of better quality and improved integrity of the associated data. Organized and accessible data have a positive impact on the reproducibility of scientific experiments. Most of the experimental data generated by the SG centers are freely available to the community and have been utilized by scientists in various fields of research. SG projects have created, improved, streamlined, and validated many protocols for protein production and crystallization, data collection, and functional analysis, significantly benefiting biological and biomedical research.
|