1
Flachsenberg F, Ehrt C, Gutermuth T, Rarey M. Redocking the PDB. J Chem Inf Model 2024; 64:219-237. [PMID: 38108627 DOI: 10.1021/acs.jcim.3c01573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Molecular docking is a standard technique in structure-based drug design (SBDD). It aims to predict the 3D structure of a small molecule in the binding site of a receptor (often a protein). Despite being a common technique, it often necessitates multiple tools and involves manual steps. Here, we present the JAMDA preprocessing and docking workflow that is easy to use and allows fully automated docking. We evaluate the JAMDA docking workflow on binding sites extracted from the complete PDB and derive key factors determining JAMDA's docking performance. With that, we try to remove most of the bias due to manual intervention and provide a realistic estimate of the redocking performance of our JAMDA preprocessing and docking workflow for any PDB structure. On this large PDBScan22 data set, our JAMDA workflow finds a pose with an RMSD of at most 2 Å to the crystal ligand on the top rank for 30.1% of the structures. When applying objective structure quality filters to the PDBScan22 data set, the success rate increases to 61.8%. Given the prepared structures from the JAMDA preprocessing pipeline, both JAMDA and the widely used AutoDock Vina perform comparably on this filtered data set (the PDBScan22-HQ data set).
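The success criterion quoted in this abstract (a top-ranked pose within 2 Å RMSD of the crystal ligand) can be illustrated with a minimal sketch. It assumes that both poses list the same atoms in the same order and ignores ligand symmetry; the function names `rmsd` and `top_rank_success_rate` are hypothetical and are not part of JAMDA or AutoDock Vina.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation (in Å) between two equally ordered lists of (x, y, z) atoms."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum squared differences over every coordinate of every atom pair.
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(pose, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(pose))


def top_rank_success_rate(top_rank_rmsds, threshold=2.0):
    """Fraction of structures whose top-ranked pose has an RMSD of at most `threshold` Å."""
    if not top_rank_rmsds:
        raise ValueError("need at least one RMSD value")
    hits = sum(1 for value in top_rank_rmsds if value <= threshold)
    return hits / len(top_rank_rmsds)
```

Calling `top_rank_success_rate` on a list with one top-rank RMSD per redocked structure gives the kind of percentage reported above. A real evaluation would typically also account for symmetry-equivalent atom mappings.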
Affiliation(s)
- Florian Flachsenberg
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Torben Gutermuth
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
2
Revillo Imbernon J, Chiesa L, Kellenberger E. Mining the Protein Data Bank to inspire fragment library design. Front Chem 2023; 11:1089714. [PMID: 36846858 PMCID: PMC9950109 DOI: 10.3389/fchem.2023.1089714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 01/27/2023] [Indexed: 02/12/2023]
Abstract
The fragment approach has emerged as a method of choice for drug design, as it allows difficult therapeutic targets to be addressed. Success lies in the choice of the screened chemical library and the biophysical screening method, and also in the quality of the selected fragment and structural information used to develop a drug-like ligand. It has recently been proposed that promiscuous compounds, i.e., those that bind to several proteins, present an advantage for the fragment approach because they are likely to give frequent hits in screening. In this study, we searched the Protein Data Bank for fragments with multiple binding modes and targeting different sites. We identified 203 fragments represented by 90 scaffolds, some of which are not or hardly present in commercial fragment libraries. By contrast to other available fragment libraries, the studied set is enriched in fragments with a marked three-dimensional character (download at 10.5281/zenodo.7554649).
Affiliation(s)
- Julia Revillo Imbernon
- Laboratoire d’Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS Université de Strasbourg, Illkirch-Graffenstaden, France
- Luca Chiesa
- Laboratoire d’Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS Université de Strasbourg, Illkirch-Graffenstaden, France
3
Gucwa M, Lenkiewicz J, Zheng H, Cymborowski M, Cooper DR, Murzyn K, Minor W. CMM-An enhanced platform for interactive validation of metal binding sites. Protein Sci 2023; 32:e4525. [PMID: 36464767 PMCID: PMC9794025 DOI: 10.1002/pro.4525] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 09/15/2022] [Revised: 11/21/2022] [Accepted: 11/22/2022] [Indexed: 12/12/2022]
Abstract
Metal ions bound to macromolecules play an integral role in many cellular processes. They can directly participate in catalytic mechanisms or be essential for the structural integrity of proteins and nucleic acids. However, their unique nature in macromolecules can make them difficult to model and refine, and a substantial portion of metal ions in the PDB are misidentified or poorly refined. CheckMyMetal (CMM) is a validation tool that has gained widespread acceptance as an essential tool for researchers working on metal-macromolecule complexes. CMM can be used during structure determination or to validate metal binding sites in structural models within the PDB. The functionalities of CMM have recently been greatly enhanced and provide researchers with additional information that can guide modeling decisions. The new version of CMM shows metals in the context of electron density maps and allows for on-the-fly refinement of metal binding sites. The improvements should increase the reproducibility of biomedical research. The web server is available at https://cmm.minorlab.org.
Affiliation(s)
- Michal Gucwa
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA; Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Joanna Lenkiewicz
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Heping Zheng
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA. Present address: Hunan University College of Biology, Bioinformatics Center, Hunan, People's Republic of China
- Marcin Cymborowski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- David R. Cooper
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Krzysztof Murzyn
- Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
4
Andersson I, Carlsson GH, Hasse D. Structural Analysis of Strigolactone-Related Gene Products. Methods Mol Biol 2021; 2309:245-257. [PMID: 34028692 DOI: 10.1007/978-1-0716-1429-7_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Structural knowledge of biological macromolecules is essential for understanding their function and for modifying that function by engineering. Protein crystallography is a powerful method for elucidating molecular structures of proteins, but it is essential that the investigator has a basic knowledge of good practices and of the major pitfalls in the technique. Here we describe issues specific for the case of structural studies of strigolactone (SL) receptor structure and function, and in particular the difficulties associated with capturing complexes of SL receptors with the SL hormone ligand in the crystal.
Affiliation(s)
- Inger Andersson
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden; Arctic University of Norway, Tromsø, Norway; Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Vestec, Czech Republic
- Gunilla H Carlsson
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Dirk Hasse
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
5
'All That Glitters Is Not Gold': High-Resolution Crystal Structures of Ligand-Protein Complexes Need Not Always Represent Confident Binding Poses. Int J Mol Sci 2021; 22:ijms22136830. [PMID: 34202053 PMCID: PMC8268033 DOI: 10.3390/ijms22136830] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/19/2021] [Revised: 05/24/2021] [Accepted: 05/24/2021] [Indexed: 01/09/2023]
Abstract
Our understanding of the structure–function relationships of biomolecules and thereby applying it to drug discovery programs are substantially dependent on the availability of the structural information of ligand–protein complexes. However, the correct interpretation of the electron density of a small molecule bound to a crystal structure of a macromolecule is not trivial. Our analysis involving quality assessment of ~0.28 million small molecule–protein binding site pairs derived from crystal structures corresponding to ~66,000 PDB entries indicates that the majority (65%) of the pairs might need little (54%) or no (11%) attention. Out of the remaining 35% of pairs that need attention, 11% of the pairs (including structures with high/moderate resolution) pose serious concerns. Unfortunately, most users of crystal structures lack the training to evaluate the quality of a crystal structure against its experimental data and, in general, rely on the resolution as a ‘gold standard’ quality metric. Our work aims to sensitize the non-crystallographers that resolution, which is a global quality metric, need not be an accurate indicator of local structural quality. In this article, we demonstrate the use of several freely available tools that quantify local structural quality and are easy to use from a non-crystallographer’s perspective. We further propose a few solutions for consideration by the scientific community to promote quality research in structural biology and applied areas.
6
Gee CL, Holton JM, McPherson A. Structures of two novel crystal forms of Aspergillus oryzae alpha amylase (taka-amylase). J Biosci Bioeng 2021; 131:605-612. [PMID: 33814275 PMCID: PMC8187280 DOI: 10.1016/j.jbiosc.2021.02.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/03/2020] [Revised: 02/12/2021] [Accepted: 02/16/2021] [Indexed: 01/22/2023]
Abstract
The structures of Aspergillus oryzae α-amylase were determined in a tetragonal crystal, having one molecule as asymmetric unit, and a monoclinic crystal with two molecules as asymmetric unit. Both crystal forms were obtained from trace contaminants of an old commercial lipase preparation. Structures were determined and refined to 1.65 Å and 1.43 Å resolution respectively. The latter crystal has a non-crystallographic (NCS) twofold axis within the asymmetric unit. Glycosylation at Asn197 is evident, and in the tetragonal crystal can be seen to include three, partially disordered sugar residues following the initial N-acetyl glucosamine (NAG). Superposition of the tetragonal crystal model on the α-amylases from Bacillus subtilis (PDB:1BAG), pig pancreas (PDB:3L2L), and barley (PDB:1AMY), show a high degree of coincidence, particularly for the (β/α)8-barrel domains, and especially within the active site. Using this structural agreement between amylases, we extrapolated the binding model of a six residue, limit dextrin found in pig pancreas α-amylase to the A. oryzae enzyme model, which predicts substrate interacting amino acid residues.
Affiliation(s)
- Christine L Gee
- Department of Molecular and Cell Biology and Howard Hughes Medical Institute, University of California, Stanley Hall 527, Berkeley, CA 94720-3220, USA
- James M Holton
- Department of Biochemistry and Biophysics, UC San Francisco, San Francisco, CA 94158, USA; Department of Molecular Biophysics and Integrated Bioimaging, Advanced Light Source, MS-2108, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA; Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Alexander McPherson
- Department of Molecular Biology and Biochemistry, University of California, 3205 McGaugh Hall, Irvine, CA 92697-3900, USA
7
Perez AM, Wolfe JA, Schermerhorn JT, Qian Y, Cela BA, Kalinowski CR, Largoza GE, Fields PA, Brandt GS. Thermal stability and structure of glyceraldehyde-3-phosphate dehydrogenase from the coral Acropora millepora. RSC Adv 2021; 11:10364-10374. [PMID: 35423531 PMCID: PMC8695597 DOI: 10.1039/d0ra10119b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2020] [Accepted: 03/03/2021] [Indexed: 11/21/2022]
Abstract
Corals are vulnerable to increasing ocean temperatures. It is known that elevated temperatures lead to the breakdown of an essential mutualistic relationship with photosynthetic algae. The molecular mechanisms of this temperature-dependent loss of symbiosis are less well understood. Here, the thermal stability of a critical metabolic enzyme, glyceraldehyde-3-phosphate dehydrogenase, from the stony coral Acropora millepora was found to increase significantly in the presence of its cofactor NAD+. Determination of the structure of the cofactor-enzyme complex (PDB ID 6PX2) revealed variable NAD+ occupancy across the four monomers of the tetrameric enzyme. The structure of the fully occupied monomers was compared to those with partial cofactor occupancy, identifying regions of difference that may account for the increased thermal stability.
Affiliation(s)
- Astrid M Perez
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA 17604, USA
- Jacob A Wolfe
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA 17604, USA
- Janse T Schermerhorn
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA 17604, USA
- Department of Biology, Franklin & Marshall College, Lancaster, PA 17604, USA
- Yiwen Qian
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA 17604, USA
- Bekim A Cela
- Department of Biology, Franklin & Marshall College, Lancaster, PA 17604, USA
- Cody R Kalinowski
- Department of Biology, Franklin & Marshall College, Lancaster, PA 17604, USA
- Garrett E Largoza
- Department of Biology, Franklin & Marshall College, Lancaster, PA 17604, USA
- Peter A Fields
- Department of Biology, Franklin & Marshall College, Lancaster, PA 17604, USA
- Gabriel S Brandt
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA 17604, USA
8
Chachulski L, Windshügel B. LEADS-FRAG: A Benchmark Data Set for Assessment of Fragment Docking Performance. J Chem Inf Model 2020; 60:6544-6554. [PMID: 33289563 DOI: 10.1021/acs.jcim.0c00693] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Fragment-based drug design is a popular approach in drug discovery, which makes use of computational methods such as molecular docking. To assess fragment placement performance of molecular docking programs, we constructed LEADS-FRAG, a benchmark data set containing 93 high-quality protein-fragment complexes that were selected from the Protein Data Bank using a rational and unbiased process. The data set contains fully prepared protein and fragment structures and is publicly available. Moreover, we used LEADS-FRAG for evaluating the small-molecule docking programs AutoDock, AutoDock Vina, FlexX, and GOLD for their fragment docking performance. GOLD in combination with the scoring function ChemPLP and AutoDock Vina performed best and generated near-native conformations (root mean square deviation <1.5 Å) for more than 50% of the data set considering the top-ranked docking pose. Taking into account all docking poses, the tested programs generated near-native conformations for up to 86% of the fragments in LEADS-FRAG. By rescoring all docking poses with the GOLD scoring functions and the Protein-Ligand Informatics force field, the number of near-native conformations increased up to 40% with respect to the top-rescored poses. Our results show that conventional small-molecule docking programs achieve a satisfactory fragment docking performance when utilizing rescoring.
Affiliation(s)
- Laura Chachulski
- Fraunhofer Institute for Molecular Biology and Applied Ecology IME, ScreeningPort, Hamburg 22525, Germany; Jacobs University Bremen gGmbH, Bremen 28759, Germany
- Björn Windshügel
- Fraunhofer Institute for Molecular Biology and Applied Ecology IME, ScreeningPort, Hamburg 22525, Germany; Institute for Biochemistry and Molecular Biology, Department of Chemistry, Universität Hamburg, Hamburg 20146, Germany
9
Rochira W, Agirre J. Iris: Interactive all-in-one graphical validation of 3D protein model iterations. Protein Sci 2020; 30:93-107. [PMID: 32964594 PMCID: PMC7737763 DOI: 10.1002/pro.3955] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/04/2020] [Revised: 09/15/2020] [Accepted: 09/15/2020] [Indexed: 11/12/2022]
Abstract
Iris validation is a Python package created to represent comprehensive per‐residue validation metrics for entire protein chains in a compact, readable and interactive view. These metrics can either be calculated by Iris, or by a third‐party program such as MolProbity. We show that those parts of a protein model requiring attention may generate ripples across the metrics on the diagram, immediately catching the modeler's attention. Iris can run as a standalone tool, or be plugged into existing structural biology software to display per‐chain model quality at a glance, with a particular emphasis on evaluating incremental changes resulting from the iterative nature of model building and refinement. Finally, the integration of Iris into the CCP4i2 graphical user interface is provided as a showcase of its pluggable design.
Affiliation(s)
- William Rochira
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
- Jon Agirre
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
10
Wlodawer A, Dauter Z, Shabalin IG, Gilski M, Brzezinski D, Kowiel M, Minor W, Rupp B, Jaskolski M. Ligand-centered assessment of SARS-CoV-2 drug target models in the Protein Data Bank. FEBS J 2020; 287:3703-3718. [PMID: 32418327 PMCID: PMC7276724 DOI: 10.1111/febs.15366] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 04/11/2020] [Revised: 05/02/2020] [Accepted: 05/12/2020] [Indexed: 12/16/2022]
Abstract
A bright spot in the SARS‐CoV‐2 (CoV‐2) coronavirus pandemic has been the immediate mobilization of the biomedical community, working to develop treatments and vaccines for COVID‐19. Rational drug design against emerging threats depends on well‐established methodology, mainly utilizing X‐ray crystallography, to provide accurate structure models of the macromolecular drug targets and of their complexes with candidates for drug development. In the current crisis, the structural biological community has responded by presenting structure models of CoV‐2 proteins and depositing them in the Protein Data Bank (PDB), usually without time embargo and before publication. Since the structures from the first‐line research are produced in an accelerated mode, there is an elevated chance of mistakes and errors, with the ultimate risk of hindering, rather than speeding up, drug development. In the present work, we have used model‐validation metrics and examined the electron density maps for the deposited models of CoV‐2 proteins and a sample of related proteins available in the PDB as of April 1, 2020. We present these results with the aim of helping the biomedical community establish a better‐validated pool of data. The proteins are divided into groups according to their structure and function. In most cases, no major corrections were necessary. However, in several cases significant revisions in the functionally sensitive area of protein–inhibitor complexes or for bound ions justified correction, re‐refinement, and eventually reversioning in the PDB. The re‐refined coordinate files and a tool for facilitating model comparisons are available at https://covid-19.bioreproducibility.org.
Affiliation(s)
- Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, NCI, Frederick, MD, USA
- Zbigniew Dauter
- Synchrotron Radiation Research Section, Macromolecular Crystallography Laboratory, NCI, Argonne National Laboratory, IL, USA
- Ivan G Shabalin
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Miroslaw Gilski
- Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland; Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Dariusz Brzezinski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA; Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland; Institute of Computing Science, Poznan University of Technology, Poland
- Marcin Kowiel
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
- Bernhard Rupp
- k.-k. Hofkristallamt, San Diego, CA, USA; Institute of Genetic Epidemiology, Medical University Innsbruck, Austria
- Mariusz Jaskolski
- Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland; Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
11
Fusani L, Palmer DS, Somers DO, Wall ID. Exploring Ligand Stability in Protein Crystal Structures Using Binding Pose Metadynamics. J Chem Inf Model 2020; 60:1528-1539. [PMID: 31910338 PMCID: PMC7145342 DOI: 10.1021/acs.jcim.9b00843] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 12/16/2022]
Abstract
Identification of correct protein–ligand binding poses is important in structure-based drug design and crucial for the evaluation of protein–ligand binding affinity. Protein–ligand coordinates are commonly obtained from crystallography experiments that provide a static model of an ensemble of conformations. Binding pose metadynamics (BPMD) is an enhanced sampling method that allows for an efficient assessment of ligand stability in solution. Ligand poses that are unstable under the bias of the metadynamics simulation are expected to be infrequently occupied in the energy landscape, thus making minimal contributions to the binding affinity. Here, the robustness of the method is studied using crystal structures with ligands known to be incorrectly modeled, as well as 63 structurally diverse crystal structures with ligand fit to electron density from the Twilight database. Results show that BPMD can successfully differentiate compounds whose binding pose is not supported by the electron density from those with well-defined electron density.
Affiliation(s)
- Lucia Fusani
- Molecular Design UK, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K.; Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow G11XL, U.K.
- David S Palmer
- Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow G11XL, U.K.
- Don O Somers
- Protein, Cellular and Structural Sciences, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K.
- Ian D Wall
- Molecular Design UK, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K.
12
El Hage K, Zoete V. Strong Enrichment of Aromatic and Sulfur-Containing Residues in Ligand-Protein Binding Sites. J Chem Inf Model 2019; 59:4921-4928. [PMID: 31661621 DOI: 10.1021/acs.jcim.9b00582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
While certain residues have clear involvement in determining the 3D structure of a macromolecule because they affect the folding topology or the overall protein stability, the role of different residues in ligand accommodation and binding has attracted less attention. On the basis of the assumption that drug-binding sites on target molecules have specific amino acid compositions, the incidence of each standard amino acid at the binding sites of small molecules and their correlations are calculated for an unprecedented large set of high-quality X-ray structures. Results show, for the first time, strong and highly correlated enrichments of aromatic and sulfur-containing residues, which play an important role in ligand binding and shape the nature of the chemical interactions.
Affiliation(s)
- Krystel El Hage
- Computer-Aided Molecular Engineering Group, Department of Fundamental Oncology, University of Lausanne, Ludwig Lausanne Branch, Route de la Corniche 9A, 1066 Epalinges, Switzerland
- Vincent Zoete
- Computer-Aided Molecular Engineering Group, Department of Fundamental Oncology, University of Lausanne, Ludwig Lausanne Branch, Route de la Corniche 9A, 1066 Epalinges, Switzerland; Molecular Modeling Group, SIB Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Bâtiment Amphipole, 1015 Lausanne, Switzerland
13
Masmaliyeva RC, Murshudov GN. Analysis and validation of macromolecular B values. Acta Crystallogr D Struct Biol 2019; 75:505-518. [PMID: 31063153 PMCID: PMC6503761 DOI: 10.1107/s2059798319004807] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/03/2019] [Accepted: 04/09/2019] [Indexed: 11/10/2022]
Abstract
This paper describes a global analysis of macromolecular B values. It is shown that the distribution of B values generally follows the shifted inverse-gamma distribution (SIGD). The parameters of the SIGD are estimated using the Fisher scoring technique with the expected Fisher information matrix. It is demonstrated that a contour plot based on the parameters of the SIGD can play a role in the validation of macromolecular structures. The dependence of the peak-height distribution on resolution and atomic B values is also analysed. It is demonstrated that the B-value distribution can have a dramatically different effect on peak heights at different resolutions. Consequently, a comparative analysis of the B values of neighbouring atoms must account for resolution. A combination of the SIGD, peak-height distribution and outlier detection was used to identify a number of entries from the PDB that require attention. It is also shown that the presence of a multimodal B-value distribution often indicates that some loops or parts of the molecule have either been mismodelled or have dramatically different mobility, depending on their environment within the crystal. These distributions can also indicate the level of sharpening/blurring used before atomic structure refinement. It is recommended that procedures such as sharpening/blurring should be avoided during refinement, although they can play important roles in map visualization and model building.
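For orientation, a shifted inverse-gamma density can be written in one common parameterization as below. The symbols $\alpha$ (shape), $\beta$ (scale) and $B_0$ (shift) are assumptions chosen for illustration; the paper's own notation and parameterization may differ.

```latex
f(B;\alpha,\beta,B_0) \;=\; \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
  (B - B_0)^{-\alpha-1}\,
  \exp\!\left(-\frac{\beta}{B - B_0}\right),
\qquad B > B_0,\ \alpha > 0,\ \beta > 0 .
```

Setting $B_0 = 0$ recovers the ordinary inverse-gamma density, so the shift simply moves the lower bound of the B-value distribution away from zero.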
Affiliation(s)
- Rafiga C. Masmaliyeva
- Institute of Molecular Biology and Biotechnology ANAS, Matbuat Avenue 2a, Baku 1073, Azerbaijan
- Garib N. Murshudov
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
14
van Zundert GCP, Hudson BM, de Oliveira SHP, Keedy DA, Fonseca R, Heliou A, Suresh P, Borrelli K, Day T, Fraser JS, van den Bedem H. qFit-ligand Reveals Widespread Conformational Heterogeneity of Drug-Like Molecules in X-Ray Electron Density Maps. J Med Chem 2018; 61:11183-11198. [PMID: 30457858 DOI: 10.1021/acs.jmedchem.8b01292] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 01/02/2023]
Abstract
Proteins and ligands sample a conformational ensemble that governs molecular recognition, activity, and dissociation. In structure-based drug design, access to this conformational ensemble is critical to understand the balance between entropy and enthalpy in lead optimization. However, ligand conformational heterogeneity is currently severely underreported in crystal structures in the Protein Data Bank, owing in part to a lack of automated and unbiased procedures to model an ensemble of protein-ligand states into X-ray data. Here, we designed a computational method, qFit-ligand, to automatically resolve conformationally averaged ligand heterogeneity in crystal structures, and applied it to a large set of protein receptor-ligand complexes. In an analysis of the cancer related BRD4 domain, we found that up to 29% of protein crystal structures bound with drug-like molecules present evidence of unmodeled, averaged, relatively isoenergetic conformations in ligand-receptor interactions. In many retrospective cases, these alternate conformations were adventitiously exploited to guide compound design, resulting in improved potency or selectivity. Combining qFit-ligand with high-throughput screening or multitemperature crystallography could therefore augment the structure-based drug design toolbox.
Affiliation(s)
- Brandi M Hudson
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, United States
- Saulo H P de Oliveira
- SLAC National Accelerator Laboratory, Stanford University, Menlo Park, California 94025, United States
- Daniel A Keedy
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, United States
- Rasmus Fonseca
- Department of Molecular and Cellular Physiology, Stanford University, Stanford, California 94305, United States
- Amelie Heliou
- LIX, Ecole Polytechnique, CNRS, Inria, Université Paris-Saclay, 91128 Palaiseau, France
- Pooja Suresh
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, United States
- Tyler Day
- Schrödinger, New York, New York 10036, United States
- James S Fraser
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, United States
- Henry van den Bedem
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California 94158, United States; SLAC National Accelerator Laboratory, Stanford University, Menlo Park, California 94025, United States
15
|
Srivastava A, Nagai T, Srivastava A, Miyashita O, Tama F. Role of Computational Methods in Going beyond X-ray Crystallography to Explore Protein Structure and Dynamics. Int J Mol Sci 2018; 19:E3401. [PMID: 30380757 PMCID: PMC6274748 DOI: 10.3390/ijms19113401] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 10/01/2018] [Revised: 10/20/2018] [Accepted: 10/27/2018] [Indexed: 12/13/2022]
Abstract
Protein structural biology came a long way since the determination of the first three-dimensional structure of myoglobin about six decades ago. Across this period, X-ray crystallography was the most important experimental method for gaining atomic-resolution insight into protein structures. However, as the role of dynamics gained importance in the function of proteins, the limitations of X-ray crystallography in not being able to capture dynamics came to the forefront. Computational methods proved to be immensely successful in understanding protein dynamics in solution, and they continue to improve in terms of both the scale and the types of systems that can be studied. In this review, we briefly discuss the limitations of X-ray crystallography in studying protein dynamics, and then provide an overview of different computational methods that are instrumental in understanding the dynamics of proteins and biomacromolecular complexes.
Affiliation(s)
- Ashutosh Srivastava
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan
- Tetsuro Nagai
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan
- Arpita Srivastava
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan
- Osamu Miyashita
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Florence Tama
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan
16
Young JY, Westbrook JD, Feng Z, Peisach E, Persikova I, Sala R, Sen S, Berrisford JM, Swaminathan GJ, Oldfield TJ, Gutmanas A, Igarashi R, Armstrong DR, Baskaran K, Chen L, Chen M, Clark AR, Di Costanzo L, Dimitropoulos D, Gao G, Ghosh S, Gore S, Guranovic V, Hendrickx PMS, Hudson BP, Ikegawa Y, Kengaku Y, Lawson CL, Liang Y, Mak L, Mukhopadhyay A, Narayanan B, Nishiyama K, Patwardhan A, Sahni G, Sanz-García E, Sato J, Sekharan MR, Shao C, Smart OS, Tan L, van Ginkel G, Yang H, Zhuravleva MA, Markley JL, Nakamura H, Kurisu G, Kleywegt GJ, Velankar S, Berman HM, Burley SK. Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2018:4844086. [PMID: 29688351 PMCID: PMC5804564 DOI: 10.1093/database/bay002] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/02/2017] [Accepted: 01/02/2018] [Indexed: 11/24/2022]
Abstract
The Protein Data Bank (PDB) is the single global repository for experimentally determined 3D structures of biological macromolecules and their complexes with ligands. The worldwide PDB (wwPDB) is the international collaboration that manages the PDB archive according to the FAIR principles: Findability, Accessibility, Interoperability and Reusability. The wwPDB recently developed OneDep, a unified tool for deposition, validation and biocuration of structures of biological macromolecules. All data deposited to the PDB undergo critical review by wwPDB Biocurators. This article outlines the importance of biocuration for structural biology data deposited to the PDB and describes wwPDB biocuration processes and the role of expert Biocurators in sustaining a high-quality archive. Structural data submitted to the PDB are examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources and validated for scientific/technical accuracy. We illustrate how biocuration is integral to PDB data archiving, as it facilitates accurate, consistent and comprehensive representation of biological structure data, allowing efficient and effective usage by research scientists, educators, students and the curious public worldwide. Database URL: https://www.wwpdb.org/
Affiliation(s)
- Jasmine Y Young
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- John D Westbrook
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Zukang Feng
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Ezra Peisach
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Irina Persikova
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Raul Sala
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Sanchayita Sen
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- John M Berrisford
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- G Jawahar Swaminathan
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Thomas J Oldfield
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Aleksandras Gutmanas
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Reiko Igarashi
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- David R Armstrong
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Kumaran Baskaran
- BMRB, BioMagResBank, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI 53706, USA
- Li Chen
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Minyu Chen
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Alice R Clark
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Luigi Di Costanzo
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Dimitris Dimitropoulos
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Guanghua Gao
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Sutapa Ghosh
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Swanand Gore
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Vladimir Guranovic
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Pieter M S Hendrickx
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Brian P Hudson
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Yasuyo Ikegawa
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Yumiko Kengaku
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Catherine L Lawson
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Yuhe Liang
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Lora Mak
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Abhik Mukhopadhyay
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Buvaneswari Narayanan
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Kayoko Nishiyama
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Ardan Patwardhan
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Gaurav Sahni
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Eduardo Sanz-García
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Junko Sato
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Monica R Sekharan
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Chenghua Shao
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Oliver S Smart
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Lihua Tan
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Glen van Ginkel
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Huanwang Yang
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Marina A Zhuravleva
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- John L Markley
- BMRB, BioMagResBank, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI 53706, USA
- Haruki Nakamura
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Genji Kurisu
- PDBj, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Gerard J Kleywegt
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Helen M Berman
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Stephen K Burley
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- RCSB Protein Data Bank, San Diego Supercomputer Center and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, Little Albany St, New Brunswick, NJ 08901, USA
17
van Beusekom B, Joosten K, Hekkelman ML, Joosten RP, Perrakis A. Homology-based loop modeling yields more complete crystallographic protein structures. IUCrJ 2018; 5:585-594. [PMID: 30224962 PMCID: PMC6126648 DOI: 10.1107/s2052252518010552] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/18/2018] [Accepted: 07/23/2018] [Indexed: 06/08/2023]
Abstract
Inherent protein flexibility, poor or low-resolution diffraction data or poorly defined electron-density maps often inhibit the building of complete structural models during X-ray structure determination. However, recent advances in crystallographic refinement and model building often allow completion of previously missing parts. This paper presents algorithms that identify regions missing in a certain model but present in homologous structures in the Protein Data Bank (PDB), and 'graft' these regions of interest. These new regions are refined and validated in a fully automated procedure. Including these developments in the PDB-REDO pipeline has enabled the building of 24 962 missing loops in the PDB. The models and the automated procedures are publicly available through the PDB-REDO databank and webserver. More complete protein structure models enable a higher quality public archive but also a better understanding of protein function, better comparison between homologous structures and more complete data mining in structural bioinformatics projects.
Affiliation(s)
- Bart van Beusekom
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066CX, The Netherlands
- Krista Joosten
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066CX, The Netherlands
- Maarten L. Hekkelman
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066CX, The Netherlands
- Robbie P. Joosten
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066CX, The Netherlands
- Anastassis Perrakis
- Department of Biochemistry, The Netherlands Cancer Institute, Plesmanlaan 121, Amsterdam 1066CX, The Netherlands
18
Ariz-Extreme I, Hub JS. Assigning crystallographic electron densities with free energy calculations-The case of the fluoride channel Fluc. PLoS One 2018; 13:e0196751. [PMID: 29771936 PMCID: PMC5957342 DOI: 10.1371/journal.pone.0196751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/12/2018] [Accepted: 04/18/2018] [Indexed: 11/25/2022]
Abstract
Approximately 90% of the structures in the Protein Data Bank (PDB) were obtained by X-ray crystallography or electron microscopy. Whereas the overall quality of structure is considered high, thanks to a wide range of tools for structure validation, uncertainties may arise from density maps of small molecules, such as organic ligands, ions or water, which are non-covalently bound to the biomolecules. Even with some experience and chemical intuition, the assignment of such disconnected electron densities is often far from obvious. In this study, we suggest the use of molecular dynamics (MD) simulations and free energy calculations, which are well-established computational methods, to aid in the assignment of ambiguous disconnected electron densities. Specifically, estimates of (i) relative binding affinities, for instance between an ion and water, (ii) absolute binding free energies, i.e., free energies for transferring a solute from bulk solvent to a binding site, and (iii) stability assessments during equilibrium simulations may reveal the most plausible assignments. We illustrate this strategy using the crystal structure of the fluoride specific channel (Fluc), which contains five disconnected electron densities previously interpreted as four fluoride and one sodium ion. The simulations support the assignment of the sodium ion. In contrast, calculations of relative and absolute binding free energies as well as stability assessments during free MD simulations suggest that four of the densities represent water molecules instead of fluoride. The assignment of water is compatible with the loss of these densities in the non-conductive F82I/F85I mutant of Fluc. We critically discuss the role of the ion force fields for the calculations presented here. Overall, these findings indicate that MD simulations and free energy calculations are helpful tools for modeling water and ions into crystallographic density maps.
Affiliation(s)
- Igor Ariz-Extreme
- Institute for Microbiology and Genetics, University of Goettingen, Göttingen, Germany
- Jochen S. Hub
- Institute for Microbiology and Genetics, University of Goettingen, Göttingen, Germany
19
Carlsson GH, Hasse D, Cardinale F, Prandi C, Andersson I. The elusive ligand complexes of the DWARF14 strigolactone receptor. JOURNAL OF EXPERIMENTAL BOTANY 2018; 69:2345-2354. [PMID: 29394369 PMCID: PMC5913616 DOI: 10.1093/jxb/ery036] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/03/2017] [Accepted: 01/19/2018] [Indexed: 05/07/2023]
Abstract
Strigolactones, a group of terpenoid lactones, control many aspects of plant growth and development, but the active forms of these plant hormones and their mode of action at the molecular level are still unknown. The strigolactone protein receptor is unusual because it has been shown to cleave the hormone and supposedly forms a covalent bond with the cleaved hormone fragment. This interaction is suggested to induce a conformational change in the receptor that primes it for subsequent interaction with partners in the signalling pathway. Substantial efforts have been invested into describing the interaction of synthetic strigolactone analogues with the receptor, resulting in a number of crystal structures. This investigation combines a re-evaluation of models in the Protein Data Bank with a search for new conditions that may permit the capture of a receptor-ligand complex. While weak difference density is frequently observed in the binding cavity, possibly due to a low-occupancy compound, the models often contain features not supported by the X-ray data. Thus, at this stage, we do not believe that any detailed deductions about the nature, conformation, or binding mode of the ligand can be made with any confidence.
Affiliation(s)
- Gunilla H Carlsson
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Husargatan, Uppsala, Sweden
- Dirk Hasse
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Husargatan, Uppsala, Sweden
- Francesca Cardinale
- Department of Agricultural, Forestry and Food Science, University of Turin, Largo Paolo Braccini, Grugliasco, Italy
- Cristina Prandi
- Department of Chemistry, University of Turin, via P. Giuria, Torino, Italy
- Inger Andersson
- Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Husargatan, Uppsala, Sweden
20
Wlodawer A, Dauter Z, Porebski PJ, Minor W, Stanfield R, Jaskolski M, Pozharski E, Weichenberger CX, Rupp B. Detect, correct, retract: How to manage incorrect structural models. FEBS J 2018; 285:444-466. [PMID: 29113027 PMCID: PMC5799025 DOI: 10.1111/febs.14320] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 10/07/2017] [Accepted: 11/01/2017] [Indexed: 12/13/2022]
Abstract
The massive technical and computational progress of biomolecular crystallography has generated some adverse side effects. Most crystal structure models, produced by crystallographers or well-trained structural biologists, constitute useful sources of information, but occasional extreme outliers remind us that the process of structure determination is not fail-safe. The occurrence of severe errors or gross misinterpretations raises fundamental questions: Why do such aberrations emerge in the first place? How did they evade the sophisticated validation procedures which often produce clear and dire warnings, and why were severe errors not noticed by the depositors themselves, their supervisors, referees and editors? Once detected, what can be done to either correct, improve or eliminate such models? How do incorrect models affect the underlying claims or biomedical hypotheses they were intended, but failed, to support? What is the long-range effect of the propagation of such errors? And finally, what mechanisms can be envisioned to restore the validity of the scientific record and, if necessary, retract publications that are clearly invalidated by the lack of experimental evidence? We suggest that cognitive bias and flawed epistemology are likely at the root of the problem. By using examples from the published literature and from public repositories such as the Protein Data Bank, we provide case summaries to guide correction or improvement of structural models. When strong claims are unsustainable because of a deficient crystallographic model, removal of such a model and even retraction of the affected publication are necessary to restore the integrity of the scientific record.
Affiliation(s)
- Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, National Cancer Institute, Frederick, MD, 21702, USA
- Zbigniew Dauter
- Synchrotron Radiation Research Section, Macromolecular Crystallography Laboratory, National Cancer Institute, Argonne National Laboratory, Argonne, IL 60439, USA
- Przemyslaw J. Porebski
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA, 22908, USA
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA, 22908, USA
- Robyn Stanfield
- Department of Structural and Computational Biology, BCC206, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA, 92037, USA
- Mariusz Jaskolski
- Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Umultowska 89b, Poznan, 61-614, Poland
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
- Edwin Pozharski
- Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD, USA
- Bernhard Rupp
- CVMO, k.-k.Hofkristallamt, 991 Audrey Place, Vista, CA, 92084, USA
- Department of Genetic Epidemiology, Medical University Innsbruck, Schöpfstr. 41, Innsbruck, 6020, Austria
21
Helliwell JR, McMahon B, Guss JM, Kroon-Batenburg LMJ. The science is in the data. IUCrJ 2017; 4:714-722. [PMID: 29123672 PMCID: PMC5668855 DOI: 10.1107/s2052252517013690] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 07/18/2017] [Accepted: 09/24/2017] [Indexed: 05/22/2023]
Abstract
Understanding published research results should be through one's own eyes and include the opportunity to work with raw diffraction data to check the various decisions made in the analyses by the original authors. Today, preserving raw diffraction data is technically and organizationally viable at a growing number of data archives, both centralized and distributed, which are empowered to register data sets and obtain a preservation descriptor, typically a 'digital object identifier'. This introduces an important role of preserving raw data, namely understanding where we fail in or could improve our analyses. Individual science area case studies in crystallography are provided.
Affiliation(s)
- John R. Helliwell
- School of Chemistry, University of Manchester, Manchester M13 9PL, England
- Brian McMahon
- International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, England
- J. Mitchell Guss
- School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW 2006, Australia
- Loes M. J. Kroon-Batenburg
- Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, CH 3584 Utrecht, The Netherlands
22
Porebski PJ, Sroka P, Zheng H, Cooper DR, Minor W. Molstack-Interactive visualization tool for presentation, interpretation, and validation of macromolecules and electron density maps. Protein Sci 2017; 27:86-94. [PMID: 28815771 DOI: 10.1002/pro.3272] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/18/2017] [Revised: 08/11/2017] [Accepted: 08/14/2017] [Indexed: 11/07/2022]
Abstract
Our understanding of the world of biomolecular structures is based upon the interpretation of macromolecular models, of which ∼90% are an interpretation of electron density maps. This structural information guides scientific progress and exploration in many biomedical disciplines. The Protein Data Bank's web portals have made these structures available for mass scientific consumption and greatly broaden the scope of information presented in scientific publications. The portals provide numerous quality metrics; however, the portion of the structure that is most vital for interpretation of the function may have the most difficult to interpret electron density and this ambiguity is not reflected by any single metric. The possible consequences of basing research on suboptimal models make it imperative to inspect the agreement of a model with its experimental evidence. Molstack, a web-based interactive publishing platform for structural data, allows users to present density maps and structural models by displaying a collection of maps and models, including different interpretation of one's own data, re-refinements, and corrections of existing structures. Molstack organizes the sharing and dissemination of these structural models along with their experimental evidence as an interactive session. Molstack was designed with three groups of users in mind; researchers can present the evidence of their interpretation, reviewers and readers can independently judge the experimental evidence of the authors' conclusions, and other researchers can present or even publish their new hypotheses in the context of prior results. The server is available at http://molstack.bioreproducibility.org.
Affiliation(s)
- Przemyslaw J Porebski
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
- Piotr Sroka
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
- Heping Zheng
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
- David R Cooper
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
- Wladek Minor
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
23
Helliwell JR. New developments in crystallography: exploring its technology, methods and scope in the molecular biosciences. Biosci Rep 2017; 37:BSR20170204. [PMID: 28572170 PMCID: PMC6434086 DOI: 10.1042/bsr20170204] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/03/2017] [Revised: 05/31/2017] [Accepted: 06/01/2017] [Indexed: 12/16/2022]
Abstract
Since the Protein Data Bank (PDB) was founded in 1971, there are now over 120,000 depositions, the majority of which are from X-ray crystallography and 90% of those made use of synchrotron beamlines. At the Cambridge Structure Database (CSD), founded in 1965, there are more than 800,000 'small molecule' crystal structure depositions and a very large number of those are relevant in the biosciences as ligands or cofactors. The technology for crystal structure analysis is still developing rapidly both at synchrotrons and in home labs. Determination of the details of the hydrogen atoms in biological macromolecules is well served using neutrons as probe. Large multi-macromolecular complexes cause major challenges to crystallization; electrons as probes offer unique advantages here. Methods developments naturally accompany technology change, mainly incremental but some, such as the tuneability, intensity and collimation of synchrotron radiation, have effected radical changes in capability of biological crystallography. In the past few years, the X-ray laser has taken X-ray crystallography measurement times into the femtosecond range. In terms of applications many new discoveries have been made in the molecular biosciences. The scope of crystallographic techniques is indeed very wide. As examples, new insights into chemical catalysis of enzymes and relating ligand bound structures to thermodynamics have been gained but predictive power is seen as not yet achieved. Metal complexes are also an emerging theme for biomedicine applications. Our studies of coloration of live and cooked lobsters proved to be an unexpected favourite with the public and schoolchildren. More generally, public understanding of the biosciences and crystallography's role within the field have been greatly enhanced by the United Nations International Year of Crystallography coordinated by the International Union of Crystallography. 
This topical review describes each of these areas along with illustrative results to document the scope of each methodology.
24
Peach ML, Cachau RE, Nicklaus MC. Conformational energy range of ligands in protein crystal structures: The difficult quest for accurate understanding. J Mol Recognit 2017; 30:10.1002/jmr.2618. [PMID: 28233410 PMCID: PMC5553890 DOI: 10.1002/jmr.2618] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/25/2016] [Revised: 01/31/2017] [Accepted: 01/31/2017] [Indexed: 12/25/2022]
Abstract
In this review, we address a fundamental question: What is the range of conformational energies seen in ligands in protein-ligand crystal structures? This value is important biophysically, for better understanding the protein-ligand binding process; and practically, for providing a parameter to be used in many computational drug design methods such as docking and pharmacophore searches. We synthesize a selection of previously reported conflicting results from computational studies of this issue and conclude that high ligand conformational energies really are present in some crystal structures. The main source of disagreement between different analyses appears to be due to divergent treatments of electrostatics and solvation. At the same time, however, for many ligands, a high conformational energy is in error, due to either crystal structure inaccuracies or incorrect determination of the reference state. Aside from simple chemistry mistakes, we argue that crystal structure error may mainly be because of the heuristic weighting of ligand stereochemical restraints relative to the fit of the structure to the electron density. This problem cannot be fixed with improvements to electron density fitting or with simple ligand geometry checks, though better metrics are needed for evaluating ligand and binding site chemistry in addition to geometry during structure refinement. The ultimate solution for accurately determining ligand conformational energies lies in ultrahigh-resolution crystal structures that can be refined without restraints.
Affiliation(s)
- Megan L Peach
- Basic Science Program, Chemical Biology Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Raul E Cachau
- Data Science and Information Technology Program, Advanced Biomedical Computing Center, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Marc C Nicklaus
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, MD, USA
25
Christensen EM, Patel SM, Korasick DA, Campbell AC, Krause KL, Becker DF, Tanner JJ. Resolving the cofactor-binding site in the proline biosynthetic enzyme human pyrroline-5-carboxylate reductase 1. J Biol Chem 2017; 292:7233-7243. [PMID: 28258219 PMCID: PMC5409489 DOI: 10.1074/jbc.m117.780288] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 02/07/2017] [Revised: 02/27/2017] [Indexed: 01/22/2023]
Abstract
Pyrroline-5-carboxylate reductase (PYCR) is the final enzyme in proline biosynthesis, catalyzing the NAD(P)H-dependent reduction of Δ1-pyrroline-5-carboxylate (P5C) to proline. Mutations in the PYCR1 gene alter mitochondrial function and cause the connective tissue disorder cutis laxa. Furthermore, PYCR1 is overexpressed in multiple cancers, and the PYCR1 knock-out suppresses tumorigenic growth, suggesting that PYCR1 is a potential cancer target. However, inhibitor development has been stymied by limited mechanistic details for the enzyme, particularly in light of a previous crystallographic study that placed the cofactor-binding site in the C-terminal domain rather than the anticipated Rossmann fold of the N-terminal domain. To fill this gap, we report crystallographic, sedimentation-velocity, and kinetics data for human PYCR1. Structures of binary complexes of PYCR1 with NADPH or proline determined at 1.9 Å resolution provide insight into cofactor and substrate recognition. We see NADPH bound to the Rossmann fold, over 25 Å from the previously proposed site. The 1.85 Å resolution structure of a ternary complex containing NADPH and a P5C/proline analog provides a model of the Michaelis complex formed during hydride transfer. Sedimentation velocity shows that PYCR1 forms a concentration-dependent decamer in solution, consistent with the pentamer-of-dimers assembly seen crystallographically. Kinetic and mutational analysis confirmed several features seen in the crystal structure, including the importance of a hydrogen bond between Thr-238 and the substrate as well as limited cofactor discrimination.
Affiliation(s)
- Sagar M Patel
- the Department of Biochemistry and Redox Biology Center, University of Nebraska, Lincoln, Nebraska 68588
- Kurt L Krause
- the Department of Biochemistry, University of Otago, Dunedin 9054, New Zealand, and
- Donald F Becker
- the Department of Biochemistry and Redox Biology Center, University of Nebraska, Lincoln, Nebraska 68588
- John J Tanner
- From the Departments of Chemistry and Biochemistry, University of Missouri, Columbia, Missouri 65211
26
Abstract
Coot is a molecular-graphics program primarily aimed at model building using X-ray data. Recently, tools for the manipulation and representation of ligands have been introduced. Here, these new tools for ligand validation and comparison are described. Ligands in the wwPDB have been scored by density-fit, distortion and atom-clash metrics. The distributions of these scores can be used to assess the relative merits of the particular ligand in the protein-ligand complex of interest by means of 'sliders' akin to those now available for each accession code on the wwPDB websites.
Affiliation(s)
- Paul Emsley
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
27
Weichenberger CX, Pozharski E, Rupp B. Twilight reloaded: the peptide experience. Acta Crystallogr D Struct Biol 2017; 73:211-222. [PMID: 28291756 PMCID: PMC5349433 DOI: 10.1107/s205979831601620x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/01/2016] [Accepted: 10/12/2016] [Indexed: 01/20/2024]
Abstract
The de facto commoditization of biomolecular crystallography as a result of almost disruptive instrumentation automation and continuing improvement of software allows any sensibly trained structural biologist to conduct crystallographic studies of biomolecules with reasonably valid outcomes: that is, models based on properly interpreted electron density. Robust validation has led to major mistakes in the protein part of structure models becoming rare, but some depositions of protein-peptide complex structure models, which generally carry significant interest to the scientific community, still contain erroneous models of the bound peptide ligand. Here, the protein small-molecule ligand validation tool Twilight is updated to include peptide ligands. (i) The primary technical reasons and potential human factors leading to problems in ligand structure models are presented; (ii) a new method used to score peptide-ligand models is presented; (iii) a few instructive and specific examples, including an electron-density-based analysis of peptide-ligand structures that do not contain any ligands, are discussed in detail; (iv) means to avoid such mistakes and the implications for database integrity are discussed and (v) some suggestions as to how journal editors could help to expunge errors from the Protein Data Bank are provided.
Affiliation(s)
- Edwin Pozharski
- Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Bernhard Rupp
- k.k. Hofkristallamt, 991 Audrey Place, Vista, CA 92084, USA
- Department of Genetic Epidemiology, Medical University Innsbruck, Schöpfstrasse 41, A-6020 Innsbruck, Austria
28
Shao C, Yang H, Westbrook JD, Young JY, Zardecki C, Burley SK. Multivariate Analyses of Quality Metrics for Crystal Structures in the PDB Archive. Structure 2017; 25:458-468. [PMID: 28216043 DOI: 10.1016/j.str.2017.01.013] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 11/12/2016] [Revised: 01/19/2017] [Accepted: 01/29/2017] [Indexed: 11/18/2022]
Abstract
Following deployment of an augmented validation system by the Worldwide Protein Data Bank (wwPDB) partnership, the quality of crystal structures entering the PDB has improved. Of significance are improvements in quality measures now prominently displayed in the wwPDB validation report. Comparisons of PDB depositions made before and after introduction of the new reporting system show improvements in quality measures relating to pairwise atom-atom clashes, side-chain torsion angle rotamers, and local agreement between the atomic coordinate structure model and experimental electron density data. These improvements are largely independent of resolution limit and sample molecular weight. No significant improvement in the quality of associated ligands was observed. Principal component analysis revealed that structure quality could be summarized with three measures (Rfree, real-space R factor Z score, and a combined molecular geometry quality metric), which can in turn be reduced to a single overall quality metric readily interpretable by all PDB archive users.
Affiliation(s)
- Chenghua Shao
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA.
- Huanwang Yang
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- John D Westbrook
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jasmine Y Young
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Christine Zardecki
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Stephen K Burley
- RCSB Protein Data Bank, Center for Integrative Proteomics Research, Rutgers University, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers University, The State University of New Jersey, New Brunswick, NJ 08903, USA; RCSB Protein Databank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
29
Abstract
Crystal structures of protein-ligand complexes are often used to infer biology and inform structure-based drug discovery. Hence, it is important to build accurate, reliable models of ligands that give confidence in the interpretation of the respective protein-ligand complex. This paper discusses key stages in the ligand-fitting process, including ligand binding-site identification, ligand description and conformer generation, ligand fitting, refinement and subsequent validation. The CCP4 suite contains a number of software tools that facilitate this task: AceDRG for the creation of ligand descriptions and conformers, Lidia and JLigand for two-dimensional and three-dimensional ligand editing and visual analysis, Coot for density interpretation, ligand fitting, analysis and validation, and REFMAC5 for macromolecular refinement. In addition to recent advancements in automatic carbohydrate building in Coot (LO/Carb) and ligand-validation tools (FLEV), the release of the CCP4i2 GUI provides an integrated solution that streamlines the ligand-fitting workflow, seamlessly passing results from one program to the next. The ligand-fitting process is illustrated using instructive practical examples, including problematic cases such as post-translational modifications, highlighting the need for careful analysis and rigorous validation.
Affiliation(s)
- Robert A. Nicholls
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
30
Long F, Nicholls RA, Emsley P, Gražulis S, Merkys A, Vaitkus A, Murshudov GN. AceDRG: a stereochemical description generator for ligands. Acta Crystallogr D Struct Biol 2017; 73:112-122. [PMID: 28177307 PMCID: PMC5297914 DOI: 10.1107/s2059798317000067] [Citation(s) in RCA: 221] [Impact Index Per Article: 31.6] [Received: 09/29/2016] [Accepted: 01/03/2017] [Indexed: 11/11/2022]
Abstract
The program AceDRG is designed for the derivation of stereochemical information about small molecules. It uses local chemical and topological environment-based atom typing to derive and organize bond lengths and angles from a small-molecule database: the Crystallography Open Database (COD). Information about the hybridization states of atoms, whether they belong to small rings (up to seven-membered rings), ring aromaticity and nearest-neighbour information is encoded in the atom types. All atoms from the COD have been classified according to the generated atom types. All bonds and angles have also been classified according to the atom types and, in a certain sense, bond types. Derived data are tabulated in a machine-readable form that is freely available from CCP4. AceDRG can also generate stereochemical information, provided that the basic bonding pattern of a ligand is known. The basic bonding pattern is perceived from one of the computational chemistry file formats, including SMILES, mmCIF, SDF MOL and SYBYL MOL2 files. Using the bonding chemistry, atom types, and bond and angle tables generated from the COD, AceDRG derives the 'ideal' bond lengths, angles, plane groups, aromatic rings and chirality information, and writes them to an mmCIF file that can be used by the refinement program REFMAC5 and the model-building program Coot. Other refinement and model-building programs such as PHENIX and BUSTER can also use these files. AceDRG also generates one or more coordinate sets corresponding to the most favourable conformation(s) of a given ligand. AceDRG employs RDKit for chemistry perception and for initial conformation generation, as well as for the interpretation of SMILES strings, SDF MOL and SYBYL MOL2 files.
Affiliation(s)
- Fei Long
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Robert A. Nicholls
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Paul Emsley
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Saulius Gražulis
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Andrius Merkys
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Antanas Vaitkus
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Garib N. Murshudov
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
31
Long F, Nicholls RA, Emsley P, Gražulis S, Merkys A, Vaitkus A, Murshudov GN. Validation and extraction of molecular-geometry information from small-molecule databases. Acta Crystallogr D Struct Biol 2017; 73:103-111. [PMID: 28177306 PMCID: PMC5297913 DOI: 10.1107/s2059798317000079] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 09/30/2016] [Accepted: 01/03/2017] [Indexed: 11/18/2022]
Abstract
A freely available small-molecule structure database, the Crystallography Open Database (COD), is used for the extraction of molecular-geometry information on small-molecule compounds. The results are used for the generation of new ligand descriptions, which are subsequently used by macromolecular model-building and structure-refinement software. To increase the reliability of the derived data, and therefore the new ligand descriptions, the entries from this database were subjected to very strict validation. The selection criteria made sure that the crystal structures used to derive atom types, bond and angle classes are of sufficiently high quality. Any suspicious entries at a crystal or molecular level were removed from further consideration. The selection criteria included (i) the resolution of the data used for refinement (entries solved at 0.84 Å resolution or higher) and (ii) the structure-solution method (structures must be from a single-crystal experiment and all atoms of generated molecules must have full occupancies), as well as basic sanity checks such as (iii) consistency between the valences and the number of connections between atoms, (iv) acceptable bond-length deviations from the expected values and (v) detection of atomic collisions. The derived atom types and bond classes were then validated using high-order moment-based statistical techniques. The results of the statistical analyses were fed back to fine-tune the atom typing. The developed procedure was repeated four times, resulting in fine-grained atom typing, bond and angle classes. The procedure will be repeated in the future as and when new entries are deposited in the COD. The whole procedure can also be applied to any source of small-molecule structures, including the Cambridge Structural Database and the ZINC database.
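The five selection criteria listed in this abstract can be read as a simple pass/fail filter over database entries. The sketch below is only an illustration of that reading, not the authors' implementation: the `Entry` record, its field names, and the `bond_tolerance` default of 0.1 Å are hypothetical, since the abstract gives no numeric threshold for "acceptable" bond-length deviations. Only the 0.84 Å resolution limit and the full-occupancy requirement come from the text.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical summary of one small-molecule crystal structure entry."""
    resolution: float          # in Å; a smaller number means higher resolution
    single_crystal: bool       # structure comes from a single-crystal experiment
    min_occupancy: float       # lowest atomic occupancy in the generated molecules
    valences_consistent: bool  # valences agree with the number of connections
    max_bond_deviation: float  # largest |observed - expected| bond length, in Å
    has_collisions: bool       # atomic collisions were detected


def passes_selection(entry: Entry,
                     max_resolution: float = 0.84,
                     bond_tolerance: float = 0.1) -> bool:
    """Return True if the entry satisfies criteria (i) to (v).

    bond_tolerance is an assumed placeholder; the abstract does not
    state the value that was actually used.
    """
    return (
        entry.resolution <= max_resolution               # (i) 0.84 Å or better
        and entry.single_crystal                         # (ii) single-crystal data
        and entry.min_occupancy >= 1.0                   # (ii) full occupancies
        and entry.valences_consistent                    # (iii) valence check
        and entry.max_bond_deviation <= bond_tolerance   # (iv) bond lengths
        and not entry.has_collisions                     # (v) no atomic collisions
    )
```

Written this way, each criterion is an independent veto, so an entry that fails any single check is dropped, which matches the abstract's statement that suspicious entries were removed from further consideration.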
Affiliation(s)
- Fei Long
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Robert A. Nicholls
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Paul Emsley
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Saulius Gražulis
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Andrius Merkys
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Antanas Vaitkus
- Institute of Biotechnology, Saulėtekio al. 7, LT-10257 Vilnius, Lithuania
- Garib N. Murshudov
- Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
32
Pozharski E, Deller MC, Rupp B. Validation of Protein-Ligand Crystal Structure Models: Small Molecule and Peptide Ligands. Methods Mol Biol 2017; 1607:611-625. [PMID: 28573591 DOI: 10.1007/978-1-4939-7000-1_25] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 06/07/2023]
Abstract
Models of target proteins in complex with small molecule ligands or peptide ligands are of significant interest to the biomedical research community. Structure-guided lead discovery and structure-based drug design make extensive use of such models. The bound ligands comprise only a small fraction of the total X-ray scattering mass, and therefore particular care must be taken to properly validate the atomic model of the ligand as experimental data can often be scarce. The ligand model must be validated against both the primary experimental data and the local environment, specifically: (1) the primary evidence in the form of the electron density, (2) examined for reasonable stereochemistry, and (3) the chemical plausibility of the binding interactions must be inspected. Tools that assist the researcher in the validation process are presented.
Affiliation(s)
- Edwin Pozharski
- Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD, USA
- Marc C Deller
- Stanford ChEM-H, Macromolecular Structure Knowledge Center, Stanford University, Shriram Center, 443 Via Ortega, Room 097, MC5082, Stanford, CA, 94305-4125, USA
- Bernhard Rupp
- k.-k. Hofkristallamt, 991 Audrey Place, Vista, CA, 92084, USA.
- Department of Genetic Epidemiology, Medical University Innsbruck, Schöpfstr. 41, Innsbruck, 6020, Austria.
33
Abstract
The dramatic increase in the number of protein sequences and structures deposited in biological databases has led to the development of many bioinformatics tools and programs to manage, validate, compare, and interpret this large volume of data. In addition, powerful tools are being developed to use this sequence and structural data to facilitate protein classification and infer biological function of newly identified proteins. This chapter covers freely available bioinformatics resources on the World Wide Web that are commonly used for protein structure analysis.
Affiliation(s)
- Jason J Paxman
- Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Rm 521, LIMS1, Kingsbury Drive, Bundoora, Melbourne, VIC, 3086, Australia
- Begoña Heras
- Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Rm 521, LIMS1, Kingsbury Drive, Bundoora, Melbourne, VIC, 3086, Australia.
34
Adams PD, Aertgeerts K, Bauer C, Bell JA, Berman HM, Bhat TN, Blaney JM, Bolton E, Bricogne G, Brown D, Burley SK, Case DA, Clark KL, Darden T, Emsley P, Feher VA, Feng Z, Groom CR, Harris SF, Hendle J, Holder T, Joachimiak A, Kleywegt GJ, Krojer T, Marcotrigiano J, Mark AE, Markley JL, Miller M, Minor W, Montelione GT, Murshudov G, Nakagawa A, Nakamura H, Nicholls A, Nicklaus M, Nolte RT, Padyana AK, Peishoff CE, Pieniazek S, Read RJ, Shao C, Sheriff S, Smart O, Soisson S, Spurlino J, Stouch T, Svobodova R, Tempel W, Terwilliger TC, Tronrud D, Velankar S, Ward SC, Warren GL, Westbrook JD, Williams P, Yang H, Young J. Outcome of the First wwPDB/CCDC/D3R Ligand Validation Workshop. Structure 2016; 24:502-508. [PMID: 27050687 DOI: 10.1016/j.str.2016.02.017] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 11/06/2015] [Revised: 02/24/2016] [Accepted: 02/25/2016] [Indexed: 10/22/2022]
Abstract
Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) represent an important source of information concerning drug-target interactions, providing atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. Of the more than 115,000 entries extant in the Protein Data Bank (PDB) archive, ∼75% include at least one non-polymeric ligand. Ligand geometrical and stereochemical quality, the suitability of ligand models for in silico drug discovery and design, and the goodness-of-fit of ligand models to electron-density maps vary widely across the archive. We describe the proceedings and conclusions from the first Worldwide PDB/Cambridge Crystallographic Data Center/Drug Design Data Resource (wwPDB/CCDC/D3R) Ligand Validation Workshop held at the Research Collaboratory for Structural Bioinformatics at Rutgers University on July 30-31, 2015. Experts in protein crystallography from academe and industry came together with non-profit and for-profit software providers for crystallography and with experts in computational chemistry and data archiving to discuss and make recommendations on best practices, as framed by a series of questions central to structural studies of macromolecule-ligand complexes. What data concerning bound ligands should be archived in the PDB? How should the ligands be best represented? How should structural models of macromolecule-ligand complexes be validated? What supplementary information should accompany publications of structural studies of biological macromolecules? Consensus recommendations on best practices developed in response to each of these questions are provided, together with some details regarding implementation. Important issues addressed but not resolved at the workshop are also enumerated.
Affiliation(s)
- Paul D Adams
- Molecular Biophysics & Integrated Bioimaging Division, Lawrence Berkeley Laboratory, Department of Bioengineering, UC Berkeley, Berkeley, CA 94720-8235, USA
- Cary Bauer
- Bruker AXS, Inc., Madison, WI 53711, USA
- Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Talapady N Bhat
- Biosystems and Biomaterials Division, NIST, Gaithersburg, MD 20899, USA
- Evan Bolton
- National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD 20894, USA
- David Brown
- School of Biosciences, University of Kent, Canterbury CT2 7NH, UK; Charles River Ltd., Structural Biology and Biophysics, Cambridge CB10 1XL, UK
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA.
- David A Case
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Kirk L Clark
- Novartis Institutes for BioMedical Research, Cambridge, MA 02139, USA
- Tom Darden
- OpenEye Scientific, Cambridge, MA 02142, USA
- Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge CB2 0QH, UK
- Victoria A Feher
- Drug Design Data Resource and Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093, USA.
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Colin R Groom
- Cambridge Crystallographic Data Centre, Cambridge CB2 1EZ, UK.
- Jorg Hendle
- Structural Biology, Lilly Biotechnology Center, San Diego, CA 92121, USA
- Andrzej Joachimiak
- Structural Biology Center, Biosciences, Argonne National Laboratory, Argonne, IL 60439, USA
- Gerard J Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tobias Krojer
- Structural Genomics Consortium, University of Oxford, Oxford OX3 7DQ, UK
- Joseph Marcotrigiano
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Alan E Mark
- School of Chemistry & Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia
- John L Markley
- BioMagResBank, Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544, USA
- Matthew Miller
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Atsushi Nakagawa
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Haruki Nakamura
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
- Marc Nicklaus
- Computer-Aided Drug Design Group, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD 21702, USA
- Susan Pieniazek
- Bristol-Myers Squibb Research and Development, Pennington, NJ 08534, USA
- Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Steven Sheriff
- Bristol-Myers Squibb Research and Development, Princeton, NJ 08543, USA
- Oliver Smart
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John Spurlino
- Janssen Pharmaceuticals, Inc., Spring House, PA 19002, USA
- Terry Stouch
- Science For Solutions, LLC, West Windsor, NJ 08550, USA
- Radka Svobodova
- CEITEC-Central European Institute of Technology and National Centre for Biomolecular Research, Masaryk University Brno, 625 00 Brno, Czech Republic
- Wolfram Tempel
- Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Dale Tronrud
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR 97331, USA
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suzanna C Ward
- Cambridge Crystallographic Data Centre, Cambridge CB2 1EZ, UK
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Huanwang Yang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jasmine Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
35
|
Grabowski M, Langner KM, Cymborowski M, Porebski PJ, Sroka P, Zheng H, Cooper DR, Zimmerman MD, Elsliger MA, Burley SK, Minor W. A public database of macromolecular diffraction experiments. Acta Crystallogr D Struct Biol 2016; 72:1181-1193. [PMID: 27841751 PMCID: PMC5108346 DOI: 10.1107/s2059798316014716] [Citation(s) in RCA: 96] [Impact Index Per Article: 12.0] [Received: 06/23/2016] [Accepted: 09/17/2016] [Indexed: 12/28/2022] Open
Abstract
The low reproducibility of published experimental results in many scientific disciplines has recently garnered negative attention in scientific journals and the general media. Public transparency, including the availability of 'raw' experimental data, will help to address growing concerns regarding scientific integrity. Macromolecular X-ray crystallography has led the way in requiring the public dissemination of atomic coordinates and a wealth of experimental data, making the field one of the most reproducible in the biological sciences. However, there remains no mandate for public disclosure of the original diffraction data. The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) has been developed to archive raw data from diffraction experiments and, equally importantly, to provide related metadata. Currently, the database of our resource contains data from 2920 macromolecular diffraction experiments (5767 data sets), accounting for around 3% of all depositions in the Protein Data Bank (PDB), with their corresponding partially curated metadata. IRRMC utilizes distributed storage implemented using a federated architecture of many independent storage servers, which provides both scalability and sustainability. The resource, which is accessible via the web portal at http://www.proteindiffraction.org, can be searched using various criteria. All data are available for unrestricted access and download. The resource serves as a proof of concept and demonstrates the feasibility of archiving raw diffraction data and associated metadata from X-ray crystallographic studies of biological macromolecules. The goal is to expand this resource and include data sets that failed to yield X-ray structures in order to facilitate collaborative efforts that will improve protein structure-determination methods and to ensure the availability of 'orphan' data left behind for various reasons by individual investigators and/or extinct structural genomics projects.
Affiliation(s)
- Marek Grabowski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Karol M. Langner
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Marcin Cymborowski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Przemyslaw J. Porebski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30-239 Cracow, Poland
- Piotr Sroka
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Heping Zheng
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- David R. Cooper
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Matthew D. Zimmerman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Marc-André Elsliger
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Stephen K. Burley
- RCSB Protein Data Bank; Center for Integrative Proteomics Research; Institute for Quantitative Biomedicine; Rutgers Cancer Institute of New Jersey; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- San Diego Supercomputer Center and Skaggs School of Pharmacological Sciences, University of California, San Diego, La Jolla, CA 92093, USA
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
|
36
|
Touw WG, Joosten RP, Vriend G. New Biological Insights from Better Structure Models. J Mol Biol 2016; 428:1375-1393. [PMID: 26869101 DOI: 10.1016/j.jmb.2016.02.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/16/2015] [Revised: 01/04/2016] [Accepted: 02/01/2016] [Indexed: 02/01/2023]
Abstract
Structure validation is a key component of all steps in the structure determination process, from structure building, refinement, deposition, and evaluation all the way to post-deposition optimisation of structures in the Protein Data Bank (PDB) by re-refinement and re-building. Today, many aspects of protein structures are understood better than 10 years ago, and combined with improved software and more computing power, the automated PDB_REDO procedure can significantly improve about 85% of all X-ray structures ever deposited in the PDB. We review structure validation, structure improvement, and a series of validation resources and facilities that give access to improved PDB files and to reports on the quality of the original and the improved structures. Post-deposition optimisation generally leads to improved protein structures and a series of examples will illustrate how that, in turn, leads to improved or even novel biological insights.
Affiliation(s)
- Wouter G Touw
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Geert Grooteplein-Zuid 26-28, 6525 GA Nijmegen, The Netherlands
- Robbie P Joosten
- Department of Biochemistry, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Gert Vriend
- Centre for Molecular and Biomolecular Informatics, Radboud University Medical Center, Geert Grooteplein-Zuid 26-28, 6525 GA Nijmegen, The Netherlands.
|
37
|
Abstract
The use of macromolecular structures is widespread for a variety of applications, from teaching protein structure principles all the way to ligand optimization in drug development. Applying data mining techniques on these experimentally determined structures requires a highly uniform, standardized structural data source. The Protein Data Bank (PDB) has evolved over the years toward becoming the standard resource for macromolecular structures. However, the process selecting the data most suitable for specific applications is still very much based on personal preferences and understanding of the experimental techniques used to obtain these models. In this chapter, we will first explain the challenges with data standardization, annotation, and uniformity in the PDB entries determined by X-ray crystallography. We then discuss the specific effect that crystallographic data quality and model optimization methods have on structural models and how validation tools can be used to make informed choices. We also discuss specific advantages of using the PDB_REDO databank as a resource for structural data. Finally, we will provide guidelines on how to select the most suitable protein structure models for detailed analysis and how to select a set of structure models suitable for data mining.
Affiliation(s)
- Bart van Beusekom
- Department of Biochemistry, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Anastassis Perrakis
- Department of Biochemistry, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Robbie P Joosten
- Department of Biochemistry, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands.
|
38
|
Dauter Z, Wlodawer A. Progress in protein crystallography. Protein Pept Lett 2016; 23:201-10. [PMID: 26732246 PMCID: PMC6287266 DOI: 10.2174/0929866523666160106153524] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/02/2015] [Revised: 10/26/2015] [Accepted: 01/03/2016] [Indexed: 11/22/2022]
Abstract
Macromolecular crystallography evolved enormously from the pioneering days, when structures were solved by "wizards" performing all complicated procedures almost by hand. In the current situation crystal structures of large systems can be often solved very effectively by various powerful automatic programs in days or hours, or even minutes. Such progress is to a large extent coupled to the advances in many other fields, such as genetic engineering, computer technology, availability of synchrotron beam lines and many other techniques, creating the highly interdisciplinary science of macromolecular crystallography. Due to this unprecedented success crystallography is often treated as one of the analytical methods and practiced by researchers interested in structures of macromolecules, but not highly competent in the procedures involved in the process of structure determination. One should therefore take into account that the contemporary, highly automatic systems can produce results almost without human intervention, but the resulting structures must be carefully checked and validated before their release into the public domain.
Affiliation(s)
- Zbigniew Dauter
- Macromolecular Crystallography Laboratory, National Cancer Institute, Frederick, MD and Argonne, IL, USA.
|
39
|
Deller MC, Rupp B. Models of protein-ligand crystal structures: trust, but verify. J Comput Aided Mol Des 2015; 29:817-36. [PMID: 25665575 PMCID: PMC4531100 DOI: 10.1007/s10822-015-9833-8] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 12/04/2014] [Accepted: 01/29/2015] [Indexed: 11/26/2022]
Abstract
X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.
Affiliation(s)
- Marc C Deller
- The Joint Center for Structural Genomics, San Diego, CA, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
- Bernhard Rupp
- k.-k. Hofkristallamt, 991 Audrey Place, Vista, CA, 92084, USA.
- Department of Genetic Epidemiology, Medical University of Innsbruck, Schöpfstr. 41, 6020, Innsbruck, Austria.
|
40
|
Shabalin I, Dauter Z, Jaskolski M, Minor W, Wlodawer A. Crystallography and chemistry should always go together: a cautionary tale of protein complexes with cisplatin and carboplatin. Acta Crystallogr D Biol Crystallogr 2015; 71:1965-79. [PMID: 26327386 PMCID: PMC4556316 DOI: 10.1107/s139900471500629x] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 02/26/2015] [Accepted: 03/27/2015] [Indexed: 12/23/2022]
Abstract
The anticancer activity of platinum-containing drugs such as cisplatin and carboplatin is considered to primarily arise from their interactions with nucleic acids; nevertheless, these drugs, or the products of their hydrolysis, also bind to proteins, potentially leading to the known side effects of the treatments. Here, over 40 crystal structures deposited in the Protein Data Bank (PDB) of cisplatin and carboplatin complexes of several proteins were analysed. Significant problems of either a crystallographic or a chemical nature were found in most of the presented atomic models and they could be traced to less or more serious deficiencies in the data-collection and refinement procedures. The re-evaluation of these data and models was possible thanks to their mandatory or voluntary deposition in publicly available databases, emphasizing the point that the availability of such data is critical for making structural science reproducible. Based on this analysis of a selected group of macromolecular structures, the importance of deposition of raw diffraction data is stressed and a procedure for depositing, tracking and using re-refined crystallographic models is suggested.
Affiliation(s)
- Ivan Shabalin
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Zbigniew Dauter
- Synchrotron Radiation Research Section, MCL, National Cancer Institute, Argonne National Laboratory, Argonne, IL 60439, USA
- Mariusz Jaskolski
- Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Alexander Wlodawer
- Protein Structure Section, MCL, National Cancer Institute, Frederick, MD 21702, USA
|
41
|
Zheng H, Handing KB, Zimmerman MD, Shabalin IG, Almo SC, Minor W. X-ray crystallography over the past decade for novel drug discovery - where are we heading next? Expert Opin Drug Discov 2015; 10:975-89. [PMID: 26177814 DOI: 10.1517/17460441.2015.1061991] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 12/20/2022]
Abstract
INTRODUCTION Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. AREAS COVERED This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. EXPERT OPINION X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopy and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible.
Affiliation(s)
- Heping Zheng
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA; +1 434 243 6865; +1 434 243 2981
|
42
|
Weichenberger CX, Afonine PV, Kantardjieff K, Rupp B. The solvent component of macromolecular crystals. Acta Crystallogr D Biol Crystallogr 2015; 71:1023-38. [PMID: 25945568 PMCID: PMC4427195 DOI: 10.1107/s1399004715006045] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 11/05/2014] [Accepted: 03/25/2015] [Indexed: 11/10/2022]
Abstract
The mother liquor from which a biomolecular crystal is grown will contain water, buffer molecules, native ligands and cofactors, crystallization precipitants and additives, various metal ions, and often small-molecule ligands or inhibitors. On average, about half the volume of a biomolecular crystal consists of this mother liquor, whose components form the disordered bulk solvent. Its scattering contributions can be exploited in initial phasing and must be included in crystal structure refinement as a bulk-solvent model. Concomitantly, distinct electron density originating from ordered solvent components must be correctly identified and represented as part of the atomic crystal structure model. Herein, are reviewed (i) probabilistic bulk-solvent content estimates, (ii) the use of bulk-solvent density modification in phase improvement, (iii) bulk-solvent models and refinement of bulk-solvent contributions and (iv) modelling and validation of ordered solvent constituents. A brief summary is provided of current tools for bulk-solvent analysis and refinement, as well as of modelling, refinement and analysis of ordered solvent components, including small-molecule ligands.
Affiliation(s)
- Christian X. Weichenberger
- Center for Biomedicine, European Academy of Bozen/Bolzano (EURAC), Viale Druso 1, Bozen/Bolzano, I-39100 Südtirol/Alto Adige, Italy
- Pavel V. Afonine
- Physical Biosciences Division, Lawrence Berkeley National Laboratory (LBNL), 1 Cyclotron Road, Mail Stop 64R0121, Berkeley, CA 94720, USA
- Katherine Kantardjieff
- College of Science and Mathematics, California State University, San Marcos, CA 92078, USA
- Bernhard Rupp
- Department of Forensic Crystallography, k.-k. Hofkristallamt, 991 Audrey Place, Vista, CA 92084, USA
- Department of Genetic Epidemiology, Medical University of Innsbruck, Schöpfstrasse 41, A-6020 Innsbruck, Austria
|
43
|
Lamb AL, Kappock TJ, Silvaggi NR. You are lost without a map: Navigating the sea of protein structures. Biochim Biophys Acta Proteins Proteom 2014; 1854:258-68. [PMID: 25554228 DOI: 10.1016/j.bbapap.2014.12.021] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 12/08/2014] [Accepted: 12/22/2014] [Indexed: 11/26/2022]
Abstract
X-ray crystal structures propel biochemistry research like no other experimental method, since they answer many questions directly and inspire new hypotheses. Unfortunately, many users of crystallographic models mistake them for actual experimental data. Crystallographic models are interpretations, several steps removed from the experimental measurements, making it difficult for nonspecialists to assess the quality of the underlying data. Crystallographers mainly rely on "global" measures of data and model quality to build models. Robust validation procedures based on global measures now largely ensure that structures in the Protein Data Bank (PDB) are largely correct. However, global measures do not allow users of crystallographic models to judge the reliability of "local" features in a region of interest. Refinement of a model to fit into an electron density map requires interpretation of the data to produce a single "best" overall model. This process requires inclusion of most probable conformations in areas of poor density. Users who misunderstand this can be misled, especially in regions of the structure that are mobile, including active sites, surface residues, and especially ligands. This article aims to equip users of macromolecular models with tools to critically assess local model quality. Structure users should always check the agreement of the electron density map and the derived model in all areas of interest, even if the global statistics are good. We provide illustrated examples of interpreted electron density as a guide for those unaccustomed to viewing electron density.
Affiliation(s)
- Audrey L Lamb
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, United States.
- T Joseph Kappock
- Department of Biochemistry, Purdue University, West Lafayette, IN 47907, United States
- Nicholas R Silvaggi
- Department of Chemistry and Biochemistry, University of Wisconsin-Milwaukee, Milwaukee, WI 53211, United States.
|
44
|
Terwilliger TC, Bricogne G. Continuous mutual improvement of macromolecular structure models in the PDB and of X-ray crystallographic software: the dual role of deposited experimental data. Acta Crystallogr D Biol Crystallogr 2014; 70:2533-43. [PMID: 25286839 PMCID: PMC4188001 DOI: 10.1107/s1399004714017040] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 05/05/2014] [Accepted: 07/23/2014] [Indexed: 11/22/2022]
Abstract
Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when it was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.
Affiliation(s)
- Thomas C. Terwilliger
- Bioscience Division, Los Alamos National Laboratory, Mail Stop M888, Los Alamos, NM 87507, USA
- Gerard Bricogne
- Global Phasing Ltd, Sheraton House, Castle Park, Cambridge CB3 0AX, England
|
45
|
Abstract
INTRODUCTION X-ray crystallography plays an important role in structure-based drug design (SBDD), and accurate analysis of crystal structures of target macromolecules and macromolecule-ligand complexes is critical at all stages. However, whereas there has been significant progress in improving methods of structural biology, particularly in X-ray crystallography, corresponding progress in the development of computational methods (such as in silico high-throughput screening) is still on the horizon. Crystal structures can be overinterpreted and thus bias hypotheses and follow-up experiments. As in any experimental science, the models of macromolecular structures derived from X-ray diffraction data have their limitations, which need to be critically evaluated and well understood for structure-based drug discovery. AREAS COVERED This review describes how the validity, accuracy and precision of a protein or nucleic acid structure determined by X-ray crystallography can be evaluated from three different perspectives: i) the nature of the diffraction experiment; ii) the interpretation of an electron density map; and iii) the interpretation of the structural model in terms of function and mechanism. The strategies to optimally exploit a macromolecular structure are also discussed in the context of 'Big Data' analysis, biochemical experimental design and structure-based drug discovery. EXPERT OPINION Although X-ray crystallography is one of the most detailed 'microscopes' available today for examining macromolecular structures, the authors would like to re-emphasize that such structures are only simplified models of the target macromolecules. The authors also wish to reinforce the idea that a structure should not be thought of as a set of precise coordinates but rather as a framework for generating hypotheses to be explored. Numerous biochemical and biophysical experiments, including new diffraction experiments, can and should be performed to verify or falsify these hypotheses. X-ray crystallography will find its future application in drug discovery by the development of specific tools that would allow realistic interpretation of the outcome coordinates and/or support testing of these hypotheses.
Affiliation(s)
- Heping Zheng
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Data Analytics and Data Mining, Research Scientist
- Jing Hou
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Structure Initiative (EFI), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Research Associate
- Matthew D Zimmerman
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Structure Initiative (EFI), USA
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Data Mining and Management, Instructor of Research
- Alexander Wlodawer
- National Cancer Institute, Center for Cancer Research, Frederick, MD 21702, USA
- Specializes in Macromolecular Structure and Function, Chief of the Macromolecular Crystallography Laboratory
- Wladek Minor
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Structure Initiative (EFI), USA
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Structural Biology, Data Mining and Management, Professor
|
46
|
Validation of metal-binding sites in macromolecular structures with the CheckMyMetal web server. Nat Protoc 2013; 9:156-70. [PMID: 24356774 DOI: 10.1038/nprot.2013.172] [Citation(s) in RCA: 227] [Impact Index Per Article: 20.6] [Indexed: 12/23/2022]
Abstract
Metals have vital roles in both the mechanism and architecture of biological macromolecules. Yet structures of metal-containing macromolecules in which metals are misidentified and/or suboptimally modeled are abundant in the Protein Data Bank (PDB). This shows the need for a diagnostic tool to identify and correct such modeling problems with metal-binding environments. The CheckMyMetal (CMM) web server (http://csgid.org/csgid/metal_sites/) is a sophisticated, user-friendly web-based method to evaluate metal-binding sites in macromolecular structures using parameters derived from 7,350 metal-binding sites observed in a benchmark data set of 2,304 high-resolution crystal structures. The protocol outlines how the CMM server can be used to detect geometric and other irregularities in the structures of metal-binding sites, as well as how it can alert researchers to potential errors in metal assignment. The protocol also gives practical guidelines for correcting problematic sites by modifying the metal-binding environment and/or redefining metal identity in the PDB file. Several examples where this has led to meaningful results are described in the ANTICIPATED RESULTS section. CMM was designed for a broad audience--biomedical researchers studying metal-containing proteins and nucleic acids--but it is equally well suited for structural biologists validating new structures during modeling or refinement. The CMM server takes the coordinates of a metal-containing macromolecule structure in the PDB format as input and responds within a few seconds for a typical protein structure with 2-5 metal sites and a few hundred amino acids.
|
47
|
Dunbar J, Krawczyk K, Leem J, Baker T, Fuchs A, Georges G, Shi J, Deane CM. SAbDab: the structural antibody database. Nucleic Acids Res 2013; 42:D1140-6. [PMID: 24214988 PMCID: PMC3965125 DOI: 10.1093/nar/gkt1043] [Citation(s) in RCA: 263] [Impact Index Per Article: 23.9] [Indexed: 01/16/2023]
Abstract
Structural antibody database (SAbDab; http://opig.stats.ox.ac.uk/webapps/sabdab) is an online resource containing all the publicly available antibody structures annotated and presented in a consistent fashion. The data are annotated with several properties including experimental information, gene details, correct heavy and light chain pairings, antigen details and, where available, antibody-antigen binding affinity. The user can select structures, according to these attributes as well as structural properties such as complementarity determining region loop conformation and variable domain orientation. Individual structures, datasets and the complete database can be downloaded.
Affiliation(s)
- James Dunbar
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK, Informatics, UCB Pharma, 216 Bath Road, Slough SL1 4EN, UK and Roche Pharma Research & Early Development, Roche Diagnostics GmbH, 82377 Penzberg, Germany
|
48
|
Muller YA. Unexpected features in the Protein Data Bank entries 3qd1 and 4i8e: the structural description of the binding of the serine-rich repeat adhesin GspB to host cell carbohydrate receptor is not a solved issue. Acta Crystallogr Sect F Struct Biol Cryst Commun 2013; 69:1071-6. [PMID: 24100551 PMCID: PMC3792659 DOI: 10.1107/s1744309113014383] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/07/2013] [Accepted: 05/24/2013] [Indexed: 11/10/2022]
Abstract
The structure of a complex between a fragment of the adhesin GspB from Streptococcus gordonii and a disaccharide (PDB entries 3qd1 and 4i8e) has recently been proposed to identify the binding site for the sialyl-T antigen recognized by GspB. This structure exhibits numerous unrealistic and unusual features such as an excessive number of van der Waals clashes and a lack of correlation between atomic structure and experimental electron density. Here, it is shown that the crystallographic data can be fully explained by an alternative model, namely replacing the disaccharide with a buffer molecule. The conclusion is that the experimental data are likely to contain no information regarding the carbohydrate receptor binding site in GspB or the interaction of GspB with host cell receptors.
Affiliation(s)
- Yves A. Muller
- Lehrstuhl für Biotechnik, Department of Biology, Friedrich-Alexander University Erlangen-Nuremberg, Henkestrasse 91, 91052 Erlangen, Germany
|
49
|
Jaskolski M. On the propagation of errors. Acta Crystallogr D Biol Crystallogr 2013; 69:1865-6. [PMID: 24100306 DOI: 10.1107/s090744491301528x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/09/2013] [Accepted: 06/02/2013] [Indexed: 11/11/2022]
Abstract
The policy of the Protein Data Bank (PDB) that the first deposition of a small-molecule ligand, even with erroneous atom numbering, sets a precedent over accepted nomenclature rules is disputed. Recommendations regarding ligand molecules in the PDB are suggested.
Affiliation(s)
- Mariusz Jaskolski
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
|
50
|
Wlodawer A, Minor W, Dauter Z, Jaskolski M. Protein crystallography for aspiring crystallographers or how to avoid pitfalls and traps in macromolecular structure determination. FEBS J 2013; 280:5705-36. [PMID: 24034303 DOI: 10.1111/febs.12495] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 07/16/2013] [Revised: 08/12/2013] [Accepted: 08/20/2013] [Indexed: 12/28/2022]
Abstract
The number of macromolecular structures deposited in the Protein Data Bank now approaches 100,000, with the vast majority of them determined by crystallographic methods. Thousands of papers describing such structures have been published in the scientific literature, and 20 Nobel Prizes in chemistry or medicine have been awarded for discoveries based on macromolecular crystallography. New hardware and software tools have made crystallography appear to be an almost routine (but still far from being analytical) technique and many structures are now being determined by scientists with very limited experience in the practical aspects of the field. However, this apparent ease is sometimes illusory and proper procedures need to be followed to maintain high standards of structure quality. In addition, many noncrystallographers may have problems with the critical evaluation and interpretation of structural results published in the scientific literature. The present review provides an outline of the technical aspects of crystallography for less experienced practitioners, as well as information that might be useful for users of macromolecular structures, aiming to show them how to interpret (but not overinterpret) the information present in the coordinate files and in their description. A discussion of the extent of information that can be gleaned from the atomic coordinates of structures solved at different resolution is provided, as well as problems and pitfalls encountered in structure determination and interpretation.
Affiliation(s)
- Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, NCI at Frederick, Frederick, MD, USA
|