51
Smart OS, Horský V, Gore S, Svobodová Vařeková R, Bendová V, Kleywegt GJ, Velankar S. Validation of ligands in macromolecular structures determined by X-ray crystallography. Acta Crystallogr D Struct Biol 2018; 74:228-236. [PMID: 29533230 PMCID: PMC5947763 DOI: 10.1107/s2059798318002541] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 07/04/2017] [Accepted: 02/12/2018] [Indexed: 01/19/2023]
Abstract
Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) play a crucial role in structure-guided drug discovery and design, and also provide atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. The quality with which small-molecule ligands have been modelled in Protein Data Bank (PDB) entries has been, and continues to be, a matter of concern for many investigators. Correctly interpreting whether electron density found in a binding site is compatible with the soaked or co-crystallized ligand or represents water or buffer molecules is often far from trivial. The Worldwide PDB validation report (VR) provides a mechanism to highlight any major issues concerning the quality of the data and the model at the time of deposition and annotation, so the depositors can fix issues, resulting in improved data quality. The ligand-validation methods used in the generation of the current VRs are described in detail, including an examination of the metrics to assess both geometry and electron-density fit. It is found that the LLDF score currently used to identify ligand electron-density fit outliers can give misleading results and that better ligand-validation metrics are required.
Affiliation(s)
- Oliver S. Smart
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Vladimír Horský
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- CEITEC – Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Swanand Gore
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Radka Svobodová Vařeková
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- CEITEC – Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Veronika Bendová
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- CEITEC – Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Institute of Mathematics and Statistics, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
- Gerard J. Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, England
52
Friedrich NO, de Bruyn Kops C, Flachsenberg F, Sommer K, Rarey M, Kirchmair J. Benchmarking Commercial Conformer Ensemble Generators. J Chem Inf Model 2017; 57:2719-2728. [PMID: 28967749 DOI: 10.1021/acs.jcim.7b00505] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Indexed: 01/19/2023]
Abstract
We assess and compare the performance of eight commercial conformer ensemble generators (ConfGen, ConfGenX, cxcalc, iCon, MOE LowModeMD, MOE Stochastic, MOE Conformation Import, and OMEGA) and one leading free algorithm, the distance geometry algorithm implemented in RDKit. The comparative study is based on a new version of the Platinum Diverse Dataset, a high-quality benchmarking dataset of 2859 protein-bound ligand conformations extracted from the PDB. Differences in the performance of commercial algorithms are much smaller than those observed for free algorithms in our previous study (J. Chem. Inf. Model. 2017, 57, 529-539). For commercial algorithms, the median minimum root-mean-square deviations measured between protein-bound ligand conformations and ensembles of a maximum of 250 conformers are between 0.46 and 0.61 Å. Commercial conformer ensemble generators are characterized by their high robustness, with at least 99% of all input molecules successfully processed and few or even no substantial geometrical errors detectable in their output conformations. The RDKit distance geometry algorithm (with minimization enabled) appears to be a good free alternative since its performance is comparable to that of the midranked commercial algorithms. Based on a statistical analysis, we elaborate on which algorithms to use and how to parametrize them for best performance in different application scenarios.
Affiliation(s)
- Nils-Ole Friedrich
- Center for Bioinformatics, Universität Hamburg, Bundesstr. 43, Hamburg 20146, Germany
- Florian Flachsenberg
- Center for Bioinformatics, Universität Hamburg, Bundesstr. 43, Hamburg 20146, Germany
- Kai Sommer
- Center for Bioinformatics, Universität Hamburg, Bundesstr. 43, Hamburg 20146, Germany
- Matthias Rarey
- Center for Bioinformatics, Universität Hamburg, Bundesstr. 43, Hamburg 20146, Germany
- Johannes Kirchmair
- Center for Bioinformatics, Universität Hamburg, Bundesstr. 43, Hamburg 20146, Germany
53
Nittinger E, Inhester T, Bietz S, Meyder A, Schomburg KT, Lange G, Klein R, Rarey M. Large-Scale Analysis of Hydrogen Bond Interaction Patterns in Protein–Ligand Interfaces. J Med Chem 2017; 60:4245-4257. [DOI: 10.1021/acs.jmedchem.7b00101] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 01/29/2023]
Affiliation(s)
- Eva Nittinger
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Therese Inhester
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Stefan Bietz
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Agnes Meyder
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Karen T. Schomburg
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Gudrun Lange
- Bayer CropScience AG, Industriepark Hoechst, G836, 65926 Frankfurt am Main, Germany
- Robert Klein
- Bayer CropScience AG, Industriepark Hoechst, G836, 65926 Frankfurt am Main, Germany
- Matthias Rarey
- Universität Hamburg, ZBH—Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany