1
Tyagi S, Yadav RK, Krishnan V. Determination of the Crystal Structure of the Cell Wall-Anchored Proteins and Pilins. Methods Mol Biol 2024; 2727:159-191. [PMID: 37815717 DOI: 10.1007/978-1-0716-3491-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023]
Abstract
Surface proteins and pili (or pilus) anchored on the Gram-positive bacterial cell wall play a vital role in adhesion, colonization, biofilm formation, and immunomodulation. The pilus consists of building blocks called pilins or pilus subunits. The surface proteins and pilins share some common sequences and structural features. They contain an N-terminal signal sequence and the C-terminal cell wall sorting region, enabling their transportation across the membrane and covalent attachment to the bacterial cell wall, respectively. The transpeptidase enzymes called sortases facilitate the covalent links between the pilins during the pilus assembly and between surface proteins or basal subunits of pili and peptidoglycan-bridge during the cell wall anchoring. Thus, elucidating three-dimensional structures for the surface proteins and pilins at the atomic level is essential for understanding the mechanism of adhesion, pilus assembly, and host interaction. This chapter aims to provide a general protocol for crystal structure determination of surface proteins and pilins anchored on the Gram-positive bacterial cell wall and substrates for sortases. The protocol involves the production of recombinant protein, crystallization, and structure determination by X-ray crystallography technique.
Affiliation(s)
- Shivangi Tyagi
  - Laboratory of Structural Biology, Regional Centre for Biotechnology, NCR Biotech Science Cluster, Faridabad, India
- Rajnesh Kumari Yadav
  - Laboratory of Structural Biology, Regional Centre for Biotechnology, NCR Biotech Science Cluster, Faridabad, India
- Vengadesan Krishnan
  - Laboratory of Structural Biology, Regional Centre for Biotechnology, NCR Biotech Science Cluster, Faridabad, India
2
Gucwa M, Lenkiewicz J, Zheng H, Cymborowski M, Cooper DR, Murzyn K, Minor W. CMM-An enhanced platform for interactive validation of metal binding sites. Protein Sci 2023; 32:e4525. [PMID: 36464767 PMCID: PMC9794025 DOI: 10.1002/pro.4525] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 09/15/2022] [Revised: 11/21/2022] [Accepted: 11/22/2022] [Indexed: 12/12/2022]
Abstract
Metal ions bound to macromolecules play an integral role in many cellular processes. They can directly participate in catalytic mechanisms or be essential for the structural integrity of proteins and nucleic acids. However, their unique nature in macromolecules can make them difficult to model and refine, and a substantial portion of metal ions in the PDB are misidentified or poorly refined. CheckMyMetal (CMM) is a validation tool that has gained widespread acceptance as an essential tool for researchers working on metal-macromolecule complexes. CMM can be used during structure determination or to validate metal binding sites in structural models within the PDB. The functionalities of CMM have recently been greatly enhanced and provide researchers with additional information that can guide modeling decisions. The new version of CMM shows metals in the context of electron density maps and allows for on-the-fly refinement of metal binding sites. The improvements should increase the reproducibility of biomedical research. The web server is available at https://cmm.minorlab.org.
Affiliation(s)
- Michal Gucwa
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Joanna Lenkiewicz
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Heping Zheng
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Present address: Hunan University, College of Biology, Bioinformatics Center, Hunan, People's Republic of China
- Marcin Cymborowski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- David R. Cooper
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Krzysztof Murzyn
  - Department of Computational Biophysics and Bioinformatics, Jagiellonian University, Krakow, Poland
- Wladek Minor
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
3
Tubiana J, Xiang Y, Fan L, Wolfson HJ, Chen K, Schneidman-Duhovny D, Shi Y. Reduced B cell antigenicity of Omicron lowers host serologic response. Cell Rep 2022; 41:111512. [PMID: 36223774 PMCID: PMC9515332 DOI: 10.1016/j.celrep.2022.111512] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/07/2022] [Revised: 08/10/2022] [Accepted: 09/26/2022] [Indexed: 11/25/2022]
Abstract
The SARS-CoV-2 Omicron variant evades most neutralizing vaccine-induced antibodies and is associated with lower antibody titers upon breakthrough infections than previous variants. However, the mechanism remains unclear. Here, we find using a geometric deep-learning model that Omicron's extensively mutated receptor binding site (RBS) features reduced antigenicity compared with previous variants. Mice immunization experiments with different recombinant receptor binding domain (RBD) variants confirm that the serological response to Omicron is drastically attenuated and less potent. Analyses of serum cross-reactivity and competitive ELISA reveal a reduction in antibody response across both variable and conserved RBD epitopes. Computational modeling confirms that the RBS has a potential for further antigenicity reduction while retaining efficient receptor binding. Finally, we find a similar trend of antigenicity reduction over decades for hCoV229E, a common cold coronavirus. Thus, our study explains the reduced antibody titers associated with Omicron infection and reveals a possible trajectory of future viral evolution.
Affiliation(s)
- Jérôme Tubiana
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 6997801, Israel
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190501, Israel
- Yufei Xiang
  - Center for Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Li Fan
  - Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Haim J. Wolfson
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 6997801, Israel
- Kong Chen
  - Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Dina Schneidman-Duhovny
  - School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190501, Israel
- Yi Shi
  - Center for Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
4
ScanNet: an interpretable geometric deep learning model for structure-based protein binding site prediction. Nat Methods 2022; 19:730-739. [DOI: 10.1038/s41592-022-01490-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/05/2021] [Accepted: 04/12/2022] [Indexed: 11/08/2022]
5
Tubiana J, Xiang Y, Fan L, Wolfson HJ, Chen K, Schneidman-Duhovny D, Shi Y. Reduced antigenicity of Omicron lowers host serologic response. bioRxiv 2022. [PMID: 35194608 PMCID: PMC8863144 DOI: 10.1101/2022.02.15.480546] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022]
Abstract
SARS-CoV-2 Omicron variant of concern (VOC) contains fifteen mutations on the receptor binding domain (RBD), evading most neutralizing antibodies from vaccinated sera. Emerging evidence suggests that Omicron breakthrough cases are associated with substantially lower antibody titers than other VOC cases. However, the mechanism remains unclear. Here, using a novel geometric deep-learning model, we discovered that the antigenic profile of Omicron RBD is distinct from the prior VOCs, featuring reduced antigenicity in its remodeled receptor binding sites (RBS). To substantiate our deep-learning prediction, we immunized mice with different recombinant RBD variants and found that the Omicron's extensive mutations can lead to a drastically attenuated serologic response with limited neutralizing activity in vivo, while the T cell response remains potent. Analyses of serum cross-reactivity and competitive ELISA with epitope-specific nanobodies revealed that the antibody response to Omicron was reduced across RBD epitopes, including both the variable RBS and epitopes without any known VOC mutations. Moreover, computational modeling confirmed that the RBS is highly versatile with a capacity to further decrease antigenicity while retaining efficient receptor binding. Longitudinal analysis showed that this evolutionary trend of decrease in antigenicity was also found in hCoV229E, a common cold coronavirus that has been circulating in humans for decades. Thus, our study provided unprecedented insights into the reduced antibody titers associated with Omicron infection, revealed a possible trajectory of future viral evolution and may inform the vaccine development against future outbreaks.
6
Helliwell JR. Pre- and Post-publication Verification for Reproducible Data Mining in Macromolecular Crystallography. Methods Mol Biol 2022; 2449:235-261. [PMID: 35507266 DOI: 10.1007/978-1-0716-2095-3_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Like an article narrative is deemed by an editor and referees to be worthy of being a version of record on acceptance as a publication, so must the underpinning data also be scrutinized before passing it as a version of record. Indeed without the underpinning data, a study and its conclusions cannot be reproduced at any stage of evaluation, pre- or post-publication. Likewise, an independent study without its own underpinning data also cannot be reproduced let alone be considered a replicate of the first study. The PDB is a modern marvel of achievement providing an organized open access to depositor and user of the data held there opening numerous applications. Methods for modeling protein structures and for determination of structures are still improving their precision, and artifacts of the method exist. So their accuracy is realized if they are reproduced by other methods. It is on such foundations that reproducible data mining is based. Data rates are expanding considerably be they at synchrotrons, the X-ray free electron lasers (XFELs), electron cryomicroscopes (cryoEM), or at the neutron facilities. The work of a person as a referee or user with a narrative and its underpinning data may well be complemented in future by artificial intelligence with machine learning, the former for specific refereeing and the latter for the more general validation, both ideally before publication. Examples are described involving rhenium theranostics, the anti-cancer platins and the SARS-CoV-2 main protease.
Affiliation(s)
- John R Helliwell
  - Department of Chemistry, University of Manchester, Manchester, UK
7
Grabowski M, Macnar JM, Cymborowski M, Cooper DR, Shabalin IG, Gilski M, Brzezinski D, Kowiel M, Dauter Z, Rupp B, Wlodawer A, Jaskolski M, Minor W. Rapid response to emerging biomedical challenges and threats. IUCrJ 2021; 8:395-407. [PMID: 33953926 PMCID: PMC8086160 DOI: 10.1107/s2052252521003018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/11/2021] [Accepted: 03/22/2021] [Indexed: 05/13/2023]
Abstract
As part of the global mobilization to combat the present pandemic, almost 100 000 COVID-19-related papers have been published and nearly a thousand models of macromolecules encoded by SARS-CoV-2 have been deposited in the Protein Data Bank within less than a year. The avalanche of new structural data has given rise to multiple resources dedicated to assessing the correctness and quality of structural data and models. Here, an approach to evaluate the massive amounts of such data using the resource https://covid19.bioreproducibility.org is described, which offers a template that could be used in large-scale initiatives undertaken in response to future biomedical crises. Broader use of the described methodology could considerably curtail information noise and significantly improve the reproducibility of biomedical research.
Affiliation(s)
- Marek Grabowski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Joanna M. Macnar
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Warsaw, Poland
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
- Marcin Cymborowski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- David R. Cooper
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Ivan G. Shabalin
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
- Miroslaw Gilski
  - Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland
  - Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Dariusz Brzezinski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
  - Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marcin Kowiel
  - Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Zbigniew Dauter
  - Center for Structural Biology, National Cancer Institute, Frederick, Maryland, USA
- Bernhard Rupp
  - k.-k Hofkristallamt, San Diego, California, USA
  - Institute of Genetic Epidemiology, Medical University Innsbruck, Innsbruck, Austria
- Alexander Wlodawer
  - Center for Structural Biology, National Cancer Institute, Frederick, Maryland, USA
- Mariusz Jaskolski
  - Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland
  - Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Wladek Minor
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia, USA
8
A mathematical representation of protein binding sites using structural dispersion of atoms from principal axes for classification of binding ligands. PLoS One 2021; 16:e0244905. [PMID: 33831020 PMCID: PMC8031081 DOI: 10.1371/journal.pone.0244905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2020] [Accepted: 03/09/2021] [Indexed: 11/23/2022]
Abstract
Many researchers have studied the relationship between the biological functions of proteins and the structures of both their overall backbones of amino acids and their binding sites. A large amount of the work has focused on summarizing structural features of binding sites as scalar quantities, which can result in a great deal of information loss since the structures are three-dimensional. Additionally, a common way of comparing binding sites is via aligning their atoms, which is a computationally intensive procedure that substantially limits the types of analysis and modeling that can be done. In this work, we develop a novel encoding of binding sites as covariance matrices of the distances of atoms to the principal axes of the structures. This representation is invariant to the chosen coordinate system for the atoms in the binding sites, which removes the need to align the sites to a common coordinate system, is computationally efficient, and permits the development of probability models. These can then be used to both better understand groups of binding sites that bind to the same ligand and perform classification for these ligand groups. We demonstrate the utility of our method for discrimination of binding ligand through classification studies with two benchmark datasets using nearest mean and polytomous logistic regression classifiers.
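As a rough illustration of the encoding described in this abstract, the sketch below computes a 3x3 covariance matrix of atom-to-principal-axis distances and checks that it does not change when the site is rotated and translated. It is only one plausible reading of the abstract, not the authors' implementation: the function name `axis_distance_covariance`, the use of NumPy, and the random coordinates are all assumptions made for the example.

```python
import numpy as np


def axis_distance_covariance(coords):
    """Hypothetical encoding: 3x3 covariance of atom-to-principal-axis distances."""
    xyz = np.asarray(coords, dtype=float)
    centered = xyz - xyz.mean(axis=0)  # remove dependence on the origin
    # Principal axes: eigenvectors (columns) of the 3x3 coordinate covariance.
    _, axes = np.linalg.eigh(np.cov(centered, rowvar=False))
    along = centered @ axes  # signed component along each axis, shape (n, 3)
    sq_norm = (centered ** 2).sum(axis=1, keepdims=True)
    # Perpendicular distance to each axis via Pythagoras; clip guards round-off.
    dist = np.sqrt(np.clip(sq_norm - along ** 2, 0.0, None))
    return np.cov(dist, rowvar=False)  # 3x3 descriptor of the site


# Made-up atom coordinates standing in for a binding site (assumption).
rng = np.random.default_rng(0)
site = rng.normal(size=(40, 3)) * np.array([3.0, 2.0, 1.0])

# The same site expressed in a different coordinate system.
rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
moved = site @ rotation.T + np.array([5.0, -2.0, 7.0])

print(np.allclose(axis_distance_covariance(site), axis_distance_covariance(moved)))
```

Because the distances are measured relative to axes derived from the atoms themselves, no alignment step is needed before two sites are compared, which is the property the abstract emphasizes.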
9
D'Andréa ÉD, Retel JS, Diehl A, Schmieder P, Oschkinat H, Pires JR. NMR structure and dynamics of Q4DY78, a conserved kinetoplasid-specific protein from Trypanosoma cruzi. J Struct Biol 2021; 213:107715. [PMID: 33705979 DOI: 10.1016/j.jsb.2021.107715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 03/01/2021] [Accepted: 03/03/2021] [Indexed: 10/21/2022]
Abstract
The 106-residue protein Q4DY78 (UniProt accession number) from Trypanosoma cruzi is highly conserved in the related kinetoplastid pathogens Trypanosoma brucei and Leishmania major. Given the essentiality of its orthologue in T. brucei, the high sequence conservation with other trypanosomatid proteins, and the low sequence similarity with mammalian proteins, Q4DY78 is an attractive protein for structural characterization. Here, we solved the structure of Q4DY78 by solution NMR and evaluated its backbone dynamics. Q4DY78 is composed of five α-helices and a small, two-stranded antiparallel β-sheet. The backbone RMSD is 0.22 ± 0.05 Å for the representative ensemble of the 20 lowest-energy structures. Q4DY78 is overall rigid, except for N-terminal residues (V8 to I10), residues at loop 4 (K57 to G65) and residues at the C-terminus (F89 to F112). Q4DY78 has a short motif FPCAP that could potentially mediate interactions with the host cytoskeleton via interaction with EVH1 (Drosophila Enabled (Ena)/Vasodilator-stimulated phosphoprotein (VASP) homology 1) domains. Albeit Q4DY78 lacks calcium-binding motifs, its fold resembles that of eukaryotic calcium-binding proteins such as calcitracin, calmodulin, and polcalcin Bet v 4. We characterized this novel protein with a calcium binding fold without the capacity to bind calcium.
Affiliation(s)
- Éverton Dias D'Andréa
  - Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho, 373 - Bloco E, sala 32, Rio de Janeiro, RJ 21941-902, Brazil
- Joren Sebastian Retel
  - Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straße 10, Berlin 13125, Germany
- Anne Diehl
  - Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straße 10, Berlin 13125, Germany
- Peter Schmieder
  - Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straße 10, Berlin 13125, Germany
- Hartmut Oschkinat
  - Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straße 10, Berlin 13125, Germany
  - Freie Universität Berlin, Institut für Chemie und Biochemie, Takustrasse 3, Berlin 14195, Germany
- José Ricardo Pires
  - Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho, 373 - Bloco E, sala 32, Rio de Janeiro, RJ 21941-902, Brazil
10
Barradas-Bautista D, Cao Z, Cavallo L, Oliva R. The CASP13-CAPRI targets as case studies to illustrate a novel scoring pipeline integrating CONSRANK with clustering and interface analyses. BMC Bioinformatics 2020; 21:262. [PMID: 32938371 PMCID: PMC7493188 DOI: 10.1186/s12859-020-03600-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/07/2020] [Accepted: 06/10/2020] [Indexed: 08/27/2023]
Abstract
Background: Properly scoring protein-protein docking models to single out the correct ones is an open challenge, also object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), a community-wide blind docking experiment. We introduced in the field CONSRANK (CONSensus RANKing), the first pure consensus method. Also available as a web server, CONSRANK ranks docking models in an ensemble based on their ability to match the most frequent inter-residue contacts in it. We have been blindly testing CONSRANK in all the latest CAPRI rounds, where we showed it to perform competitively with the state-of-the-art energy and knowledge-based scoring functions. More recently, we developed Clust-CONSRANK, an algorithm introducing a contact-based clustering of the models as a preliminary step of the CONSRANK scoring process. In the latest CASP13-CAPRI joint experiment, we participated as scorers with a novel pipeline, combining both our scoring tools, CONSRANK and Clust-CONSRANK, with our interface analysis tool COCOMAPS. Selection of the 10 models for submission was guided by the strength of the emerging consensus, and their final ranking was assisted by results of the interface analysis.
Results: As a result of the above approach, we were by far the first scorer in the CASP13-CAPRI top-1 ranking, having high/medium quality models ranked at the top-1 position for the majority of targets (11 out of the total 19). We were also the first scorer in the top-10 ranking, on a par with another group, and the second scorer in the top-5 ranking. Further, we topped the ranking relative to the prediction of binding interfaces, among all the scorers and predictors. Using the CASP13-CAPRI targets as case studies, we illustrate here in detail the approach we adopted.
Conclusions: Introducing some flexibility in the final model selection and ranking, as well as differentiating the adopted scoring approach depending on the targets were the key assets for our highly successful performance, as compared to previous CAPRI rounds. The approach we propose is entirely based on methods made available to the community and could thus be reproduced by any user.
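The consensus idea behind CONSRANK, ranking models by how well they match the most frequent inter-residue contacts in the ensemble, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the CONSRANK code: the function `consensus_rank`, the scoring rule (mean ensemble frequency of a model's contacts), and the contact labels are all invented for the example.

```python
from collections import Counter


def consensus_rank(models):
    """Rank models by the mean ensemble frequency of their contacts.

    `models` maps a model name to a set of inter-residue contacts,
    each contact being a (residue_in_partner_A, residue_in_partner_B) pair.
    The scoring rule is an assumption made for illustration.
    """
    n_models = len(models)
    # How many models contain each contact.
    counts = Counter(c for contacts in models.values() for c in contacts)
    freq = {c: k / n_models for c, k in counts.items()}
    scores = {
        name: (sum(freq[c] for c in contacts) / len(contacts)) if contacts else 0.0
        for name, contacts in models.items()
    }
    # Highest consensus score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-model ensemble (contact labels are made up).
models = {
    "m1": {("A10", "B5"), ("A12", "B7")},
    "m2": {("A10", "B5"), ("A12", "B7"), ("A30", "B9")},
    "m3": {("A50", "B40")},
}

for name, score in consensus_rank(models):
    print(f"{name}: {score:.3f}")
```

In this toy ensemble the model whose contacts all recur elsewhere (`m1`) ranks first, while the outlier with a unique contact (`m3`) ranks last. No energy function is involved, which is what makes such a method a pure consensus approach.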
11
Abstract
Protein-protein and protein-DNA/RNA interactions are involved in many cellular processes. Therefore, determining their complex structures at the atomic level is valuable to gain insights into these interactions. Because of the technical difficulties and high cost in experimental methods, computational approaches like molecular docking have been developed to predict the structures of macromolecular complexes in the last decades. To automatically integrate the available binding information from the PDB, we have developed HDOCK, a protein-protein/nucleic acid docking web server by combining template-based and free docking. In this chapter, we first briefly introduce our HDOCK server and then give a step-by-step description of docking bovine chymotrypsinogen A against its inhibitor (PDB ID: 1CGI). Two case studies of realistic examples are also discussed. The HDOCK server is freely available at http://hdock.phys.hust.edu.cn/.
Affiliation(s)
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
12
High-Throughput Crystallization Pipeline at the Crystallography Core Facility of the Institut Pasteur. Molecules 2019; 24:4451. [PMID: 31817305 PMCID: PMC6943606 DOI: 10.3390/molecules24244451] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 10/30/2019] [Revised: 12/02/2019] [Accepted: 12/03/2019] [Indexed: 11/25/2022]
Abstract
The availability of whole-genome sequence data, made possible by significant advances in DNA sequencing technology, led to the emergence of structural genomics projects in the late 1990s. These projects not only significantly increased the number of 3D structures deposited in the Protein Data Bank in the last two decades, but also influenced present crystallographic strategies by introducing automation and high-throughput approaches in the structure-determination pipeline. Today, dedicated crystallization facilities, many of which are open to the general user community, routinely set up and track thousands of crystallization screening trials per day. Here, we review the current methods for high-throughput crystallization and procedures to obtain crystals suitable for X-ray diffraction studies, and we describe the crystallization pipeline implemented in the medium-scale crystallography platform at the Institut Pasteur (Paris) as an example.
13
Urresti S, Cartmell A, Liu F, Walton PH, Davies GJ. Structural studies of the unusual metal-ion site of the GH124 endoglucanase from Ruminiclostridium thermocellum. Acta Crystallogr F Struct Biol Commun 2018; 74:496-505. [PMID: 30084399 PMCID: PMC6096483 DOI: 10.1107/s2053230x18006842] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/06/2018] [Accepted: 05/03/2018] [Indexed: 12/20/2022]
Abstract
The recent discovery of 'lytic' polysaccharide monooxygenases, copper-dependent enzymes for biomass degradation, has provided new impetus for the analysis of unusual metal-ion sites in carbohydrate-active enzymes. In this context, the CAZY family GH124 endoglucanase from Ruminiclostridium thermocellum contains an unusual metal-ion site, which was originally modelled as a Ca2+ site but features aspartic acid, asparagine and two histidine imidazoles as coordinating residues, which are more consistent with a transition-metal binding environment. It was sought to analyse whether the GH124 metal-ion site might accommodate other metals. It is demonstrated through thermal unfolding experiments that this metal-ion site can accommodate a range of transition metals (Fe2+, Cu2+, Mn2+ and Ni2+), whilst the three-dimensional structure and mass spectrometry show that one of the histidines is partially covalently modified and is present as a 2-oxohistidine residue; a feature that is rarely observed but that is believed to be involved in an 'off-switch' to transition-metal binding. Atomic resolution (<1.1 Å) complexes define the metal-ion site and also reveal the binding of an unusual fructosylated oligosaccharide, which was presumably present as a contaminant in the cellohexaose used for crystallization. Although it has not been possible to detect a biological role for the unusual metal-ion site, this work highlights the need to study some of the many metal-ion sites in carbohydrate-active enzymes that have long been overlooked or previously mis-assigned.
Affiliation(s)
- Saioa Urresti
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Alan Cartmell
  - Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne NE1 7RU, England
- Feng Liu
  - Department of Chemistry, University of British Columbia, Vancouver V6T 1Z1, Canada
- Paul H. Walton
  - Department of Chemistry, University of York, York YO10 5DD, England
- Gideon J. Davies
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
14
Simões ICM, Coimbra JTS, Neves RPP, Costa IPD, Ramos MJ, Fernandes PA. Properties that rank protein:protein docking poses with high accuracy. Phys Chem Chem Phys 2018; 20:20927-20942. [DOI: 10.1039/c8cp03888k] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/23/2022]
Abstract
The development of docking algorithms to predict near-native structures of protein:protein complexes from the structure of the isolated monomers is of paramount importance for molecular biology and drug discovery.
Affiliation(s)
- Inês C. M. Simões
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- João T. S. Coimbra
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Rui P. P. Neves
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Inês P. D. Costa
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Maria J. Ramos
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
- Pedro A. Fernandes
  - UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto
15
Porebski PJ, Sroka P, Zheng H, Cooper DR, Minor W. Molstack-Interactive visualization tool for presentation, interpretation, and validation of macromolecules and electron density maps. Protein Sci 2017; 27:86-94. [PMID: 28815771 DOI: 10.1002/pro.3272] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/18/2017] [Revised: 08/11/2017] [Accepted: 08/14/2017] [Indexed: 11/07/2022]
Abstract
Our understanding of the world of biomolecular structures is based upon the interpretation of macromolecular models, of which ∼90% are an interpretation of electron density maps. This structural information guides scientific progress and exploration in many biomedical disciplines. The Protein Data Bank's web portals have made these structures available for mass scientific consumption and greatly broaden the scope of information presented in scientific publications. The portals provide numerous quality metrics; however, the portion of the structure that is most vital for interpretation of the function may have the most difficult to interpret electron density and this ambiguity is not reflected by any single metric. The possible consequences of basing research on suboptimal models make it imperative to inspect the agreement of a model with its experimental evidence. Molstack, a web-based interactive publishing platform for structural data, allows users to present density maps and structural models by displaying a collection of maps and models, including different interpretation of one's own data, re-refinements, and corrections of existing structures. Molstack organizes the sharing and dissemination of these structural models along with their experimental evidence as an interactive session. Molstack was designed with three groups of users in mind; researchers can present the evidence of their interpretation, reviewers and readers can independently judge the experimental evidence of the authors' conclusions, and other researchers can present or even publish their new hypotheses in the context of prior results. The server is available at http://molstack.bioreproducibility.org.
Affiliation(s)
- Przemyslaw J Porebski
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
| | - Piotr Sroka
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
| | - Heping Zheng
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
| | - David R Cooper
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
| | - Wladek Minor
- Department of Biological Physics & Molecular Physiology, University of Virginia, Charlottesville, Virginia
|
16
|
Zheng H, Porebski PJ, Grabowski M, Cooper DR, Minor W. Databases, Repositories, and Other Data Resources in Structural Biology. Methods Mol Biol 2017; 1607:643-665. [PMID: 28573593 DOI: 10.1007/978-1-4939-7000-1_27] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/26/2022]
Abstract
Structural biology, like many other areas of modern science, produces an enormous amount of primary, derived, and "meta" data with a high demand on data storage and manipulations. Primary data come from various steps of sample preparation, diffraction experiments, and functional studies. These data are not only used to obtain tangible results, like macromolecular structural models, but also to enrich and guide our analysis and interpretation of various biomedical problems. Herein we define several categories of data resources, (a) Archives, (b) Repositories, (c) Databases, and (d) Advanced Information Systems, that can accommodate primary, derived, or reference data. Data resources may be used either as web portals or internally by structural biology software. To be useful, each resource must be maintained, curated, as well as integrated with other resources. Ideally, the system of interconnected resources should evolve toward comprehensive "hubs", or Advanced Information Systems. Such systems, encompassing the PDB and UniProt, are indispensable not only for structural biology, but for many related fields of science. The categories of data resources described herein are applicable well beyond our usual scientific endeavors.
Affiliation(s)
- Heping Zheng
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, 1340 Jefferson Park Avenue, Jordan Hall, Room 4223, Charlottesville, VA, 22908, USA
| | - Przemyslaw J Porebski
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, 1340 Jefferson Park Avenue, Jordan Hall, Room 4223, Charlottesville, VA, 22908, USA
| | - Marek Grabowski
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, 1340 Jefferson Park Avenue, Jordan Hall, Room 4223, Charlottesville, VA, 22908, USA
| | - David R Cooper
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, 1340 Jefferson Park Avenue, Jordan Hall, Room 4223, Charlottesville, VA, 22908, USA
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia School of Medicine, 1340 Jefferson Park Avenue, Jordan Hall, Room 4223, Charlottesville, VA, 22908, USA.
|
17
|
Vangone A, Rodrigues JPGLM, Xue LC, van Zundert GCP, Geng C, Kurkcuoglu Z, Nellen M, Narasimhan S, Karaca E, van Dijk M, Melquiond ASJ, Visscher KM, Trellet M, Kastritis PL, Bonvin AMJJ. Sense and simplicity in HADDOCK scoring: Lessons from CASP-CAPRI round 1. Proteins 2016; 85:417-423. [PMID: 27802573 PMCID: PMC5324763 DOI: 10.1002/prot.25198] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 08/10/2016] [Revised: 10/14/2016] [Accepted: 10/25/2016] [Indexed: 12/28/2022]
Abstract
Our information-driven docking approach HADDOCK is a consistent top predictor and scorer since the start of its participation in the CAPRI community-wide experiment. This sustained performance is due, in part, to its ability to integrate experimental data and/or bioinformatics information into the modelling process, and also to the overall robustness of the scoring function used to assess and rank the predictions. In the CASP-CAPRI Round 1 scoring experiment we successfully selected acceptable/medium quality models for 18/14 of the 25 targets - a top-ranking performance among all scorers. Considering that for only 20 targets acceptable models were generated by the community, our effective success rate reaches as high as 90% (18/20). This was achieved using the standard HADDOCK scoring function, which, thirteen years after its original publication, still consists of a simple linear combination of intermolecular van der Waals and Coulomb electrostatics energies and an empirically derived desolvation energy term. Despite its simplicity, this scoring function makes sense from a physico-chemical perspective, encoding key aspects of biomolecular recognition. In addition to its success in the scoring experiment, the HADDOCK server takes the first place in the server prediction category, with 16 successful predictions. Much like our scoring protocol, because of the limited time per target, the predictions relied mainly on either an ab initio center-of-mass and symmetry restrained protocol, or on a template-based approach whenever applicable. These results underline the success of our simple but sensible prediction and scoring scheme. Proteins 2017; 85:417-423. © 2016 Wiley Periodicals, Inc.
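The abstract above describes the scoring function as a simple linear combination of three energy terms. The following Python sketch illustrates that idea only; the weight values, the example energies, and the helper names (`Energies`, `linear_score`, `rank_models`, `effective_success_rate`) are assumptions made for illustration and are not taken from the abstract or from the HADDOCK software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Energies:
    vdw: float     # intermolecular van der Waals energy
    elec: float    # intermolecular Coulomb electrostatics energy
    desolv: float  # empirically derived desolvation energy


# Hypothetical weights, for illustration only; the abstract does not state them.
WEIGHTS = Energies(vdw=1.0, elec=0.2, desolv=1.0)


def linear_score(e: Energies, w: Energies = WEIGHTS) -> float:
    """Weighted sum of the three terms; a lower score is treated as better."""
    return w.vdw * e.vdw + w.elec * e.elec + w.desolv * e.desolv


def rank_models(models: dict) -> list:
    """Return model names sorted from lowest (best) to highest score."""
    return sorted(models, key=lambda name: linear_score(models[name]))


def effective_success_rate(successes: int, feasible_targets: int) -> float:
    """Fraction of targets with a successful selection among feasible targets."""
    return successes / feasible_targets


if __name__ == "__main__":
    # Made-up energies for two candidate models.
    models = {
        "model_a": Energies(vdw=-40.0, elec=-100.0, desolv=5.0),
        "model_b": Energies(vdw=-30.0, elec=-50.0, desolv=2.0),
    }
    print(rank_models(models))
    # 18 acceptable selections out of the 20 targets for which the
    # community generated acceptable models, as quoted in the abstract.
    print(effective_success_rate(18, 20))
```

The last call reproduces the arithmetic behind the 90% figure quoted in the abstract (18 of 20 targets).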
Affiliation(s)
- A Vangone
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - J P G L M Rodrigues
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - L C Xue
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - G C P van Zundert
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - C Geng
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - Z Kurkcuoglu
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - M Nellen
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - S Narasimhan
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - E Karaca
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - M van Dijk
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - A S J Melquiond
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - K M Visscher
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - M Trellet
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - P L Kastritis
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
| | - A M J J Bonvin
- Department of Chemistry, Computational Structural Biology Group, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, The Netherlands
|
18
|
Bauer U, Breeze AL. “Ligandability” of Drug Targets: Assessment of Chemical Tractability via Experimental and In Silico Approaches. ACTA ACUST UNITED AC 2016. [DOI: 10.1002/9783527677047.ch03] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/11/2023]
|
19
|
The impact of structural genomics: the first quindecennial. ACTA ACUST UNITED AC 2016; 17:1-16. [PMID: 26935210 DOI: 10.1007/s10969-016-9201-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 12/12/2015] [Accepted: 02/17/2016] [Indexed: 12/21/2022]
Abstract
The period 2000-2015 brought the advent of high-throughput approaches to protein structure determination. With the overall funding on the order of $2 billion (in 2010 dollars), the structural genomics (SG) consortia established worldwide have developed pipelines for target selection, protein production, sample preparation, crystallization, and structure determination by X-ray crystallography and NMR. These efforts resulted in the determination of over 13,500 protein structures, mostly from unique protein families, and increased the structural coverage of the expanding protein universe. SG programs contributed over 4400 publications to the scientific literature. The NIH-funded Protein Structure Initiatives alone have produced over 2000 scientific publications, which to date have attracted more than 93,000 citations. Software and database developments that were necessary to handle high-throughput structure determination workflows have led to structures of better quality and improved integrity of the associated data. Organized and accessible data have a positive impact on the reproducibility of scientific experiments. Most of the experimental data generated by the SG centers are freely available to the community and has been utilized by scientists in various fields of research. SG projects have created, improved, streamlined, and validated many protocols for protein production and crystallization, data collection, and functional analysis, significantly benefiting biological and biomedical research.
|
20
|
Minor W, Dauter Z, Helliwell JR, Jaskolski M, Wlodawer A. Safeguarding Structural Data Repositories against Bad Apples. Structure 2016; 24:216-20. [PMID: 26840827 PMCID: PMC4743038 DOI: 10.1016/j.str.2015.12.010] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 10/20/2015] [Revised: 12/15/2015] [Accepted: 12/16/2015] [Indexed: 11/17/2022]
Abstract
Structural biology research generates large amounts of data, some deposited in public databases or repositories, but a substantial remainder never becomes available to the scientific community. In addition, some of the deposited data contain less or more serious errors that may bias the results of data mining. Thorough analysis and discussion of these problems is needed to ameliorate this situation. This perspective is an attempt to propose some solutions and encourage both further discussion and action on the part of the relevant organizations, in particular the PDB and various bodies of the International Union of Crystallography.
Affiliation(s)
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA.
| | - Zbigniew Dauter
- Synchrotron Radiation Research Section, Macromolecular Crystallography Laboratory, National Cancer Institute, Argonne National Laboratory, Argonne, IL 60439, USA
| | - John R Helliwell
- School of Chemistry, University of Manchester, Manchester M13 9PL, UK
| | - Mariusz Jaskolski
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences and Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, 60-780 Poznan, Poland
| | - Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, National Cancer Institute, Frederick, MD 21702, USA
|
21
|
Porebski PJ, Cymborowski M, Pasenkiewicz-Gierula M, Minor W. Fitmunk: improving protein structures by accurate, automatic modeling of side-chain conformations. Acta Crystallogr D Struct Biol 2016; 72:266-80. [PMID: 26894674 PMCID: PMC4756610 DOI: 10.1107/s2059798315024730] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/29/2015] [Accepted: 12/23/2015] [Indexed: 11/21/2022]
Abstract
Improvements in crystallographic hardware and software have allowed automated structure-solution pipelines to approach a near-'one-click' experience for the initial determination of macromolecular structures. However, in many cases the resulting initial model requires a laborious, iterative process of refinement and validation. A new method has been developed for the automatic modeling of side-chain conformations that takes advantage of rotamer-prediction methods in a crystallographic context. The algorithm, which is based on deterministic dead-end elimination (DEE) theory, uses new dense conformer libraries and a hybrid energy function derived from experimental data and prior information about rotamer frequencies to find the optimal conformation of each side chain. In contrast to existing methods, which incorporate the electron-density term into protein-modeling frameworks, the proposed algorithm is designed to take advantage of the highly discriminatory nature of electron-density maps. This method has been implemented in the program Fitmunk, which uses extensive conformational sampling. This improves the accuracy of the modeling and makes it a versatile tool for crystallographic model building, refinement and validation. Fitmunk was extensively tested on over 115 new structures, as well as a subset of 1100 structures from the PDB. It is demonstrated that the ability of Fitmunk to model more than 95% of side chains accurately is beneficial for improving the quality of crystallographic protein models, especially at medium and low resolutions. Fitmunk can be used for model validation of existing structures and as a tool to assess whether side chains are modeled optimally or could be better fitted into electron density. Fitmunk is available as a web service at http://kniahini.med.virginia.edu/fitmunk/server/ or at http://fitmunk.bitbucket.org/.
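The abstract above says the algorithm is based on deterministic dead-end elimination (DEE). As a rough illustration of what DEE does in general, the Python sketch below applies the classic textbook elimination criterion to a toy problem: a rotamer is discarded when its best possible energy is still worse than the worst possible energy of some alternative at the same position. This is not Fitmunk's implementation; the function name `dee_prune`, the data layout, and all energy values are assumptions made for the example.

```python
def dee_prune(self_e, pair_e):
    """Prune rotamers with the classic dead-end elimination criterion.

    self_e[i][r]         -- self energy of rotamer r at position i
    pair_e[(i, j)][r][s] -- pair energy of rotamer r at i and s at j (i < j)
    Returns a list with the set of surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(energies))) for energies in self_e]

    def pair(i, r, j, s):
        # Pair energies are stored once, under the key with i < j.
        return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case.
        return self_e[i][r] + sum(
            pick(pair(i, r, j, s) for s in alive[j])
            for j in range(n) if j != i
        )

    changed = True
    while changed:  # repeat until a full pass eliminates nothing
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                best_r = bound(i, r, min)
                for t in alive[i]:
                    if t != r and best_r > bound(i, t, max):
                        alive[i].discard(r)  # r can never be optimal
                        changed = True
                        break
    return alive


if __name__ == "__main__":
    # Toy problem: two positions with two candidate rotamers each.
    self_e = [[0.0, 5.0], [0.0, 1.0]]
    pair_e = {(0, 1): [[0.0, 1.0], [0.5, 0.0]]}
    print(dee_prune(self_e, pair_e))
```

In a crystallographic setting such as the one described in the abstract, the energy terms would also reflect agreement with the electron density; the toy energies here carry no such meaning.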
Affiliation(s)
- Przemyslaw Jerzy Porebski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Jordan Hall, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
| | - Marcin Cymborowski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Jordan Hall, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
| | - Marta Pasenkiewicz-Gierula
- Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, ul. Gronostajowa 7, 30-387 Kraków, Poland
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Jordan Hall, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
|
22
|
Zheng H, Handing KB, Zimmerman MD, Shabalin IG, Almo SC, Minor W. X-ray crystallography over the past decade for novel drug discovery - where are we heading next? Expert Opin Drug Discov 2015; 10:975-89. [PMID: 26177814 DOI: 10.1517/17460441.2015.1061991] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 12/20/2022]
Abstract
INTRODUCTION Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. AREAS COVERED This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. EXPERT OPINION X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopy and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible.
Affiliation(s)
- Heping Zheng
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA; +1 434 243 6865; +1 434 243 2981
|
23
|
Morshed N, Echols N, Adams PD. Using support vector machines to improve elemental ion identification in macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 2015; 71:1147-58. [PMID: 25945580 PMCID: PMC4427199 DOI: 10.1107/s1399004715004241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/29/2014] [Accepted: 03/01/2015] [Indexed: 11/11/2022]
Abstract
In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
Affiliation(s)
- Nader Morshed
- College of Letters and Science, University of California, Berkeley, CA 94720, USA
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nathaniel Echols
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Paul D. Adams
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Department of Bioengineering, University of California, Berkeley, CA 94720, USA
|
24
|
Dauter Z, Wlodawer A, Minor W, Jaskolski M, Rupp B. Avoidable errors in deposited macromolecular structures: an impediment to efficient data mining. IUCrJ 2014; 1:179-93. [PMID: 25075337 PMCID: PMC4086436 DOI: 10.1107/s2052252514005442] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 01/24/2014] [Accepted: 03/10/2014] [Indexed: 05/20/2023]
Abstract
Whereas the vast majority of the more than 85 000 crystal structures of macromolecules currently deposited in the Protein Data Bank are of high quality, some suffer from a variety of imperfections. Although this fact has been pointed out in the past, it is still worth periodic updates so that the metadata obtained by global analysis of the available crystal structures, as well as the utilization of the individual structures for tasks such as drug design, should be based on only the most reliable data. Here, selected abnormal deposited structures have been analysed based on the Bayesian reasoning that the correctness of a model must be judged against both the primary evidence as well as prior knowledge. These structures, as well as information gained from the corresponding publications (if available), have emphasized some of the most prevalent types of common problems. The errors are often perfect illustrations of the nature of human cognition, which is frequently influenced by preconceptions that may lead to fanciful results in the absence of proper validation. Common errors can be traced to negligence and a lack of rigorous verification of the models against electron density, creation of non-parsimonious models, generation of improbable numbers, application of incorrect symmetry, illogical presentation of the results, or violation of the rules of chemistry and physics. Paying more attention to such problems, not only in the final validation stages but during the structure-determination process as well, is necessary not only in order to maintain the highest possible quality of the structural repositories and databases but most of all to provide a solid basis for subsequent studies, including large-scale data-mining projects. 
For many scientists PDB deposition is a rather infrequent event, so the need for proper training and supervision is emphasized, as well as the need for constant alertness of reason and critical judgment as absolutely necessary safeguarding measures against such problems. Ways of identifying more problematic structures are suggested so that their users may be properly alerted to their possible shortcomings.
Affiliation(s)
- Zbigniew Dauter
- Synchrotron Radiation Research Section, Macromolecular Crystallography Laboratory, NCI, Argonne National Laboratory, Argonne, IL 60439, USA
| | - Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, NCI at Frederick, Frederick, MD 21702, USA
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
- Midwest Center for Structural Genomics, USA
- New York Structural Genomics Consortium, USA
- Center for Structural Genomics of Infectious Diseases, USA
- Enzyme Function Initiative, USA
| | - Mariusz Jaskolski
- Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland
- Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Bernhard Rupp
- k.-k. Hofkristallamt, 991 Audrey Place, Vista, CA 92084, USA
- Department of Genetic Epidemiology, Innsbruck Medical University, Schöpfstrasse 41, A-6020 Innsbruck, Austria
|
25
|
Rodrigues JPGLM, Bonvin AMJJ. Integrative computational modeling of protein interactions. FEBS J 2014; 281:1988-2003. [DOI: 10.1111/febs.12771] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 10/18/2013] [Revised: 01/03/2014] [Accepted: 02/19/2014] [Indexed: 01/09/2023]
Affiliation(s)
- João P. G. L. M. Rodrigues
- Computational Structural Biology Group; Bijvoet Center for Biomolecular Research; Utrecht University; the Netherlands
| | - Alexandre M. J. J. Bonvin
- Computational Structural Biology Group; Bijvoet Center for Biomolecular Research; Utrecht University; the Netherlands
|
26
|
Abstract
INTRODUCTION X-ray crystallography plays an important role in structure-based drug design (SBDD), and accurate analysis of crystal structures of target macromolecules and macromolecule-ligand complexes is critical at all stages. However, whereas there has been significant progress in improving methods of structural biology, particularly in X-ray crystallography, corresponding progress in the development of computational methods (such as in silico high-throughput screening) is still on the horizon. Crystal structures can be overinterpreted and thus bias hypotheses and follow-up experiments. As in any experimental science, the models of macromolecular structures derived from X-ray diffraction data have their limitations, which need to be critically evaluated and well understood for structure-based drug discovery. AREAS COVERED This review describes how the validity, accuracy and precision of a protein or nucleic acid structure determined by X-ray crystallography can be evaluated from three different perspectives: i) the nature of the diffraction experiment; ii) the interpretation of an electron density map; and iii) the interpretation of the structural model in terms of function and mechanism. The strategies to optimally exploit a macromolecular structure are also discussed in the context of 'Big Data' analysis, biochemical experimental design and structure-based drug discovery. EXPERT OPINION Although X-ray crystallography is one of the most detailed 'microscopes' available today for examining macromolecular structures, the authors would like to re-emphasize that such structures are only simplified models of the target macromolecules. The authors also wish to reinforce the idea that a structure should not be thought of as a set of precise coordinates but rather as a framework for generating hypotheses to be explored. 
Numerous biochemical and biophysical experiments, including new diffraction experiments, can and should be performed to verify or falsify these hypotheses. X-ray crystallography will find its future application in drug discovery by the development of specific tools that would allow realistic interpretation of the outcome coordinates and/or support testing of these hypotheses.
Affiliation(s)
- Heping Zheng
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Data Analytics and Data Mining, Research Scientist
| | - Jing Hou
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Function Initiative (EFI), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Research Associate
| | - Matthew D Zimmerman
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Function Initiative (EFI), USA
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Protein Crystallography, Data Mining and Management, Instructor of Research
| | - Alexander Wlodawer
- National Cancer Institute, Center for Cancer Research, Frederick, MD 21702, USA
- Specializes in Macromolecular Structure and Function, Chief of the Macromolecular Crystallography Laboratory
| | - Wladek Minor
- University of Virginia, Department of Molecular Physiology and Biological Physics, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
- Center for Structural Genomics of Infectious Diseases (CSGID)
- Enzyme Function Initiative (EFI), USA
- Midwest Center for Structural Genomics (MCSG), USA
- New York Structural Genomics Research Consortium (NYSGRC), USA
- Specializes in Structural Biology, Data Mining and Management, Professor
|
27
|
Domagalski MJ, Zheng H, Zimmerman MD, Dauter Z, Wlodawer A, Minor W. The quality and validation of structures from structural genomics. Methods Mol Biol 2014; 1091:297-314. [PMID: 24203341 DOI: 10.1007/978-1-62703-691-7_21] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/24/2022]
Abstract
Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein-ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement.
Affiliation(s)
- Marcin J Domagalski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
|
28
|
Validation of metal-binding sites in macromolecular structures with the CheckMyMetal web server. Nat Protoc 2013; 9:156-70. [PMID: 24356774 DOI: 10.1038/nprot.2013.172] [Citation(s) in RCA: 226] [Impact Index Per Article: 20.5] [Indexed: 12/23/2022]
Abstract
Metals have vital roles in both the mechanism and architecture of biological macromolecules. Yet structures of metal-containing macromolecules in which metals are misidentified and/or suboptimally modeled are abundant in the Protein Data Bank (PDB). This shows the need for a diagnostic tool to identify and correct such modeling problems with metal-binding environments. The CheckMyMetal (CMM) web server (http://csgid.org/csgid/metal_sites/) is a sophisticated, user-friendly web-based method to evaluate metal-binding sites in macromolecular structures using parameters derived from 7,350 metal-binding sites observed in a benchmark data set of 2,304 high-resolution crystal structures. The protocol outlines how the CMM server can be used to detect geometric and other irregularities in the structures of metal-binding sites, as well as how it can alert researchers to potential errors in metal assignment. The protocol also gives practical guidelines for correcting problematic sites by modifying the metal-binding environment and/or redefining metal identity in the PDB file. Several examples where this has led to meaningful results are described in the ANTICIPATED RESULTS section. CMM was designed for a broad audience--biomedical researchers studying metal-containing proteins and nucleic acids--but it is equally well suited for structural biologists validating new structures during modeling or refinement. The CMM server takes the coordinates of a metal-containing macromolecule structure in the PDB format as input and responds within a few seconds for a typical protein structure with 2-5 metal sites and a few hundred amino acids.
29
Laitaoja M, Valjakka J, Jänis J. Zinc coordination spheres in protein structures. Inorg Chem 2013; 52:10983-91. [PMID: 24059258 DOI: 10.1021/ic401072d] [Citation(s) in RCA: 166] [Impact Index Per Article: 15.1] [Indexed: 12/12/2022]
Abstract
Zinc metalloproteins are one of the most abundant and structurally diverse proteins in nature. In these proteins, the Zn(II) ion possesses a multifunctional role as it stabilizes the fold of small zinc fingers, catalyzes essential reactions in enzymes of all six classes, or assists in the formation of biological oligomers. Previously, a number of database surveys have been conducted on zinc proteins to gain broader insights into their rich coordination chemistry. However, many of these surveys suffer from severe flaws and misinterpretations or are otherwise limited. To provide a more comprehensive, up-to-date picture on zinc coordination environments in proteins, zinc containing protein structures deposited in the Protein Data Bank (PDB) were analyzed in detail. A statistical analysis in terms of zinc coordinating amino acids, metal-to-ligand bond lengths, coordination number, and structural classification was performed, revealing coordination spheres from classical tetrahedral cysteine/histidine binding sites to more complex binuclear sites with carboxylated lysine residues. According to the results, coordination spheres of hundreds of crystal structures in the PDB could be misinterpreted due to symmetry-related molecules or missing electron densities for ligands. The analysis also revealed increasing average metal-to-ligand bond length as a function of crystallographic resolution, which should be taken into account when interrogating metal ion binding sites. Moreover, one-third of the zinc ions present in crystal structures are artifacts, merely aiding crystal formation and packing with no biological significance. Our analysis provides solid evidence that a minimal stable zinc coordination sphere is made up by four ligands and adopts a tetrahedral coordination geometry.
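As a rough illustration of the statistics this abstract refers to (coordination number and metal-to-ligand bond lengths), the sketch below counts the donor atoms within a distance cutoff of a single metal ion and averages their distances. The function name, the 2.8 Å cutoff, and the toy coordinates are assumptions made for illustration; they are not values or code taken from the paper.

```python
import math

def coordination_sphere(metal, ligand_atoms, cutoff=2.8):
    """Return (coordination_number, mean_bond_length) for one metal ion.

    metal: (x, y, z) position in angstroms.
    ligand_atoms: iterable of (x, y, z) candidate donor atoms.
    cutoff: assumed distance cutoff in angstroms (illustrative only).
    """
    distances = [math.dist(metal, atom) for atom in ligand_atoms]
    bonded = [d for d in distances if d <= cutoff]
    if not bonded:
        return 0, None
    return len(bonded), sum(bonded) / len(bonded)

# Toy tetrahedral site: four donors 2.0 A from the ion, plus one distant atom.
s = 2.0 / math.sqrt(3)
donors = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s), (5.0, 0.0, 0.0)]
cn, mean_len = coordination_sphere((0.0, 0.0, 0.0), donors)
print(cn, round(mean_len, 2))
```

In this toy site the four close donors are counted and the distant atom is ignored, which mirrors the four-ligand tetrahedral sphere the abstract describes as the minimal stable arrangement.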
Affiliation(s)
- Mikko Laitaoja
- University of Eastern Finland, Department of Chemistry, P.O. Box 111, FI-80101 Joensuu, Finland
30
Wlodawer A, Minor W, Dauter Z, Jaskolski M. Protein crystallography for aspiring crystallographers or how to avoid pitfalls and traps in macromolecular structure determination. FEBS J 2013; 280:5705-36. [PMID: 24034303 DOI: 10.1111/febs.12495] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 07/16/2013] [Revised: 08/12/2013] [Accepted: 08/20/2013] [Indexed: 12/28/2022]
Abstract
The number of macromolecular structures deposited in the Protein Data Bank now approaches 100,000, with the vast majority of them determined by crystallographic methods. Thousands of papers describing such structures have been published in the scientific literature, and 20 Nobel Prizes in chemistry or medicine have been awarded for discoveries based on macromolecular crystallography. New hardware and software tools have made crystallography appear to be an almost routine (but still far from being analytical) technique and many structures are now being determined by scientists with very limited experience in the practical aspects of the field. However, this apparent ease is sometimes illusory and proper procedures need to be followed to maintain high standards of structure quality. In addition, many noncrystallographers may have problems with the critical evaluation and interpretation of structural results published in the scientific literature. The present review provides an outline of the technical aspects of crystallography for less experienced practitioners, as well as information that might be useful for users of macromolecular structures, aiming to show them how to interpret (but not overinterpret) the information present in the coordinate files and in their description. A discussion of the extent of information that can be gleaned from the atomic coordinates of structures solved at different resolution is provided, as well as problems and pitfalls encountered in structure determination and interpretation.
Affiliation(s)
- Alexander Wlodawer
- Protein Structure Section, Macromolecular Crystallography Laboratory, NCI at Frederick, Frederick, MD, USA
31
Parisien M, Wang X, Perdrizet G, Lamphear C, Fierke CA, Maheshwari KC, Wilde MJ, Sosnick TR, Pan T. Discovering RNA-protein interactome by using chemical context profiling of the RNA-protein interface. Cell Rep 2013; 3:1703-13. [PMID: 23665222 DOI: 10.1016/j.celrep.2013.04.010] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 06/12/2012] [Revised: 03/04/2013] [Accepted: 04/12/2013] [Indexed: 02/04/2023] Open
Abstract
RNA-protein (RNP) interactions generally are required for RNA function. At least 5% of human genes code for RNA-binding proteins. Whereas many approaches can identify the RNA partners for a specific protein, finding the protein partners for a specific RNA is difficult. We present a machine-learning method that scores a protein's binding potential for an RNA structure by utilizing the chemical context profiles of the interface from known RNP structures. Our approach is applicable even when only a single RNP structure is available. We examined 801 mammalian proteins and found that 37 (4.6%) potentially bind transfer RNA (tRNA). Most are enzymes involved in cellular processes unrelated to translation and were not known to interact with RNA. We experimentally tested six positive and three negative predictions for tRNA binding in vivo, and all nine predictions were correct. Our computational approach provides a powerful complement to experiments in discovering new RNPs.
Affiliation(s)
- Marc Parisien
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637, USA
32
Yamaguchi H, Maeki M, Yamashita K, Nakamura H, Miyazaki M, Maeda H. Controlling one protein crystal growth by droplet-based microfluidic system. J Biochem 2013; 153:339-46. [DOI: 10.1093/jb/mvt001] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022]
33
Berman HM. Creating a community resource for protein science. Protein Sci 2012; 21:1587-96. [PMID: 22969036 PMCID: PMC3527698 DOI: 10.1002/pro.2154] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/09/2012] [Accepted: 08/30/2012] [Indexed: 12/13/2022]
Abstract
In addition to being one of the early pioneers in protein crystallography, Carl Brändén made significant contributions to science education with his elegant and beautifully illustrated book Introduction to Protein Structure (Brändén and Tooze, New York: Garland, 1991). It is truly an honor to receive this award in their names. This award and the 40th anniversary of the Protein Data Bank (PDB; Berman et al., Structure 2012;20:391-396) have given me an opportunity to reflect on the various components that have contributed to building a resource for protein science and to try to quantify the impact of having PDB data openly available.
Affiliation(s)
- Helen M Berman
- Department of Chemistry and Chemical Biology, Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, USA.
34
Ellingson L, Zhang J. Protein surface matching by combining local and global geometric information. PLoS One 2012; 7:e40540. [PMID: 22815760 PMCID: PMC3398928 DOI: 10.1371/journal.pone.0040540] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/30/2011] [Accepted: 06/12/2012] [Indexed: 01/01/2023] Open
Abstract
Comparison of the binding sites of proteins is an effective means for predicting protein functions based on their structure information. Despite the importance of this problem and much research in the past, it is still very challenging to predict the binding ligands from the atomic structures of protein binding sites. Here, we designed a new algorithm, TIPSA (Triangulation-based Iterative-closest-point for Protein Surface Alignment), based on the iterative closest point (ICP) algorithm. TIPSA aims to find the maximum number of atoms that can be superposed between two protein binding sites, where any pair of superposed atoms has a distance smaller than a given threshold. The search starts from similar tetrahedra between two binding sites obtained from 3D Delaunay triangulation and uses the Hungarian algorithm to find additional matched atoms. We found that, due to the plasticity of protein binding sites, matching the rigid body of point clouds of protein binding sites is not adequate for satisfactory binding ligand prediction. We further incorporated global geometric information, the radius of gyration of binding site atoms, and used nearest neighbor classification for binding site prediction. Tested on benchmark data, our method achieved a performance comparable to the best methods in the literature, while simultaneously providing the common atom set and atom correspondences.
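Two of the quantities named in this abstract are simple to state in code: the radius of gyration of the binding-site atoms (the global descriptor) and the number of paired atoms that lie within a distance threshold after superposition (the matching score). The sketch below is not TIPSA: it assumes the two sites are already superposed and that the atom pairing is supplied by the caller, whereas the paper obtains both through Delaunay triangulation, ICP, and the Hungarian algorithm. Function names and the 1.0 Å default threshold are hypothetical.

```python
import math

def radius_of_gyration(points):
    """Unweighted radius of gyration of a set of (x, y, z) points."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in points) / n)

def count_superposed(site_a, site_b, pairs, threshold=1.0):
    """Count paired atoms closer than `threshold` (angstroms, assumed value).

    pairs: list of (index_in_a, index_in_b) correspondences, assumed to be
    given; finding them is the hard part and is not attempted here.
    """
    return sum(
        1 for i, j in pairs if math.dist(site_a[i], site_b[j]) < threshold
    )

# Toy example: two atoms per site, only the first pair is well superposed.
site_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
site_b = [(0.1, 0.0, 0.0), (3.0, 0.0, 0.0)]
matched = count_superposed(site_a, site_b, [(0, 0), (1, 1)])
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
print(matched, rg)
```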
Affiliation(s)
- Leif Ellingson
- Department of Mathematics and Statistics, Texas Tech University, Lubbock, Texas, United States of America
- Jinfeng Zhang
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
35
Alvarez MA, Yan C. A new protein graph model for function prediction. Comput Biol Chem 2012; 37:6-10. [PMID: 22381922 DOI: 10.1016/j.compbiolchem.2012.01.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/11/2011] [Revised: 01/02/2012] [Accepted: 01/04/2012] [Indexed: 11/27/2022]
Abstract
As several structural proteomic projects are producing an increasing number of protein structures with unknown function, methods that can reliably predict protein functions from protein structures are in urgent need. In this paper, we present a method to explore the clustering patterns of amino acids on the 3-dimensional space for protein function prediction. First, amino acid residues on a protein structure are clustered into spatial groups using hierarchical agglomerative clustering, based on the distance between them. Second, the protein structure is represented using a graph, where each node denotes a cluster of amino acids. The nodes are labeled with an evolutionary profile derived from the multiple alignment of homologous sequences. Then, a shortest-path graph kernel is used to calculate similarities between the graphs. Finally, a support vector machine using this graph kernel is used to train classifiers for protein function prediction. We applied the proposed method to two separate problems, namely, prediction of enzymes and prediction of DNA-binding proteins. In both cases, the results showed that the proposed method outperformed other state-of-the-art methods.
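The shortest-path graph kernel step mentioned in this abstract can be sketched as follows. This is a simplified version under stated assumptions: nodes carry discrete string labels rather than the evolutionary profiles used in the paper, edges are unweighted, and two paths match only if their endpoint labels and lengths are identical. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def shortest_path_lengths(n, edges):
    """All-pairs shortest path lengths (Floyd-Warshall, unit edge weights)."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def path_features(labels, edges):
    """Count (label, label, length) triples over all connected node pairs."""
    dist = shortest_path_lengths(len(labels), edges)
    feats = Counter()
    for u, v in combinations(range(len(labels)), 2):
        if dist[u][v] != float("inf"):
            a, b = sorted((labels[u], labels[v]))
            feats[(a, b, dist[u][v])] += 1
    return feats

def shortest_path_kernel(g1, g2):
    """Kernel value: number of matching shortest-path pairs between graphs."""
    f1, f2 = path_features(*g1), path_features(*g2)
    return sum(count * f2[key] for key, count in f1.items())

# Each graph is (node_labels, undirected_edges); nodes stand for clusters.
path = (["A", "B", "A"], [(0, 1), (1, 2)])
triangle = (["A", "B", "A"], [(0, 1), (1, 2), (0, 2)])
print(shortest_path_kernel(path, path), shortest_path_kernel(path, triangle))
```

A matrix of such kernel values between all pairs of protein graphs is what a kernel-based classifier such as a support vector machine would consume.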
Affiliation(s)
- Marco A Alvarez
- Department of Computer Science, Utah State University, Logan, UT 84322, USA
36
Overton IM, Barton GJ. Computational approaches to selecting and optimising targets for structural biology. Methods 2011; 55:3-11. [PMID: 21906678 PMCID: PMC3202631 DOI: 10.1016/j.ymeth.2011.08.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/27/2011] [Revised: 08/18/2011] [Accepted: 08/22/2011] [Indexed: 11/29/2022] Open
Abstract
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline. Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.
Affiliation(s)
- Ian M Overton
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom.
37
Daniel E, Lin B, Diprose JM, Griffiths SL, Morris C, Berry IM, Owens RJ, Blake R, Wilson KS, Stuart DI, Esnouf RM. xtalPiMS: a PiMS-based web application for the management and monitoring of crystallization trials. J Struct Biol 2011; 175:230-5. [PMID: 21605683 PMCID: PMC3477317 DOI: 10.1016/j.jsb.2011.05.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/14/2011] [Revised: 04/29/2011] [Accepted: 05/07/2011] [Indexed: 11/29/2022]
Abstract
A major advance in protein structure determination has been the advent of nanolitre-scale crystallization and (in a high-throughput environment) the development of robotic systems for storing and imaging crystallization trials. Most of these trials are carried out in 96-well (or higher density) plates and managing them is a significant information management challenge. We describe xtalPiMS, a web-based application for the management and monitoring of crystallization trials. xtalPiMS has a user-interface layer based on the standards of the Protein Information Management System (PiMS) and a database layer which links the crystallization trial images to the meta-data associated with a particular crystallization trial. The user interface has been optimized for the efficient monitoring of high-throughput environments with three different automated imagers and work to support a fourth imager is in progress, but it can even be of use without robotics. The database can either be a PiMS database or a legacy database for which a suitable mapping layer has been developed.
Affiliation(s)
- Ed Daniel
- CSED, STFC Daresbury Laboratory, Warrington WA4 4AD, UK
- Bill Lin
- CSED, STFC Daresbury Laboratory, Warrington WA4 4AD, UK
- Jonathan M. Diprose
- Division of Structural Biology, University of Oxford, Henry Wellcome Building for Genomic Medicine, Roosevelt Drive, Oxford OX3 7BN, UK
- The Oxford Protein Production Facility UK, Research Complex at Harwell, Rutherford Appleton Laboratory, R92, Harwell Oxford, Didcot OX11 0FA, UK
- Susanne L. Griffiths
- York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, UK
- Chris Morris
- CSED, STFC Daresbury Laboratory, Warrington WA4 4AD, UK
- Ian M. Berry
- Division of Structural Biology, University of Oxford, Henry Wellcome Building for Genomic Medicine, Roosevelt Drive, Oxford OX3 7BN, UK
- Raymond J. Owens
- Division of Structural Biology, University of Oxford, Henry Wellcome Building for Genomic Medicine, Roosevelt Drive, Oxford OX3 7BN, UK
- The Oxford Protein Production Facility UK, Research Complex at Harwell, Rutherford Appleton Laboratory, R92, Harwell Oxford, Didcot OX11 0FA, UK
- Richard Blake
- CSED, STFC Daresbury Laboratory, Warrington WA4 4AD, UK
- Keith S. Wilson
- York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, UK
- David I. Stuart
- Division of Structural Biology, University of Oxford, Henry Wellcome Building for Genomic Medicine, Roosevelt Drive, Oxford OX3 7BN, UK
- Diamond Light Source Ltd., Diamond House, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
- Robert M. Esnouf
- Division of Structural Biology, University of Oxford, Henry Wellcome Building for Genomic Medicine, Roosevelt Drive, Oxford OX3 7BN, UK
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
38
Cooper DR, Porebski PJ, Chruszcz M, Minor W. X-ray crystallography: Assessment and validation of protein-small molecule complexes for drug discovery. Expert Opin Drug Discov 2011; 6:771-782. [PMID: 21779303 PMCID: PMC3138648 DOI: 10.1517/17460441.2011.585154] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 11/05/2022]
Abstract
INTRODUCTION: Crystallography is the key initial component for structure-based and fragment-based drug design and can often generate leads that can be developed into high potency drugs. Therefore, huge sums of money are committed based on the outcome of crystallography experiments and their interpretation. AREAS COVERED: This review discusses how to evaluate the correctness of an X-ray structure, focusing on the validation of small molecule-protein complexes. Various types of inaccuracies found within the PDB are identified and the ramifications of these errors are discussed. The reader will gain an understanding of the key parameters that need to be inspected before a structure can be used in drug discovery efforts, as well as an appreciation of the difficulties of correctly interpreting electron density for small molecules. The reader will also be introduced to methods for validating small molecules within the context of a macromolecular structure. EXPERT OPINION: One of the reasons that ligand identification and positioning within a macromolecular crystal structure is so difficult is that the quality of small molecules varies widely in the PDB. For this reason, the PDB cannot always be considered a reliable repository of structural information pertaining to small molecules, and this makes the derivation of general principles that govern small molecule-protein interactions more difficult.
Affiliation(s)
- David R Cooper
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
39
Julfayev ES, McLaughlin RJ, Tao YP, McLaughlin WA. A new approach to assess and predict the functional roles of proteins across all known structures. J Struct Funct Genomics 2011; 12:9-20. [PMID: 21445639 PMCID: PMC3089730 DOI: 10.1007/s10969-011-9105-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/17/2010] [Accepted: 03/14/2011] [Indexed: 12/11/2022]
Abstract
The three-dimensional atomic structures of proteins provide information regarding their function, and codified relationships between structure and function enable the assessment of function from structure. In the current study, a new data mining tool was implemented that checks current gene ontology (GO) annotations and predicts new ones across all the protein structures available in the Protein Data Bank (PDB). The tool overcomes some of the challenges of utilizing large amounts of protein annotation and measurement information to form correspondences between protein structure and function. Protein attributes were extracted from the Structural Biology Knowledgebase and open source biological databases. Based on the presence or absence of a given set of attributes, a given protein's functional annotations were inferred. The results show that attributes derived from the three-dimensional structures of proteins enhanced predictions over those using attributes derived only from primary amino acid sequence. Some predictions reflected known but not completely documented GO annotations. For example, predictions for the GO term for copper ion binding reflected cases in which a copper ion was known to interact with the protein based on information in a ligand interaction database. Other predictions were novel and require further experimental validation. These include predictions for proteins labeled as unknown function in the PDB. Two examples are a role in the regulation of transcription for the protein AF1396 from Archaeoglobus fulgidus and a role in RNA metabolism for the protein psuG from Thermotoga maritima.
Affiliation(s)
- Elchin S. Julfayev
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Ryan J. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Yi-Ping Tao
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854-8087 USA
- William A. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA