1
Alharbi E, Calinescu R, Cowtan K. Buccaneer model building with neural network fragment selection. Acta Crystallogr D Struct Biol 2023; 79:326-338. [PMID: 36974965 PMCID: PMC10071564 DOI: 10.1107/s205979832300181x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Accepted: 02/27/2023] [Indexed: 03/29/2023]
Abstract
Tracing the backbone is a critical step in protein model building, as incorrect tracing leads to poor protein models. Here, a neural network trained to identify unfavourable fragments and remove them from the model-building process in order to improve backbone tracing is presented. Moreover, a decision tree was trained to select an optimal threshold to eliminate unfavourable fragments. The neural network was tested on experimental phasing data sets from the Joint Center for Structural Genomics (JCSG), recently deposited experimental phasing data sets (from 2015 to 2021) and molecular-replacement data sets. The experimental results show that using the neural network in the Buccaneer protein-model-building software can produce significantly more complete protein models than those built using Buccaneer alone. In particular, Buccaneer with the neural network built protein models with a completeness that was at least 5% higher for 25% and 50% of the original and truncated resolution JCSG experimental phasing data sets, respectively, for 28% of the recently collected experimental phasing data sets and for 43% of the molecular-replacement data sets.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
2
Krissinel E, Lebedev AA, Uski V, Ballard CB, Keegan RM, Kovalevskiy O, Nicholls RA, Pannu NS, Skubák P, Berrisford J, Fando M, Lohkamp B, Wojdyr M, Simpkin AJ, Thomas JMH, Oliver C, Vonrhein C, Chojnowski G, Basle A, Purkiss A, Isupov MN, McNicholas S, Lowe E, Triviño J, Cowtan K, Agirre J, Rigden DJ, Uson I, Lamzin V, Tews I, Bricogne G, Leslie AGW, Brown DG. CCP4 Cloud for structure determination and project management in macromolecular crystallography. Acta Crystallogr D Struct Biol 2022; 78:1079-1089. [PMID: 36048148 PMCID: PMC9435598 DOI: 10.1107/s2059798322007987] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/18/2022] [Accepted: 08/08/2022] [Indexed: 11/10/2022]
Abstract
Nowadays, progress in the determination of three-dimensional macromolecular structures from diffraction images is achieved partly at the cost of increasing data volumes. This is due to the deployment of modern high-speed, high-resolution detectors, the increased complexity and variety of crystallographic software, the use of extensive databases and high-performance computing. This limits what can be accomplished with personal, offline, computing equipment in terms of both productivity and maintainability. There is also an issue of long-term data maintenance and availability of structure-solution projects as the links between experimental observations and the final results deposited in the PDB. In this article, CCP4 Cloud, a new front-end of the CCP4 software suite, is presented which mitigates these effects by providing an online, cloud-based environment for crystallographic computation. CCP4 Cloud was developed for the efficient delivery of computing power, database services and seamless integration with web resources. It provides a rich graphical user interface that allows project sharing and long-term storage for structure-solution projects, and can be linked to data-producing facilities. The system is distributed with the CCP4 software suite version 7.1 and higher, and an online publicly available instance of CCP4 Cloud is provided by CCP4.
3
Olek M, Cowtan K, Webb D, Chaban Y, Zhang P. IceBreaker: Software for high-resolution single-particle cryo-EM with non-uniform ice. Structure 2022; 30:522-531.e4. [PMID: 35150604 PMCID: PMC9033277 DOI: 10.1016/j.str.2022.01.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/04/2021] [Revised: 12/01/2021] [Accepted: 01/18/2022] [Indexed: 12/23/2022]
Abstract
Despite the abundance of available software tools, optimal particle selection is still a vital issue in single-particle cryoelectron microscopy (cryo-EM). Regardless of the method used, most pickers struggle when ice thickness varies on a micrograph. IceBreaker allows users to estimate the relative ice gradient and flatten it by equalizing the local contrast. It allows the differentiation of particles from the background and improves overall particle picking performance. Furthermore, we introduce an additional parameter corresponding to local ice thickness for each particle. Particles with a defined ice thickness can be grouped and filtered based on this parameter during processing. These functionalities are especially valuable for on-the-fly processing to automatically pick as many particles as possible from each micrograph and to select optimal regions for data collection. Finally, estimated ice gradient distributions can be stored separately and used to inspect the quality of prepared samples.
Affiliation(s)
- Mateusz Olek
  - Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
  - Department of Chemistry, University of York, York, UK
- Kevin Cowtan
  - Department of Chemistry, University of York, York, UK
- Donovan Webb
  - Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
- Yuriy Chaban
  - Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
- Peijun Zhang
  - Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
  - Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
  - Chinese Academy of Medical Sciences Oxford Institute, University of Oxford, Oxford OX3 7BN, UK
4
Joseph AP, Olek M, Malhotra S, Zhang P, Cowtan K, Burnley T, Winn MD. Atomic model validation using the CCP-EM software suite. Acta Crystallogr D Struct Biol 2022; 78:152-161. [PMID: 35102881 PMCID: PMC8805302 DOI: 10.1107/s205979832101278x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/29/2021] [Accepted: 12/01/2021] [Indexed: 12/02/2022]
Abstract
Recently, there has been a dramatic improvement in the quality and quantity of data derived using cryogenic electron microscopy (cryo-EM). This is also associated with a large increase in the number of atomic models built. Although the best resolutions that are achievable are improving, often the local resolution is variable, and a significant majority of data are still resolved at resolutions worse than 3 Å. Model building and refinement is often challenging at these resolutions, and hence atomic model validation becomes even more crucial to identify less reliable regions of the model. Here, a graphical user interface for atomic model validation, implemented in the CCP-EM software suite, is presented. The aim is to develop this into a platform where users can access multiple complementary validation metrics that work across a range of resolutions and obtain a summary of evaluations. Based on the validation estimates from atomic models associated with cryo-EM structures from SARS-CoV-2, it was observed that models typically favor adopting the most common conformations over fitting the observations when compared with the model agreement with data. At low resolutions, the stereochemical quality may be favored over data fit, but care should be taken to ensure that the model agrees with the data in terms of resolvable features. It is demonstrated that further re-refinement can lead to improvement of the agreement with data without the loss of geometric quality. This also highlights the need for improved resolution-dependent weight optimization in model refinement and an effective test for overfitting that would help to guide the refinement process.
Affiliation(s)
- Agnel Praveen Joseph
  - Scientific Computing Department, Science and Technology Facilities Council, Didcot, United Kingdom
- Mateusz Olek
  - Department of Chemistry, University of York, York, United Kingdom
  - Electron BioImaging Center, Diamond Light Source, Rutherford Appleton Laboratory, Didcot, United Kingdom
- Sony Malhotra
  - Scientific Computing Department, Science and Technology Facilities Council, Didcot, United Kingdom
- Peijun Zhang
  - Electron BioImaging Center, Diamond Light Source, Rutherford Appleton Laboratory, Didcot, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, York, United Kingdom
- Tom Burnley
  - Scientific Computing Department, Science and Technology Facilities Council, Didcot, United Kingdom
- Martyn D. Winn
  - Scientific Computing Department, Science and Technology Facilities Council, Didcot, United Kingdom
5
Alharbi E, Bond P, Calinescu R, Cowtan K. Predicting the performance of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2021; 77:1591-1601. [PMID: 34866614 PMCID: PMC8647178 DOI: 10.1107/s2059798321010500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Accepted: 10/10/2021] [Indexed: 12/02/2022]
Abstract
Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
  - Department of Information Technology, University of Tabuk, Tabuk, Saudi Arabia
- Paul Bond
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
6
Cowtan K, Bond P, Hoh S. Macromolecular refinement at any resolution using shift field optimization and regularization. Acta Crystallogr A Found Adv 2021. [DOI: 10.1107/s0108767321089248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
7
Lawson CL, Kryshtafovych A, Adams PD, Afonine PV, Baker ML, Barad BA, Bond P, Burnley T, Cao R, Cheng J, Chojnowski G, Cowtan K, Dill KA, DiMaio F, Farrell DP, Fraser JS, Herzik MA, Hoh SW, Hou J, Hung LW, Igaev M, Joseph AP, Kihara D, Kumar D, Mittal S, Monastyrskyy B, Olek M, Palmer CM, Patwardhan A, Perez A, Pfab J, Pintilie GD, Richardson JS, Rosenthal PB, Sarkar D, Schäfer LU, Schmid MF, Schröder GF, Shekhar M, Si D, Singharoy A, Terashi G, Terwilliger TC, Vaiana A, Wang L, Wang Z, Wankowicz SA, Williams CJ, Winn M, Wu T, Yu X, Zhang K, Berman HM, Chiu W. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat Methods 2021; 18:156-164. [PMID: 33542514 PMCID: PMC7864804 DOI: 10.1038/s41592-020-01051-w] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 06/11/2020] [Accepted: 12/21/2020] [Indexed: 01/30/2023]
Abstract
This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.
Affiliation(s)
- Catherine L. Lawson
  - Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Andriy Kryshtafovych
  - Genome Center, University of California, Davis, CA USA
- Paul D. Adams
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA
  - Department of Bioengineering, University of California Berkeley, Berkeley, CA USA
- Pavel V. Afonine
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA
- Matthew L. Baker
  - Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX USA
- Benjamin A. Barad
  - Department of Integrated Computational Structural Biology, The Scripps Research Institute, La Jolla, CA USA
- Paul Bond
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Tom Burnley
  - Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Renzhi Cao
  - Department of Computer Science, Pacific Lutheran University, Tacoma, WA USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Grzegorz Chojnowski
  - European Molecular Biology Laboratory, c/o DESY, Hamburg, Germany
- Kevin Cowtan
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Ken A. Dill
  - Laufer Center, Stony Brook University, Stony Brook, NY USA
- Frank DiMaio
  - Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
- Daniel P. Farrell
  - Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
- James S. Fraser
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA
- Mark A. Herzik
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA USA
- Soon Wen Hoh
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Jie Hou
  - Department of Computer Science, Saint Louis University, St. Louis, MO USA
- Li-Wei Hung
  - Los Alamos National Laboratory, Los Alamos, NM USA
- Maxim Igaev
  - Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Agnel P. Joseph
  - Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN USA
  - Department of Computer Science, Purdue University, West Lafayette, IN USA
- Dilip Kumar
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX USA
- Sumit Mittal
  - Biodesign Institute, Arizona State University, Tempe, AZ USA
  - School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
- Bohdan Monastyrskyy
  - Genome Center, University of California, Davis, CA USA
- Mateusz Olek
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Colin M. Palmer
  - Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Ardan Patwardhan
  - The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Alberto Perez
  - Department of Chemistry, University of Florida, Gainesville, FL USA
- Jonas Pfab
  - Division of Computing & Software Systems, University of Washington, Bothell, WA USA
- Grigore D. Pintilie
  - Department of Bioengineering, Stanford University, Stanford, CA USA
- Jane S. Richardson
  - Department of Biochemistry, Duke University, Durham, NC USA
- Peter B. Rosenthal
  - Structural Biology of Cells and Viruses Laboratory, Francis Crick Institute, London, UK
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, IN USA
  - Biodesign Institute, Arizona State University, Tempe, AZ USA
- Luisa U. Schäfer
  - Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Michael F. Schmid
  - Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
- Gunnar F. Schröder
  - Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
  - Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mrinal Shekhar
  - Biodesign Institute, Arizona State University, Tempe, AZ USA
  - Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA USA
- Dong Si
  - Division of Computing & Software Systems, University of Washington, Bothell, WA USA
- Abishek Singharoy
  - Biodesign Institute, Arizona State University, Tempe, AZ USA
- Genki Terashi
  - Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Andrea Vaiana
  - Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Liguo Wang
  - Department of Biological Structure, University of Washington, Seattle, WA USA
- Zhe Wang
  - The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Stephanie A. Wankowicz
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA
  - Biophysics Graduate Program, University of California, San Francisco, CA USA
- Martyn Winn
  - Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Tianqi Wu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Xiaodi Yu
  - SMPS, Janssen Research and Development, Spring House, PA USA
- Kaiming Zhang
  - Department of Bioengineering, Stanford University, Stanford, CA USA
- Helen M. Berman
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ USA
  - Department of Biological Sciences and Bridge Institute, University of Southern California, Los Angeles, CA USA
- Wah Chiu
  - Department of Bioengineering, Stanford University, Stanford, CA USA
  - Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
8
Cowtan K, Metcalfe S, Bond P. Shift-field refinement of macromolecular atomic models. Acta Crystallogr D Struct Biol 2020; 76:1192-1200. [PMID: 33263325 PMCID: PMC7709196 DOI: 10.1107/s2059798320013170] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/08/2020] [Accepted: 09/29/2020] [Indexed: 11/11/2022]
Abstract
The aim of crystallographic structure solution is typically to determine an atomic model which accurately accounts for an observed diffraction pattern. A key step in this process is the refinement of the parameters of an initial model, which is most often determined by molecular replacement using another structure which is broadly similar to the structure of interest. In macromolecular crystallography, the resolution of the data is typically insufficient to determine the positional and uncertainty parameters for each individual atom, and so stereochemical information is used to supplement the observational data. Here, a new approach to refinement is evaluated in which a 'shift field' is determined which describes changes to model parameters affecting whole regions of the model rather than individual atoms only, with the size of the affected region being a key parameter of the calculation which can be changed in accordance with the resolution of the data. It is demonstrated that this approach can improve the radius of convergence of the refinement calculation while also dramatically reducing the calculation time.
Affiliation(s)
- K Cowtan
  - Department of Chemistry, University of York, York, United Kingdom
- S Metcalfe
  - Department of Mechanical Engineering, McGill University, Montréal, Canada
- P Bond
  - Department of Chemistry, University of York, York, United Kingdom
9
Cowtan K. Structural barriers to scientific progress. Acta Crystallogr D Struct Biol 2020; 76:908-911. [PMID: 33021492 PMCID: PMC7543655 DOI: 10.1107/s2059798320011201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2020] [Accepted: 08/15/2020] [Indexed: 11/11/2022]
Abstract
Structural biases, which are intrinsic in the social structures in which we function, play a key role in maintaining boundaries between traditionally privileged and underprivileged groups; however, they are particularly difficult to identify from within those societies. Two instances are highlighted in which the social structures of science appear to have discouraged collaboration, to the disadvantage of software and data users. Possible links are suggested to the strongly hierarchical structure of science and other factors which may in turn also serve to maintain sex and/or gender disparities in participation in the scientific endeavour.
Affiliation(s)
- K Cowtan
  - Department of Chemistry, University of York, York YO10 5DD, United Kingdom
10
Alharbi E, Calinescu R, Cowtan K. Pairwise running of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2020; 76:814-823. [PMID: 32876057 PMCID: PMC7466752 DOI: 10.1107/s2059798320010542] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/11/2020] [Accepted: 07/31/2020] [Indexed: 11/11/2022]
Abstract
For the last two decades, researchers have worked independently to automate protein model building, and four widely used software pipelines have been developed for this purpose: ARP/wARP, Buccaneer, Phenix AutoBuild and SHELXE. Here, the usefulness of combining these pipelines to improve the built protein structures by running them in pairwise combinations is examined. The results show that integrating these pipelines can lead to significant improvements in structure completeness and Rfree. In particular, running Phenix AutoBuild after Buccaneer improved structure completeness for 29% and 75% of the data sets that were examined at the original resolution and at a simulated lower resolution, respectively, compared with running Phenix AutoBuild on its own. In contrast, Phenix AutoBuild alone produced better structure completeness than the two pipelines combined for only 7% and 3% of these data sets.
Affiliation(s)
- Emad Alharbi
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
  - Department of Information Technology, University of Tabuk, Tabuk, Saudi Arabia
- Radu Calinescu
  - Department of Computer Science, University of York, Heslington, York YO10 5GH, United Kingdom
- Kevin Cowtan
  - Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom
11
Hoh SW, Burnley T, Cowtan K. Current approaches for automated model building into cryo-EM maps using Buccaneer with CCP-EM. Acta Crystallogr D Struct Biol 2020; 76:531-541. [PMID: 32496215 PMCID: PMC7271950 DOI: 10.1107/s2059798320005513] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/10/2019] [Accepted: 04/20/2020] [Indexed: 11/11/2022]
Abstract
This work focuses on the use of the existing protein-model-building software Buccaneer to provide structural interpretation of electron cryo-microscopy (cryo-EM) maps. Buccaneer was originally developed for application to X-ray crystallography, and the steps necessary to optimise its use with cryo-EM maps are shown. This approach has been applied to the data sets of 208 cryo-EM maps with resolutions of better than 4 Å. The results obtained also show an evident improvement in the sequencing step when the initial reference map and model used for crystallographic cases are replaced by a cryo-EM reference. All other necessary changes to settings in Buccaneer are implemented in the model-building pipeline from within the CCP-EM interface (as of version 1.4.0).
Affiliation(s)
- Soon Wen Hoh
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Tom Burnley
  - Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Kevin Cowtan
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
12
Alharbi E, Bond PS, Calinescu R, Cowtan K. Comparison of automated crystallographic model-building pipelines. Acta Crystallogr D Struct Biol 2019; 75:1119-1128. [PMID: 31793905 DOI: 10.1107/s2059798319014918] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/14/2019] [Accepted: 11/04/2019] [Indexed: 11/10/2022]
Abstract
A comparison of four protein model-building pipelines (ARP/wARP, Buccaneer, PHENIX AutoBuild and SHELXE) was performed using data sets from 202 experimentally phased cases, both with the data as observed and truncated to simulate lower resolutions. All pipelines were run using default parameters. Additionally, an ARP/wARP run was completed using models from Buccaneer. All pipelines achieved nearly complete protein structures and low Rwork/Rfree at resolutions between 1.2 and 1.9 Å, with PHENIX AutoBuild and ARP/wARP producing slightly lower R factors. At lower resolutions, Buccaneer leads to significantly more complete models.
Affiliation(s)
- Emad Alharbi: Department of Computer Science, University of York, Heslington, York YO10 5GH, England
- Paul S Bond: Department of Chemistry, University of York, Heslington, York YO10 5DD, England
- Radu Calinescu: Department of Computer Science, University of York, Heslington, York YO10 5GH, England
- Kevin Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England
13
Cowtan K, Agirre J, Metcalfe S. Shift fields: a new approach to refinement using non-atomic parameterizations. Acta Crystallogr A Found Adv 2018. [DOI: 10.1107/s2053273318093026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
14
Abstract
Refinement is a critical step in the determination of a model which explains the crystallographic observations and thus best accounts for the missing phase components. The scattering density is usually described in terms of atomic parameters; however, in macromolecular crystallography the resolution of the data is generally insufficient to determine the values of these parameters for individual atoms. Stereochemical and geometric restraints are used to provide additional information, but produce interrelationships between parameters which slow convergence, resulting in longer refinement times. An alternative approach is proposed in which parameters are not attached to atoms, but to regions of the electron-density map. These parameters can move the density or change the local temperature factor to better explain the structure factors. Varying the size of the region which determines the parameters at a particular position in the map allows the method to be applied at different resolutions without the use of restraints. Potential applications include initial refinement of molecular-replacement models with domain motions, and potentially the use of electron density from other sources such as electron cryo-microscopy (cryo-EM) as the refinement model.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, York, England
- Jon Agirre: Department of Chemistry, University of York, York, England
15
Potterton L, Agirre J, Ballard C, Cowtan K, Dodson E, Evans PR, Jenkins HT, Keegan R, Krissinel E, Stevenson K, Lebedev A, McNicholas SJ, Nicholls RA, Noble M, Pannu NS, Roth C, Sheldrick G, Skubak P, Turkenburg J, Uski V, von Delft F, Waterman D, Wilson K, Winn M, Wojdyr M. CCP4i2: the new graphical user interface to the CCP4 program suite. Acta Crystallogr D Struct Biol 2018; 74:68-84. [PMID: 29533233 PMCID: PMC5947771 DOI: 10.1107/s2059798317016035] [Citation(s) in RCA: 307] [Impact Index Per Article: 51.2] [Received: 08/01/2017] [Accepted: 11/06/2017] [Indexed: 11/14/2022]
Abstract
The CCP4 (Collaborative Computational Project, Number 4) software suite for macromolecular structure determination by X-ray crystallography brings together many programs and libraries that, by means of well established conventions, interoperate effectively without adhering to strict design guidelines. Because of this inherent flexibility, users are often presented with diverse, even divergent, choices for solving every type of problem. Recently, CCP4 introduced CCP4i2, a modern graphical interface designed to help structural biologists to navigate the process of structure determination, with an emphasis on pipelining and the streamlined presentation of results. In addition, CCP4i2 provides a framework for writing structure-solution scripts that can be built up incrementally to create increasingly automatic procedures.
Affiliation(s)
- Liz Potterton: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Jon Agirre: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Charles Ballard: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Kevin Cowtan: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Eleanor Dodson: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Phil R. Evans: MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Huw T. Jenkins: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Ronan Keegan: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Eugene Krissinel: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Kyle Stevenson: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Andrey Lebedev: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Stuart J. McNicholas: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Robert A. Nicholls: MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
- Martin Noble: University of Newcastle upon Tyne, Northern Institute for Cancer Research, Framlington Place, Newcastle upon Tyne NE2 4HH, England
- Navraj S. Pannu: Biophysical Structural Chemistry, Leiden University, PO Box 9502, 2300 RA Leiden, The Netherlands
- Christian Roth: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- George Sheldrick: Department of Structural Chemistry, Georg-August-Universität Göttingen, Tammannstrasse 4, 37077 Göttingen, Germany
- Pavol Skubak: Biophysical Structural Chemistry, Leiden University, PO Box 9502, 2300 RA Leiden, The Netherlands
- Johan Turkenburg: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Ville Uski: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Frank von Delft: Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, England; Diamond Light Source Ltd, Harwell Science and Innovation Campus, Didcot OX11 0QX, England
- David Waterman: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Keith Wilson: York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, England
- Martyn Winn: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
- Marcin Wojdyr: STFC Rutherford Appleton Laboratory, Chilton, Didcot OX11 0QX, England
16
McNicholas S, Croll T, Burnley T, Palmer CM, Hoh SW, Jenkins HT, Dodson E, Cowtan K, Agirre J. Automating tasks in protein structure determination with the clipper python module. Protein Sci 2018; 27:207-216. [PMID: 28901669 PMCID: PMC5734304 DOI: 10.1002/pro.3299] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/13/2017] [Revised: 09/08/2017] [Accepted: 09/11/2017] [Indexed: 11/06/2022]
Abstract
Scripting programming languages provide the fastest means of prototyping complex functionality. Those with a syntax and grammar resembling human language also greatly enhance the maintainability of the produced source code. Furthermore, the combination of a powerful, machine-independent scripting language with binary libraries tailored for each computer architecture allows programs to break free from the tight boundaries of efficiency traditionally associated with scripts. In the present work, we describe how an efficient C++ crystallographic library such as Clipper can be wrapped, adapted and generalized for use in both crystallographic and electron cryo-microscopy applications, scripted with the Python language. We shall also place an emphasis on best practices in automation, illustrating how this can be achieved with this new Python module.
Affiliation(s)
- Stuart McNicholas: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
- Tristan Croll: Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Tom Burnley: STFC Rutherford Appleton Laboratory OX11 0QX, Collaborative Computational Project for Electron cryo-Microscopy (CCP-EM), United Kingdom
- Colin M. Palmer: STFC Rutherford Appleton Laboratory OX11 0QX, Collaborative Computational Project for Electron cryo-Microscopy (CCP-EM), United Kingdom
- Soon Wen Hoh: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
- Huw T. Jenkins: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
- Eleanor Dodson: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
- Kevin Cowtan: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
- Jon Agirre: Department of Chemistry, York Structural Biology Laboratory, The University of York, York YO10 5DD, United Kingdom
17
Cowtan K, Hausfather Z. Stay Out of Scientists' E-mails. Sci Am 2017; 316:12. [DOI: 10.1038/scientificamerican0417-12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
18
Hausfather Z, Cowtan K, Clarke DC, Jacobs P, Richardson M, Rohde R. Assessing recent warming using instrumentally homogeneous sea surface temperature records. Sci Adv 2017; 3:e1601207. [PMID: 28070556 PMCID: PMC5216687 DOI: 10.1126/sciadv.1601207] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/27/2016] [Accepted: 11/22/2016] [Indexed: 06/02/2023]
Abstract
Sea surface temperature (SST) records are subject to potential biases due to changing instrumentation and measurement practices. Significant differences exist between commonly used composite SST reconstructions from the National Oceanic and Atmospheric Administration's Extended Reconstruction Sea Surface Temperature (ERSST), the Hadley Centre SST data set (HadSST3), and the Japanese Meteorological Agency's Centennial Observation-Based Estimates of SSTs (COBE-SST) from 2003 to the present. The update from ERSST version 3b to version 4 resulted in an increase in the operational SST trend estimate during the last 19 years from 0.07° to 0.12°C per decade, indicating a higher rate of warming in recent years. We show that ERSST version 4 trends generally agree with largely independent, near-global, and instrumentally homogeneous SST measurements from floating buoys, Argo floats, and radiometer-based satellite measurements that have been developed and deployed during the past two decades. We find a large cooling bias in ERSST version 3b and smaller but significant cooling biases in HadSST3 and COBE-SST from 2003 to the present, with respect to most series examined. These results suggest that reported rates of SST warming in recent years have been underestimated in these three data sets.
Affiliation(s)
- Zeke Hausfather: Energy and Resources Group, University of California, Berkeley, Berkeley, CA 94720, USA; Berkeley Earth, Berkeley, CA 94705, USA
- Kevin Cowtan: Department of Chemistry, University of York, York, UK
- Peter Jacobs: Department of Environmental Science and Policy, George Mason University, Fairfax, VA 22030, USA
- Mark Richardson: NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA
19
Agirre J, Cowtan K. Refinement without a model. Acta Crystallogr A Found Adv 2016. [DOI: 10.1107/s2053273316099654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
20
Agirre J, Cowtan K. Deriving a chemical context for protein-bound monosaccharides. Acta Crystallogr A Found Adv 2015. [DOI: 10.1107/s2053273315097569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
21
Agirre J, Davies G, Wilson K, Cowtan K. Erratum: Carbohydrate anomalies in the PDB. Nat Chem Biol 2015; 11:532. [DOI: 10.1038/nchembio0715-532a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
22
23
Abstract
The crystallographic structure solution of nucleotides and nucleotide complexes is now commonplace. The resulting electron-density maps are often poorer than for proteins, and as a result interpretation in terms of an atomic model can require significant effort, particularly in the case of large structures. While model building can be performed automatically, as with proteins, the process is time-consuming, taking minutes to days depending on the software and the size of the structure. A method is presented for the automatic building of nucleotide chains into electron density which is fast enough to be used in interactive model-building software, with extended chain fragments built around the current view position in a fraction of a second. The speed of the method arises from the determination of the 'fingerprint' of the sugar and phosphate groups in terms of conserved high-density and low-density features, coupled with a highly efficient scoring algorithm. Use cases include the rapid evaluation of an initial electron-density map, addition of nucleotide fragments to prebuilt protein structures, and in favourable cases the completion of the structure while automated model-building software is still running. The method has been incorporated into the Coot software package.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, York YO1 5DD, England
24
Agirre J, Cowtan K. Validation of carbohydrate structures: not just nomenclature. Acta Crystallogr A Found Adv 2014. [DOI: 10.1107/s2053273314085180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Despite the key implications carbohydrates have in a multitude of pathological processes, a large number of the sugar-containing structures deposited into the Protein Data Bank (PDB) show nomenclature errors [1] that persist even after the remediation of the PDB archive [2]. Here we present the results from a systematic study of the conformation and ring distortion of cyclic carbohydrate models for which structure factors have been deposited into the PDB. These models have also been scored using a real-space correlation coefficient calculated between model and experimental electron density. The results have enabled us to produce a database of well-refined carbohydrate structures for use in the framework of an automated sugar-detecting software, to be announced shortly.
25
26
Cowtan K. Completion of autobuilt protein models using a database of protein fragments. Acta Crystallogr D Biol Crystallogr 2012; 68:328-35. [PMID: 22505253 PMCID: PMC3322592 DOI: 10.1107/s0907444911039655] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Received: 06/02/2011] [Accepted: 09/27/2011] [Indexed: 12/05/2022]
Abstract
Two developments in the process of automated protein model building in the Buccaneer software are presented. A general-purpose library for protein fragments of arbitrary size is described, with a highly optimized search method allowing the use of a larger database than in previous work. The problem of assembling an autobuilt model into complete chains is discussed. This involves the assembly of disconnected chain fragments into complete molecules and the use of the database of protein fragments in improving the model completeness. Assembly of fragments into molecules is a standard step in existing model-building software, but the methods have not received detailed discussion in the literature.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
27
Abstract
An introduction to the proceedings of the CCP4 study weekend is given.
Affiliation(s)
- Kevin Cowtan: Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York, United Kingdom
28
Abstract
Classical density-modification techniques (as opposed to statistical approaches) offer a computationally cheap method for improving phase estimates in order to provide a good electron-density map for model building. The rise of statistical methods has led to a shift in focus away from the classical approaches; as a result, some recent developments have not made their way into classical density-modification software. This paper describes the application of some recent techniques, most importantly the use of prior phase information in the likelihood estimation of phase errors within a classical density-modification framework. The resulting software gives significantly better results than comparable classical methods, while remaining nearly two orders of magnitude faster than statistical methods.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
29
Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 2010; 66:486-501. [PMID: 20383002 PMCID: PMC2852313 DOI: 10.1107/s0907444910007493] [Citation(s) in RCA: 19883] [Impact Index Per Article: 1420.2] [Received: 06/09/2009] [Accepted: 02/26/2010] [Indexed: 11/12/2022]
Abstract
Coot is a molecular-graphics application for model building and validation of biological macromolecules. The program displays electron-density maps and atomic models and allows model manipulations such as idealization, real-space refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations, rotamers and Ramachandran idealization. Furthermore, tools are provided for model validation as well as interfaces to external programs for refinement, validation and graphics. The software is designed to be easy to learn for novice users, which is achieved by ensuring that tools for common tasks are 'discoverable' through familiar user-interface elements (menus and toolbars) or by intuitive behaviour (mouse controls). Recent developments have focused on providing tools for expert users, with customisable key bindings, extensions and an extensive scripting interface. The software is under rapid development, but has already achieved very widespread use within the crystallographic community. The current state of the software is presented, with a description of the facilities available and of some of the underlying methods employed.
Affiliation(s)
- P Emsley: Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, England.
30
Cowtan K. New developments in model building with Buccaneer. Acta Crystallogr A 2009. [DOI: 10.1107/s0108767309099401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
31
Watts D, Cowtan K, Wilson J. Automated classification of crystallization experiments using wavelets and statistical texture characterization techniques. J Appl Crystallogr 2008. [DOI: 10.1107/s0021889807049308] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
A method is presented for the classification of protein crystallization images based on image decomposition using the wavelet transform. The distribution of wavelet coefficient values in each sub-band image is modelled by a generalized Gaussian distribution to provide discriminatory variables. These statistical descriptors, together with second-order statistics obtained from joint probability distributions, are used with learning vector quantization to classify protein crystallization images.
32
Abstract
Molecular replacement is a powerful tool for the location of large models using structure-factor magnitudes alone. When phase information is available, it becomes possible to locate smaller fragments of the structure ranging in size from a few atoms to a single domain. The calculation is demanding, requiring a six-dimensional rotation and translation search. A number of approaches to this problem have been developed and a selection of these is reviewed in this paper. The application of one of these techniques to the problem of automated model building is explored in more detail, with particular reference to the problem of sequencing a protein main-chain trace.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
33
Cowtan K. Automated model building at lower resolutions. Acta Crystallogr A 2007. [DOI: 10.1107/s0108767307098248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
34
Abstract
A new technique for the automated tracing of protein chains in experimental electron-density maps is described. The technique relies on the repeated application of an oriented electron-density likelihood target function to identify likely Cα positions. This function is applied both in the location of a few promising 'seed' positions in the map and to grow those initial Cα positions into extended chain fragments. Techniques for assembling the chain fragments into an initial chain trace are discussed.
Affiliation(s)
- Kevin Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
35
36
Abstract
A method for the weighting of structure factors from an incomplete and inaccurate model is described which relies on the fitting of smooth spline functions of resolution. The use of smooth spline functions avoids the problems of discontinuities introduced when performing calculations in resolution shells. The complexity of the functions to be fit may be varied by changing the number of spline parameters. This approach is used to investigate the stability of the problem when data are limited.
37
Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 2004; 60:2126-32. [PMID: 15572765 DOI: 10.1107/s0907444904019158] [Citation(s) in RCA: 24371] [Impact Index Per Article: 1218.6] [Received: 02/26/2004] [Accepted: 08/04/2004] [Indexed: 11/10/2022]
Abstract
CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics. The map-fitting tools are available as a stand-alone package, distributed as 'Coot'.
Affiliation(s)
- Paul Emsley: York Structural Biology Laboratory, University of York, Heslington, York YO10 5YW, England.
38
Potterton L, McNicholas S, Krissinel E, Gruber J, Cowtan K, Emsley P, Murshudov GN, Cohen S, Perrakis A, Noble M. Developments in the CCP4 molecular-graphics project. Acta Crystallogr D Biol Crystallogr 2004; 60:2288-94. [PMID: 15572783 DOI: 10.1107/s0907444904023716] [Citation(s) in RCA: 492] [Impact Index Per Article: 24.6] [Received: 04/21/2004] [Accepted: 09/22/2004] [Indexed: 11/11/2022]
Abstract
Progress towards structure determination that is both high-throughput and high-value is dependent on the development of integrated and automatic tools for electron-density map interpretation and for the analysis of the resulting atomic models. Advances in map-interpretation algorithms are extending the resolution regime in which fully automatic tools can work reliably, but at present human intervention is required to interpret poor regions of macromolecular electron density, particularly where crystallographic data are only available to modest resolution [for example, I/σ(I) < 2.0 for minimum resolution 2.5 Å]. In such cases, a set of manual and semi-manual model-building molecular-graphics tools is needed. At the same time, converting the knowledge encapsulated in a molecular structure into understanding is dependent upon visualization tools, which must be able to communicate that understanding to others by means of both static and dynamic representations. CCP4mg is a program designed to meet these needs in a way that is closely integrated with the ongoing development of CCP4 as a program suite suitable for both low- and high-intervention computational structural biology. As well as providing a carefully designed user interface to advanced algorithms of model building and analysis, CCP4mg is intended to present a graphical toolkit to developers of novel algorithms in these fields.
Affiliation(s)
- Liz Potterton: Department of Chemistry, University of York, England.
39
Cowtan K. An Overview of Some Developments in Crystallographic Computing Methods Worldwide. Crystallogr Rev 2003. [DOI: 10.1080/0889311031000069326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
40
Abstract
A generalized approach is described for evaluating arbitrary functions of position in reciprocal space. This is a generalization which subsumes a whole range of calculations that form a part of almost every crystallographic software application. Examples include scaling of structure factors, the calculation of structure-factor statistics, and some simple likelihood calculations for a single parameter. The generalized approach has a number of advantages: all these calculations may now be performed by a single software routine which need only be debugged and optimized once; the existing approach of dividing reciprocal space into resolution shells with discontinuities at the boundaries is no longer necessary; the implementation provided makes employing the new functionality extremely simple and concise. The calculation is split into three standard components, for which a number of implementations are provided for different tasks. A 'basis function' describes some function of position in reciprocal space, the shape of which is determined by a small number of parameters. A 'target function' describes the property for which a functional representation is required, for example ⟨|F|²⟩. An 'evaluator' takes a basis and target function and optimizes the parameters of the basis function to fit the target function. Ideally the components should be usable in any combination.
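The three-component split described in this abstract can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the names `polynomial_basis`, `f_squared_target`, `solve` and `evaluate`, the choice of a polynomial in s = 1/d² as the basis, and the use of an unweighted least-squares evaluator are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def polynomial_basis(n_params: int) -> Callable[[float], List[float]]:
    """Basis function (assumed form): terms 1, s, s^2, ... with s = 1/d^2."""
    return lambda s: [s ** k for k in range(n_params)]


def f_squared_target(f: float) -> float:
    """Target function: per-reflection |F|^2, whose smooth average is wanted."""
    return f * f


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def evaluate(
    basis: Callable[[float], List[float]],
    target: Callable[[float], float],
    s_values: Sequence[float],
    f_values: Sequence[float],
) -> List[float]:
    """Evaluator: least-squares fit of the basis parameters to the target.

    Accumulates the normal equations over all reflections, so no
    resolution shells (and no shell-boundary discontinuities) are needed.
    """
    n = len(basis(0.0))
    normal = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for s, f in zip(s_values, f_values):
        terms = basis(s)
        y = target(f)
        for i in range(n):
            rhs[i] += terms[i] * y
            for j in range(n):
                normal[i][j] += terms[i] * terms[j]
    return solve(normal, rhs)
```

Because the basis, target and evaluator are separate callables, a different basis (for example a spline) or a different target could be swapped in without touching the evaluator, which is the kind of free combination the abstract describes as the ideal.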
41
Potterton E, McNicholas S, Krissinel E, Cowtan K, Noble M. The CCP4 molecular-graphics project. Acta Crystallogr D Biol Crystallogr 2002; 58:1955-7. [PMID: 12393928 DOI: 10.1107/s0907444902015391] [Citation(s) in RCA: 183] [Impact Index Per Article: 8.3] [Received: 05/29/2002] [Accepted: 08/28/2002] [Indexed: 11/10/2022]
Abstract
This new package will provide easy-to-use access to crystallographic structure solution, model building and structure analysis. It will be possible for any developer to integrate scientific software into the system.
Affiliation(s)
- Elizabeth Potterton: Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
42
Abstract
Various approaches have been demonstrated for the automatic interpretation of crystallographic data in terms of atomic models. The use of a masked Fourier-based search function has some benefits for this task. The application and optimization of this procedure are discussed in detail. The search function also acquires a statistical significance when used with an appropriate electron-density target and weighting, giving rise to improved results at low resolutions. Methods are discussed for building a library of protein fragments suitable for use with this procedure. These methods are demonstrated with the construction of a statistical target for the identification of short helical fragments in the electron density.
Affiliation(s)
- K Cowtan: Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
43
Naismith J, Cowtan K, Ashton A. Molecular replacement and its relatives. Acta Crystallogr D Biol Crystallogr 2001. [DOI: 10.1107/s0907444901014056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
44
Cowtan K. General quadratic functions in real and reciprocal space and their application to likelihood phasing. Acta Crystallogr D Biol Crystallogr 2000; 56:1612-21. [PMID: 11092927 DOI: 10.1107/s0907444900013263] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.4] [Received: 06/26/2000] [Accepted: 09/26/2000] [Indexed: 11/10/2022]
Abstract
A general multivariate quadratic function of the structure factors is constructed and transformed to obtain a quadratic function of the continuous electron density. Two special cases, where structure factors are independent and where electron-density values are independent, are examined. These results are related to the new likelihood-based framework of Terwilliger [Terwilliger (1999), Acta Cryst. D55, pp. 1863-1871] for employing structural information which was previously exploited by means of conventional density-modification calculations. The treatment here involves different assumptions and highlights new features of Terwilliger's calculation. The generalized quadratic construction allows the generation of cross terms relating all reflections and electron densities. Other applications of this approach are considered.
Affiliation(s)
- K Cowtan
- Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
|
45
|
Cowtan K, Ten Eyck LF. Eigensystem analysis of the refinement of a small metalloprotein. Acta Crystallogr D Biol Crystallogr 2000; 56:842-56. [PMID: 10930831 DOI: 10.1107/s0907444900004856] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 05/01/1999] [Accepted: 03/30/2000] [Indexed: 11/11/2022]
Abstract
The eigenvalues and eigenvectors of the least-squares normal matrix for the full-matrix refinement problem contain a great deal of information about the quality of a model; in particular the precision of the model parameters and correlations between those parameters. They also allow the isolation of those parameters or combinations of parameters which are not determined by the available data. Since a protein refinement is usually under-determined without the application of geometric restraints, such indicators of the reliability of a model offer an important contribution to structural knowledge. Eigensystem analysis is applied to the normal matrices for the refinement of a small metalloprotein using two data sets and models determined at different resolutions. The eigenvalue spectra reveal considerable information about the conditioning of the problem as the resolution varies. In the case of a restrained refinement, it also provides information about the impact of various restraints on the refinement. Initial results support conclusions drawn from the free R factor. Examination of the eigenvectors provides information about which regions of the model are poorly determined. In the case of a restrained refinement, it is also possible to isolate places where X-ray and geometric restraints are in disagreement, usually indicating a problem in the model.
Affiliation(s)
- K Cowtan
- University of York, Heslington, York YO10 5DD, England
|
46
|
Abstract
With the rise of Bayesian methods in crystallography, the error estimates attached to estimated phases are becoming as important as the phase estimates themselves. Phase improvement by density modification can cause problems in this environment because the quality of the resulting phases is usually overestimated. This problem is addressed by an extension of the gamma correction [Abrahams (1997). Acta Cryst. D53, 371-376] to arbitrary density-modification techniques. The degree to which the improved phases are biased by the features of the initial map is investigated in order to determine the limits of the resulting procedure and the quality of the phase-error estimates.
Affiliation(s)
- K Cowtan
- University of York, Heslington, York YO1 5DD, England.
|
47
|
Abstract
Direct methods at high resolution have depended on the resolution of atomic-like features in the map. At data resolutions more typical for protein structures (2-3 Å) individual atoms may not be resolved, so larger features must be identified. At one extreme the whole molecule may be located using the diffraction magnitudes alone by the molecular-replacement method. At the other extreme it is possible to locate individual residues in a well-phased map. In this paper an intermediate problem is addressed: the location of multi-residue fragments on the basis of weak phase information. An agreement function based on the mean-squared difference between model and map over a masked region is shown to be more effective than a simple overlap integral, and may be efficiently calculated by Fourier methods. The techniques are compared using poorly phased electron-density maps at approximately 3 Å for the proteins RNAse and O6-methylguanine-DNA-methyltransferase.
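As an illustration only, and not the implementation from the paper, the following one-dimensional Python sketch shows why a masked mean-squared difference can be evaluated by Fourier methods. Expanding the square for a translation t gives a constant term plus two circular correlations, each of which is a product in reciprocal space. The function names, the periodic 1D grid and the plain O(N²) DFT are all assumptions made for this example; a practical program would use an FFT library on a 3D map.

```python
import cmath


def dft(values, inverse=False):
    """Plain O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            v * cmath.exp(sign * 2j * cmath.pi * k * x / n)
            for x, v in enumerate(values)
        )
        out.append(total / n if inverse else total)
    return out


def circular_correlation(a, b):
    """c(t) = sum_x a(x) * b(x + t) on a periodic grid, for real a and b."""
    fa, fb = dft(a), dft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [c.real for c in dft(product, inverse=True)]


def masked_msd_fourier(fragment, mask, density):
    """Masked mean-squared difference for every translation t.

    sum_x m(x) [f(x) - rho(x+t)]^2
        = sum_x m f^2 - 2 corr(m*f, rho)(t) + corr(m, rho^2)(t)
    """
    n_mask = sum(mask)
    masked_fragment = [m * f for m, f in zip(mask, fragment)]
    constant = sum(m * f * f for m, f in zip(mask, fragment))
    cross = circular_correlation(masked_fragment, density)
    square = circular_correlation(mask, [d * d for d in density])
    return [(constant - 2 * c + s) / n_mask for c, s in zip(cross, square)]


def masked_msd_direct(fragment, mask, density):
    """Reference calculation in real space, used to check the Fourier form."""
    n = len(density)
    n_mask = sum(mask)
    return [
        sum(mask[x] * (fragment[x] - density[(x + t) % n]) ** 2 for x in range(n))
        / n_mask
        for t in range(n)
    ]


# Hypothetical toy data: a three-point fragment hidden in a noisy 1D "map".
density = [0.1, 0.0, 0.2, 1.0, 2.0, 1.0, 0.1, 0.0]
fragment = [1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mask = [1, 1, 1, 0, 0, 0, 0, 0]

scores = masked_msd_fourier(fragment, mask, density)
best_shift = min(range(len(scores)), key=scores.__getitem__)
print("best translation:", best_shift)
```

In this toy case the lowest score falls at the translation where the fragment coincides with the matching feature in the map, and the Fourier and real-space calculations agree to within rounding error.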
Affiliation(s)
- K Cowtan
- Department of Chemistry, University of York, Heslington, York YO1 5DD, England.
|
48
|
Abstract
Various algorithms are described, developed for the dm density modification package, which have not been described elsewhere. Methods are described for the following problems: determination of the absolute scale and overall temperature factor of a data set, by a method which is less dependent on data resolution than Wilson statistics; an efficient interpolation algorithm for averaging and its application to refinement of averaging operators; a method for the automatic determination of averaging masks.
Affiliation(s)
- K Cowtan
- University of York, Heslington, York YO1 5DD, England.
|
49
|
Affiliation(s)
- K Y Zhang
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
|
50
|
|