1
|
The amounts of thermal vibrations and static disorder in protein X-ray crystallographic B-factors. Proteins 2021; 89:1442-1457. [PMID: 34174110 DOI: 10.1002/prot.26165] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/21/2020] [Revised: 05/31/2021] [Accepted: 06/06/2021] [Indexed: 12/20/2022]
Abstract
Crystallographic B-factors provide direct dynamical information on the internal mobility of proteins that is closely linked to function, and are also widely used as a benchmark in assessing elastic network models. A significant question in the field is: what is the exact amount of thermal vibrations in protein crystallographic B-factors? This work sets out to answer this question. First, we carry out a thorough, statistically sound analysis of crystallographic B-factors of over 10 000 structures. Second, by employing a highly accurate all-atom model based on the well-known CHARMM force field, we obtain computationally the magnitudes of thermal vibrations of nearly 1000 structures. Our key findings are: (i) the magnitude of thermal vibrations, surprisingly, is nearly protein-independent, as a corollary to the universality for the vibrational spectra of globular proteins established earlier; (ii) the magnitude of thermal vibrations is small, less than 0.1 Å² at 100 K; (iii) the percentage of thermal vibrations in B-factors is the lowest at low resolution and low temperature (<10%) but increases to as high as 60% for structures determined at high resolution and at room temperature. The significance of this work is that it provides for the first time, using an extremely large dataset, a thorough analysis of B-factors and their thermal and static disorder components. The results clearly demonstrate that structures determined at high resolution and at room temperature have the richest dynamics information. Since such structures are relatively rare in the PDB, the work naturally calls for more such structures to be determined experimentally.
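As a rough illustration of the quantities discussed in this abstract, the sketch below converts between an isotropic B-factor and a mean-square displacement using the standard relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction, and then estimates the thermal share of an observed B-factor. It is not the authors' analysis: the function names and example values are hypothetical, the split assumes that thermal and static contributions simply add, and the abstract does not say which displacement convention its 0.1 Å² figure refers to.

```python
import math

EIGHT_PI_SQUARED = 8.0 * math.pi ** 2


def b_to_msd(b_factor):
    """Isotropic B-factor (Å^2) -> mean-square displacement along one axis (Å^2)."""
    return b_factor / EIGHT_PI_SQUARED


def msd_to_b(msd_1d):
    """Mean-square displacement along one axis (Å^2) -> isotropic B-factor (Å^2)."""
    return EIGHT_PI_SQUARED * msd_1d


def thermal_fraction(b_observed, b_thermal):
    """Share of the observed B-factor attributed to thermal vibrations.

    Assumption (not from the abstract): B_observed = B_thermal + B_static.
    """
    if b_observed <= 0.0:
        raise ValueError("observed B-factor must be positive")
    return b_thermal / b_observed


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the functions.
    b_obs = 20.0                   # Å^2, made-up observed B-factor
    b_therm = msd_to_b(0.05)       # Å^2, made-up thermal displacement of 0.05 Å^2
    print(f"thermal share: {thermal_fraction(b_obs, b_therm):.1%}")
```

With these made-up inputs the thermal contribution is a minority of the observed B-factor, which is the kind of ratio the abstract reports for low-temperature structures.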
|
2
|
|
3
|
Computational Science in the Battle Against COVID-19. Comput Sci Eng 2020. [DOI: 10.1109/mcse.2020.3025398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
4
|
|
5
|
|
6
|
|
7
|
Memory effects in a random walk description of protein structure ensembles. J Chem Phys 2019; 150:064911. [DOI: 10.1063/1.5054887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
8
|
Verifiability in computer-aided research: the role of digital scientific notations at the human-computer interface. PeerJ Comput Sci 2018; 4:e158. [PMID: 33816811 PMCID: PMC7924627 DOI: 10.7717/peerj-cs.158] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/05/2018] [Accepted: 06/21/2018] [Indexed: 06/12/2023]
Abstract
Most of today's scientific research relies on computers and software for processing scientific information. Examples of such computer-aided research are the analysis of experimental data or the simulation of phenomena based on theoretical models. With the rapid increase of computational power, scientific software has integrated more and more complex scientific knowledge in a black-box fashion. As a consequence, its users do not know, and do not even have a chance of finding out, which assumptions and approximations their computations are based on. This black-box nature of scientific software has made the verification of much computer-aided research close to impossible. The present work starts with an analysis of this situation from the point of view of human-computer interaction in scientific research. It identifies the key role of digital scientific notations at the human-computer interface, reviews the most popular ones in use today, and describes a proof-of-concept implementation of Leibniz, a language designed as a verifiable digital scientific notation for models formulated as mathematical equations.
|
9
|
|
10
|
Code reviewing puts extra demands on referees. Nature 2018; 556:309. [PMID: 29670274 DOI: 10.1038/d41586-018-04628-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
11
|
|
12
|
Sustainable computational science: the ReScience initiative. PeerJ Comput Sci 2017; 3:e142. [PMID: 34722870 PMCID: PMC8530091 DOI: 10.7717/peerj-cs.142] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 10/05/2017] [Accepted: 11/15/2017] [Indexed: 05/30/2023]
Abstract
Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; however, computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. James Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer-reviews. Existing journals have been slow to adapt: source codes are rarely requested and are hardly ever actually executed to check that they produce the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from other traditional scientific journals. ReScience resides on GitHub where each new implementation of a computational study is made available together with comments, explanations, and software tests.
|
13
|
|
14
|
Communication: A multiscale Bayesian inference approach to analyzing subdiffusion in particle trajectories. J Chem Phys 2017; 145:151101. [PMID: 27782457 DOI: 10.1063/1.4965881] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
Anomalous diffusion is characterized by its asymptotic behavior for t → ∞. This makes it difficult to detect and describe in particle trajectories from experiments or computer simulations, which are necessarily of finite length. We propose a new approach using Bayesian inference applied directly to the observed trajectories sampled at different time scales. We illustrate the performance of this approach using random trajectories with known statistical properties and then use it for analyzing the motion of lipid molecules in the plane of a lipid bilayer.
|
15
|
|
16
|
|
17
|
|
18
|
The Approximation Tower in Computational Science: Why Testing Scientific Software Is Difficult. Comput Sci Eng 2015. [DOI: 10.1109/mcse.2015.75] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
|
19
|
Protein secondary-structure description with a coarse-grained model. Acta Crystallogr D Biol Crystallogr 2015; 71:1411-22. [DOI: 10.1107/s1399004715007191] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/01/2014] [Accepted: 04/10/2015] [Indexed: 01/25/2023]
Abstract
A coarse-grained geometrical model for protein secondary-structure description and analysis is presented which uses only the positions of the Cα atoms. A space curve connecting these positions by piecewise polynomial interpolation is constructed and the folding of the protein backbone is described by a succession of screw motions linking the Frenet frames at consecutive Cα positions. Using the ASTRAL subset of the SCOPe database of protein structures, thresholds are derived for the screw parameters of secondary-structure elements and it is demonstrated that the latter can be reliably assigned on the basis of a Cα model. For this purpose, a comparative study with the widely used DSSP (Define Secondary Structure of Proteins) algorithm was performed and it was shown that the parameter distribution corresponding to the ensemble of all pure Cα structures in the RCSB Protein Data Bank matches that of the ASTRAL database. It is expected that this approach will be useful in the development of structure-refinement techniques for low-resolution data.
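To make the idea of Frenet frames along a Cα trace concrete, here is a minimal sketch that builds a tangent, normal, and binormal at each interior Cα position. It swaps in plain finite differences for the piecewise polynomial interpolation described in the abstract, and it stops short of the screw motions that link consecutive frames. All function names are hypothetical, and the ideal-helix parameters (radius 2.3 Å, rise 1.5 Å, 100° per residue) are illustrative values, not taken from the paper.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector (collinear points?)")
    return tuple(x / norm for x in a)


def discrete_frenet_frames(points):
    """Return (tangent, normal, binormal) for each interior point.

    Finite-difference stand-in for a frame on an interpolated curve:
    central first difference for the tangent, second difference for
    the direction of curvature.
    """
    frames = []
    for i in range(1, len(points) - 1):
        prev_pt, here, next_pt = points[i - 1], points[i], points[i + 1]
        d1 = sub(next_pt, prev_pt)
        d2 = tuple(p + n - 2.0 * c for p, c, n in zip(prev_pt, here, next_pt))
        tangent = normalize(d1)
        binormal = normalize(cross(d1, d2))
        normal = cross(binormal, tangent)  # completes the right-handed frame
        frames.append((tangent, normal, binormal))
    return frames


def ideal_helix(n_residues, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha positions of an idealized helix (illustrative parameters)."""
    turn = math.radians(turn_deg)
    return [(radius * math.cos(i * turn), radius * math.sin(i * turn), i * rise)
            for i in range(n_residues)]


if __name__ == "__main__":
    for tangent, normal, binormal in discrete_frenet_frames(ideal_helix(6)):
        print([round(x, 3) for x in tangent])
```

On a regular helix every frame is related to the next by the same rigid-body motion, which is why a single set of screw parameters can characterize a secondary-structure element.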
|
20
|
|
21
|
Abstract
The lack of replicability and reproducibility of scientific studies based on computational methods has led to serious mistakes in published scientific findings, some of which have been discovered and publicized recently. Many strategies are currently pursued to improve the situation. This article reports the first conclusions from the ActivePapers project, whose goal is the development and application of a computational platform that allows the publication of computational research in a form that enables installation-free deployment, encourages reuse, and permits the full integration of datasets and software into the scientific record. The main finding is that these goals can be achieved with existing technology, but that there is no straightforward way to adapt legacy software to such a framework.
|
22
|
Construction and validation of an atomic model for bacterial TSPO from electron microscopy density, evolutionary constraints, and biochemical and biophysical data. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2014; 1848:568-80. [PMID: 25450341 DOI: 10.1016/j.bbamem.2014.10.028] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 08/01/2014] [Revised: 10/01/2014] [Accepted: 10/20/2014] [Indexed: 11/30/2022]
Abstract
The 18 kDa protein TSPO is a highly conserved transmembrane protein found in bacteria, yeast, animals and plants. TSPO is involved in a wide range of physiological functions, including the transport of several molecules. The atomic structure of monomeric ligand-bound mouse TSPO in detergent has been published recently. A previously published low-resolution structure of Rhodobacter sphaeroides TSPO, obtained from tubular crystals with lipids and observed in cryo-electron microscopy, revealed an oligomeric structure without any ligand. We analyze this electron microscopy density in view of available biochemical and biophysical data, building a matching atomic model for the monomer and then the entire crystal. We compare its intra- and inter-molecular contacts with those predicted by amino acid covariation in TSPO proteins from evolutionary sequence analysis. The arrangement of the five transmembrane helices in a monomer of our model is different from that observed for the mouse TSPO. We analyze possible ligand binding sites for protoporphyrin, for the high-affinity ligand PK 11195, and for cholesterol in TSPO monomers and/or oligomers, and we discuss possible functional implications.
|
23
|
Abstract
Computational techniques have revolutionized many aspects of scientific research over the last few decades. Experimentalists use computation for data analysis, processing ever bigger data sets. Theoreticians compute predictions from ever more complex models. However, traditional articles do not permit the publication of big data sets or complex models. As a consequence, these crucial pieces of information no longer enter the scientific record. Moreover, they have become prisoners of scientific software: many models exist only as software implementations, and the data are often stored in proprietary formats defined by the software. In this article, I argue that this emphasis on software tools over models and data is detrimental to science in the long term, and I propose a means by which this can be reversed.
|
24
|
Benchmarking Collective Motion Predictions of Elastic Network Models. Biophys J 2014. [DOI: 10.1016/j.bpj.2013.11.2618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
25
|
|
26
|
Evaluation of Protein Elastic Network Models Based on an Analysis of Collective Motions. J Chem Theory Comput 2013; 9:5618-28. [PMID: 26592296 DOI: 10.1021/ct400399x] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 01/23/2023]
Abstract
Elastic network models (ENMs) are valuable tools for investigating collective motions of proteins, and a rich variety of simple models have been proposed over the past decade. A good representation of the collective motions requires a good approximation of the covariances between the fluctuations of the individual atoms. Nevertheless, most studies have validated such models only by the magnitudes of the single-atom fluctuations they predict. In the present study, we have quantified the agreement between the covariance structure predicted by molecular dynamics (MD) simulations and those predicted by a representative selection of proposed coarse-grained ENMs. We then contrast this approach with the comparison to MD-predicted atomic fluctuations and comparison to crystallographic B-factors. While all the ENMs yield approximations to the MD-predicted covariance structure, we report large and consistent differences between proposed models. We also find that the ability of the ENMs to predict atomic fluctuations is correlated with their ability to capture the covariance structure. In contrast, we find that the models that agree best with B-factors model collective motions less reliably and recommend against using B-factors as a benchmark.
|
27
|
|
28
|
A comparison of reduced coordinate sets for describing protein structure. J Chem Phys 2013; 139:124115. [DOI: 10.1063/1.4821598] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Indexed: 12/19/2022]
|
29
|
|
30
|
|
31
|
|
32
|
|
33
|
Structure and Charge-State Dependence of the Gas-Phase Ionization Energy of Proteins. Angew Chem Int Ed Engl 2012; 51:9552-6. [DOI: 10.1002/anie.201204435] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 06/07/2012] [Indexed: 11/09/2022]
|
34
|
Structure and Charge-State Dependence of the Gas-Phase Ionization Energy of Proteins. Angew Chem Int Ed Engl 2012. [DOI: 10.1002/ange.201204435] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
|
35
|
A path-integral Langevin equation treatment of low-temperature doped helium clusters. J Chem Phys 2012; 136:224309. [DOI: 10.1063/1.4726507] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
|
36
|
nMoldyn 3: Using task farming for a parallel spectroscopy-oriented analysis of molecular dynamics simulations. J Comput Chem 2012; 33:2043-8. [DOI: 10.1002/jcc.23035] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 04/13/2012] [Revised: 05/11/2012] [Accepted: 05/19/2012] [Indexed: 11/06/2022]
|
37
|
Communication: A minimal model for the diffusion-relaxation backbone dynamics of proteins. J Chem Phys 2012; 136:191101. [DOI: 10.1063/1.4718380] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
|
38
|
Managing State. Comput Sci Eng 2012. [DOI: 10.1109/mcse.2012.11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
|
39
|
Least constraint approach to the extraction of internal motions from molecular dynamics trajectories of flexible macromolecules. J Chem Phys 2011; 135:084110. [DOI: 10.1063/1.3626275] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
|
40
|
nMoldyn - Interfacing spectroscopic experiments, molecular dynamics simulations and models for time correlation functions. Collection SFN 2011. [DOI: 10.1051/sfn/201112010] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/15/2022]
|
41
|
|
42
|
|
43
|
|
44
|
Abstract
Electron microscopy (EM) has made it possible to solve the structure of many proteins. However, the resolution of some of the EM maps is too low for interpretation at the atomic level, which is particularly important to describe function. We describe methods that combine low-resolution EM data with atomic structures for different conformations of the same protein in order to produce atomic models compatible with the EM map. We illustrate these methods with EM data from decavanadate-induced tubular crystals of a pseudo-phosphorylated intermediate of Ca-ATPase and the various atomic structures of other intermediates available in the Protein Data Bank (PDB). Determination of the atomic structure makes it possible not only to analyse protein-protein interactions in the crystals, but also to localize residues in the proximity of the crystallizing agent both within Ca-ATPase and between Ca-ATPase molecules.
|
45
|
|
46
|
Quantitative model for the heterogeneity of atomic position fluctuations in proteins: a simulation study. J Chem Phys 2009; 131:045104. [PMID: 19655925 DOI: 10.1063/1.3170941] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Abstract
We propose a simple analytical model for the elastic incoherent structure factor of proteins measured by neutron scattering, which allows extracting the distribution of atomic position fluctuations from a fit of the model to the experimental data. The method is validated by applying it to elastic incoherent structure factors of lysozyme which have been obtained by molecular dynamics simulation and by normal mode analysis, respectively, and for which distributions of the atomic position fluctuations can be generated numerically for direct comparison with the predictions of the model. The comparison shows a remarkable agreement, in particular, concerning the lower limit for the position fluctuations, which is pronounced in the numerical data.
|
47
|
|
48
|
Relaxation dynamics of lysozyme in solution under pressure: Combining molecular dynamics simulations and quasielastic neutron scattering. Chem Phys 2008. [DOI: 10.1016/j.chemphys.2007.07.018] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 10/23/2022]
|
49
|
Abstract
MOTIVATION: In the study of the structural flexibility of proteins, crystallographic Debye-Waller factors are the most important experimental information used in the calibration and validation of computational models, such as the very successful elastic network models (ENMs). However, these models are applied to single protein molecules, whereas the experiments are performed on crystals. Moreover, the energy scale in standard ENMs is undefined and must be obtained by fitting to the same data that the ENM is trying to predict, reducing the predictive power of the model. RESULTS: We develop an elastic network model for the whole protein crystal in order to study the influence of crystal packing and lattice vibrations on the thermal fluctuations of the atom positions. We use experimental values for the compressibility of the crystal to establish the energy scale of our model. We predict the elastic constants of the crystal and compare with experimental data. Our main findings are (1) crystal packing modifies the atomic fluctuations considerably and (2) thermal fluctuations are not the dominant contribution to crystallographic Debye-Waller factors. AVAILABILITY: The programs developed for this work are available as supplementary material at Bioinformatics Online.
|
50
|
|