1
|
Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
|
2
|
Zhang L, Ma H, Qian W, Li H. Protein structure optimization using improved simulated annealing algorithm on a three-dimensional AB off-lattice model. Comput Biol Chem 2020; 85:107237. [PMID: 32109854 DOI: 10.1016/j.compbiolchem.2020.107237] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/03/2018] [Revised: 02/11/2020] [Accepted: 02/15/2020] [Indexed: 01/01/2023]
Abstract
This paper proposed an improved simulated annealing (ISA) algorithm for protein structure optimization based on a three-dimensional AB off-lattice model. In the algorithm, we provided a general formula used for producing initial solution, and designed a multivariable disturbance term, relating to the parameters of simulated annealing and a tuned constant, to generate neighborhood solution. To avoid missing optimal solution, storage operation was performed in searching process. We applied the algorithm to test artificial protein sequences from literature and constructed a benchmark dataset consisting of 10 real protein sequences from the Protein Data Bank (PDB). Otherwise, we generated Cα space-filling model to represent protein folding conformation. The results indicate our algorithm outperforms the five methods before in searching lower energies of artificial protein sequences. In the testing on real proteins, our method can achieve the energy conformations with Cα-RMSD less than 3.0 Å from the PDB structures. Moreover, Cα space-filling model may simulate dynamic change of protein folding conformation at atomic level.
Affiliation(s)
- Lizhong Zhang
  - College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, China
- He Ma
  - College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education, Shenyang 110169, China.
- Wei Qian
  - Department of Electrical and Computer Engineering, College of Engineering, University of Texas, El Paso TX 79968, USA
- Haiyan Li
  - College of Pharmaceutical and Bioengineering, Shenyang University of Chemical Technology, Shenyang 110142, China
|
3
|
Svensson O, Gilski M, Nurizzo D, Bowler MW. A comparative anatomy of protein crystals: lessons from the automatic processing of 56 000 samples. IUCRJ 2019; 6:822-831. [PMID: 31576216 PMCID: PMC6760449 DOI: 10.1107/s2052252519008017] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/06/2019] [Accepted: 06/04/2019] [Indexed: 05/12/2023]
Abstract
The fully automatic processing of crystals of macromolecules has presented a unique opportunity to gather information on the samples that is not usually recorded. This has proved invaluable in improving sample-location, characterization and data-collection algorithms. After operating for four years, MASSIF-1 has now processed over 56 000 samples, gathering information at each stage, from the volume of the crystal to the unit-cell dimensions, the space group, the quality of the data collected and the reasoning behind the decisions made in data collection. This provides an unprecedented opportunity to analyse these data together, providing a detailed landscape of macromolecular crystals, intimate details of their contents and, importantly, how the two are related. The data show that mosaic spread is unrelated to the size or shape of crystals and demonstrate experimentally that diffraction intensities scale in proportion to crystal volume and molecular weight. It is also shown that crystal volume scales inversely with molecular weight. The results set the scene for the development of X-ray crystallography in a changing environment for structural biology.
Affiliation(s)
- Olof Svensson
  - European Synchrotron Radiation Facility, 71 Avenue des Martyrs, CS 40220, F-38043 Grenoble, France
- Maciej Gilski
  - European Molecular Biology Laboratory, Grenoble Outstation, 71 Avenue des Martyrs, CS 90181, F-38042 Grenoble, France
- Didier Nurizzo
  - European Synchrotron Radiation Facility, 71 Avenue des Martyrs, CS 40220, F-38043 Grenoble, France
- Matthew W. Bowler
  - European Molecular Biology Laboratory, Grenoble Outstation, 71 Avenue des Martyrs, CS 90181, F-38042 Grenoble, France
|
4
|
Grabowski M, Langner KM, Cymborowski M, Porebski PJ, Sroka P, Zheng H, Cooper DR, Zimmerman MD, Elsliger MA, Burley SK, Minor W. A public database of macromolecular diffraction experiments. Acta Crystallogr D Struct Biol 2016; 72:1181-1193. [PMID: 27841751 PMCID: PMC5108346 DOI: 10.1107/s2059798316014716] [Citation(s) in RCA: 96] [Impact Index Per Article: 12.0] [Received: 06/23/2016] [Accepted: 09/17/2016] [Indexed: 12/28/2022]
Abstract
The low reproducibility of published experimental results in many scientific disciplines has recently garnered negative attention in scientific journals and the general media. Public transparency, including the availability of 'raw' experimental data, will help to address growing concerns regarding scientific integrity. Macromolecular X-ray crystallography has led the way in requiring the public dissemination of atomic coordinates and a wealth of experimental data, making the field one of the most reproducible in the biological sciences. However, there remains no mandate for public disclosure of the original diffraction data. The Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC) has been developed to archive raw data from diffraction experiments and, equally importantly, to provide related metadata. Currently, the database of our resource contains data from 2920 macromolecular diffraction experiments (5767 data sets), accounting for around 3% of all depositions in the Protein Data Bank (PDB), with their corresponding partially curated metadata. IRRMC utilizes distributed storage implemented using a federated architecture of many independent storage servers, which provides both scalability and sustainability. The resource, which is accessible via the web portal at http://www.proteindiffraction.org, can be searched using various criteria. All data are available for unrestricted access and download. The resource serves as a proof of concept and demonstrates the feasibility of archiving raw diffraction data and associated metadata from X-ray crystallographic studies of biological macromolecules. The goal is to expand this resource and include data sets that failed to yield X-ray structures in order to facilitate collaborative efforts that will improve protein structure-determination methods and to ensure the availability of 'orphan' data left behind for various reasons by individual investigators and/or extinct structural genomics projects.
Affiliation(s)
- Marek Grabowski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Karol M. Langner
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Marcin Cymborowski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Przemyslaw J. Porebski
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
  - Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30-239 Cracow, Poland
- Piotr Sroka
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Heping Zheng
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- David R. Cooper
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Matthew D. Zimmerman
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
- Marc-André Elsliger
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 90237, USA
- Stephen K. Burley
  - RCSB Protein Data Bank; Center for Integrative Proteomics Research; Institute for Quantitative Biomedicine; Rutgers Cancer Institute of New Jersey; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - San Diego Supercomputer Center and Skaggs School of Pharmacological Sciences, University of California, San Diego, La Jolla, CA 92093, USA
- Wladek Minor
  - Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22904, USA
|
5
|
Li DW, Brüschweiler R. Protocol To Make Protein NMR Structures Amenable to Stable Long Time Scale Molecular Dynamics Simulations. J Chem Theory Comput 2015; 10:1781-7. [PMID: 26580385 DOI: 10.1021/ct4010646] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 01/26/2023]
Abstract
A robust protocol for the treatment of NMR protein structures is presented that makes them amenable to long time scale molecular dynamics (MD) simulations that are stable. The protocol embeds an NMR structure in a native low energy region of the recently developed ff99SB_φψ(g24;CS) molecular mechanics force field. Extended MD trajectories that start from these structures show good consistency with proton-proton nuclear Overhauser effect data, and they reproduce NMR chemical shift data better than the original NMR structures as is demonstrated for four protein systems. Moreover, for all proteins studied here the simulations spontaneously approach the X-ray crystal structures, thereby improving the effective resolution of the initial structural models.
Affiliation(s)
- Da-Wei Li
  - Campus Chemical Instrument Center and Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States; Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, Florida 32306, United States
- Rafael Brüschweiler
  - Campus Chemical Instrument Center and Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States; Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, Florida 32306, United States
|
6
|
Protein folding optimization based on 3D off-lattice model via an improved artificial bee colony algorithm. J Mol Model 2015; 21:261. [DOI: 10.1007/s00894-015-2806-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/18/2015] [Accepted: 08/30/2015] [Indexed: 12/30/2022]
|
7
|
Buchner L, Güntert P. Increased reliability of nuclear magnetic resonance protein structures by consensus structure bundles. Structure 2015; 23:425-34. [PMID: 25579816 DOI: 10.1016/j.str.2014.11.014] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 09/24/2014] [Revised: 11/17/2014] [Accepted: 11/17/2014] [Indexed: 11/17/2022]
Abstract
Nuclear magnetic resonance (NMR) structures are represented by bundles of conformers calculated from different randomized initial structures using identical experimental input data. The spread among these conformers indicates the precision of the atomic coordinates. However, there is as yet no reliable measure of structural accuracy, i.e., how close NMR conformers are to the "true" structure. Instead, the precision of structure bundles is widely (mis)interpreted as a measure of structural quality. Attempts to increase precision often overestimate accuracy by tight bundles of high precision but much lower accuracy. To overcome this problem, we introduce a protocol for NMR structure determination with the software package CYANA, which produces, like the traditional method, bundles of conformers in agreement with a common set of conformational restraints but with a realistic precision that is, throughout a variety of proteins and NMR data sets, a much better estimate of structural accuracy than the precision of conventional structure bundles.
Affiliation(s)
- Lena Buchner
  - Institute of Biophysical Chemistry and Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt, Max-von-Laue-Straße 9, 60438 Frankfurt am Main, Germany
- Peter Güntert
  - Institute of Biophysical Chemistry and Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt, Max-von-Laue-Straße 9, 60438 Frankfurt am Main, Germany; Frankfurt Institute of Advanced Studies, Goethe University Frankfurt, Ruth-Moufang-Straße 1, 60438 Frankfurt am Main, Germany; Department of Chemistry, Graduate School of Science and Engineering, Tokyo Metropolitan University, 1-1 Minami-Ohsawa, Hachioji, Tokyo 192-0397, Japan.
|
8
|
Li B, Chiong R, Lin M. A balance-evolution artificial bee colony algorithm for protein structure optimization based on a three-dimensional AB off-lattice model. Comput Biol Chem 2014; 54:1-12. [PMID: 25463349 DOI: 10.1016/j.compbiolchem.2014.11.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/12/2014] [Revised: 11/16/2014] [Accepted: 11/19/2014] [Indexed: 10/24/2022]
Abstract
Protein structure prediction is a fundamental issue in the field of computational molecular biology. In this paper, the AB off-lattice model is adopted to transform the original protein structure prediction scheme into a numerical optimization problem. We present a balance-evolution artificial bee colony (BE-ABC) algorithm to address the problem, with the aim of finding the structure for a given protein sequence with the minimal free-energy value. This is achieved through the use of convergence information during the optimization process to adaptively manipulate the search intensity. Besides that, an overall degradation procedure is introduced as part of the BE-ABC algorithm to prevent premature convergence. Comprehensive simulation experiments based on the well-known artificial Fibonacci sequence set and several real sequences from the database of Protein Data Bank have been carried out to compare the performance of BE-ABC against other algorithms. Our numerical results show that the BE-ABC algorithm is able to outperform many state-of-the-art approaches and can be effectively employed for protein structure optimization.
Affiliation(s)
- Bai Li
  - School of Control Science and Engineering, Zhejiang University, Hangzhou 310027, PR China; School of Advanced Engineering, Beihang University, Beijing 100191, PR China.
- Raymond Chiong
  - School of Design, Communication and Information Technology, The University of Newcastle, Callaghan, NSW 2308, Australia.
- Mu Lin
  - College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou 310027, PR China.
|
9
|
Rosato A, Tejero R, Montelione GT. Quality assessment of protein NMR structures. Curr Opin Struct Biol 2013; 23:715-24. [PMID: 24060334 DOI: 10.1016/j.sbi.2013.08.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 08/13/2013] [Accepted: 08/14/2013] [Indexed: 10/26/2022]
Abstract
Biomolecular NMR structures are now routinely used in biology, chemistry, and bioinformatics. Methods and metrics for assessing the accuracy and precision of protein NMR structures are beginning to be standardized across the biological NMR community. These include both knowledge-based assessment metrics, parameterized from the database of protein structures, and model versus data assessment metrics. On line servers are available that provide comprehensive protein structure quality assessment reports, and efforts are in progress by the world-wide Protein Data Bank (wwPDB) to develop a biomolecular NMR structure quality assessment pipeline as part of the structure deposition process. These quality assessment metrics and standards will aid NMR spectroscopists in determining more accurate structures, and increase the value and utility of these structures for the broad scientific community.
Affiliation(s)
- Antonio Rosato
  - Magnetic Resonance Center and Department of Chemistry, University of Florence, 50019 Sesto Fiorentino, Italy
|
10
|
Wlodawer A, Minor W, Dauter Z, Jaskolski M. Protein crystallography for aspiring crystallographers or how to avoid pitfalls and traps in macromolecular structure determination. FEBS J 2013; 280:5705-36. [PMID: 24034303 DOI: 10.1111/febs.12495] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 07/16/2013] [Revised: 08/12/2013] [Accepted: 08/20/2013] [Indexed: 12/28/2022]
Abstract
The number of macromolecular structures deposited in the Protein Data Bank now approaches 100,000, with the vast majority of them determined by crystallographic methods. Thousands of papers describing such structures have been published in the scientific literature, and 20 Nobel Prizes in chemistry or medicine have been awarded for discoveries based on macromolecular crystallography. New hardware and software tools have made crystallography appear to be an almost routine (but still far from being analytical) technique and many structures are now being determined by scientists with very limited experience in the practical aspects of the field. However, this apparent ease is sometimes illusory and proper procedures need to be followed to maintain high standards of structure quality. In addition, many noncrystallographers may have problems with the critical evaluation and interpretation of structural results published in the scientific literature. The present review provides an outline of the technical aspects of crystallography for less experienced practitioners, as well as information that might be useful for users of macromolecular structures, aiming to show them how to interpret (but not overinterpret) the information present in the coordinate files and in their description. A discussion of the extent of information that can be gleaned from the atomic coordinates of structures solved at different resolution is provided, as well as problems and pitfalls encountered in structure determination and interpretation.
Affiliation(s)
- Alexander Wlodawer
  - Protein Structure Section, Macromolecular Crystallography Laboratory, NCI at Frederick, Frederick, MD, USA
|