1
Bongirwar V, Mokhade AS. Different methods, techniques and their limitations in protein structure prediction: A review. Progress in Biophysics and Molecular Biology 2022; 173:72-82. [PMID: 35588858 DOI: 10.1016/j.pbiomolbio.2022.05.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/23/2021] [Revised: 04/16/2022] [Accepted: 05/11/2022] [Indexed: 11/17/2022]
Abstract
Because of the increase in different types of diseases in human habitats, demands for designing various types of drugs are also increasing. Protein and its structure play a very important role in drug design. Therefore researchers from different areas like mathematics, medicine, and computer science are teaming up to find better solutions in this field. In this paper, we discuss different methods of secondary and tertiary protein structure prediction (PSP), along with the limitations of different approaches. Different types of datasets used in PSP are also discussed. This paper also describes different performance measures for evaluating the prediction accuracy of PSP methods. Various software packages and servers are available for download, which can be used to find the protein structure for an input protein sequence. These tools also help to compare the performance of any new algorithm with other available methods. Details of these tools are also given in this paper.
Affiliation(s)
- A S Mokhade
- Visvesvaraya National Institute of Technology, Nagpur, India
2
Pang X, Ning Y. Fuzzy control based on genetic algorithm in intelligent psychology teaching system. Journal of Intelligent & Fuzzy Systems 2021. [DOI: 10.3233/jifs-189827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
The advancement of science has made computer technology and the education industry more and more closely related, and the development of intelligent teaching systems has also opened a new path for classroom teaching. This paper studies the application of fuzzy control based on genetic algorithms in the intelligent psychology teaching system. Facing the complicated variables in the teaching process, the improved genetic algorithm can better realize dynamic teaching decisions through fuzzy control. This article aims to improve the quality of psychology classroom teaching, and develops an intelligent psychology teaching system based on the fuzzy control theory of genetic algorithm. Combined with the current development of fuzzy control theory, the problems existing in the intelligent teaching system are studied and analyzed, and they have been optimized and improved. This paper proposes a control algorithm based on a teaching management system. The algorithm can implement fuzzy control on student models, knowledge organization structure, intelligent test papers and teaching decision-making. While restoring the real teaching process, it can better teach students in accordance with their aptitude and improve the intelligence of the system. According to the system test data, the difficulty proportions of the system’s automatically generated test paper are 30.1%, 51.6%, and 18.3%, which are in line with the designer’s expected ratio of 3:5:2. This shows that the improved genetic algorithm can realize the intelligent test paper generation function very well.
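The comparison between the reported proportions and the 3:5:2 target can be checked with a few lines of arithmetic. The sketch below is illustrative only: the tier names `easy`, `medium`, and `hard` are assumed labels (the abstract does not name the tiers), and the helper functions are hypothetical.

```python
# Observed difficulty shares reported in the abstract, in percent.
# The tier names are assumed labels for illustration.
observed = {"easy": 30.1, "medium": 51.6, "hard": 18.3}

# Designer's target ratio of 3:5:2.
target_ratio = {"easy": 3, "medium": 5, "hard": 2}


def target_shares(ratio):
    """Convert a ratio such as 3:5:2 into percentage shares."""
    total = sum(ratio.values())
    return {tier: 100.0 * weight / total for tier, weight in ratio.items()}


def deviations(observed_shares, ratio):
    """Observed minus expected share per tier, in percentage points."""
    expected = target_shares(ratio)
    return {
        tier: round(observed_shares[tier] - expected[tier], 1)
        for tier in observed_shares
    }


devs = deviations(observed, target_ratio)
max_abs_dev = max(abs(d) for d in devs.values())
print(devs, max_abs_dev)
```

Under these assumptions the largest gap between an observed share and its target is under two percentage points, which is the sense in which the reported proportions match the 3:5:2 expectation.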
Affiliation(s)
- Xiaojia Pang
- College of Education, Xi’an FANYI University, Xi’an, Shaanxi, China
- Yuwen Ning
- Information Technology Center, The Fourth Military Medical University, Xi’an, Shaanxi, China
3
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
4
Hidden Markov model and Chapman Kolmogrov for protein structures prediction from images. Comput Biol Chem 2017; 68:231-244. [DOI: 10.1016/j.compbiolchem.2017.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/18/2017] [Revised: 03/11/2017] [Accepted: 04/11/2017] [Indexed: 11/20/2022]
5
Márquez-Chamorro AE, Asencio-Cortés G, Santiesteban-Toca CE, Aguilar-Ruiz JS. Soft computing methods for the prediction of protein tertiary structures: A survey. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.06.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 01/01/2023]
6
Venkatesan A, Gopal J, Candavelou M, Gollapalli S, Karthikeyan K. Computational approach for protein structure prediction. Healthc Inform Res 2013; 19:137-47. [PMID: 23882419 PMCID: PMC3717437 DOI: 10.4258/hir.2013.19.2.137] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/21/2012] [Revised: 03/30/2013] [Accepted: 04/01/2013] [Indexed: 11/23/2022]
Abstract
Objectives: To predict the structure of a protein, which dictates the function it performs, a newly designed algorithm is developed which blends the concepts of self-organization and the genetic algorithm. Methods: Among many other approaches, the genetic algorithm is found to be a promising cooperative computational method to solve protein structure prediction in a reasonable time. To automate the right choice of parameter values, the influence of self-organization is adopted to design a new genetic operator to optimize the process of prediction. Torsion angles, the local structural parameters which define the backbone of the protein, are used to encode the chromosome, which enhances the quality of the conformation. Newly designed self-configured genetic operators are used to develop a self-organizing genetic algorithm to facilitate accurate structure prediction. Results: Peptides are used to gauge the validity of the proposed algorithm. The predicted structure shows clear improvements in the root mean square deviation on overlapping with the native structure, which indicates the overall performance of the algorithm. In addition, the Ramachandran plot results imply that the conformations of phi-psi angles in the predicted structure are better as compared to native and also free from steric hindrances. Conclusions: The proposed algorithm is promising and contributes to the prediction of a native-like structure by eliminating the time constraint and effort demand. In addition, the energy of the predicted structure is minimized to a greater extent, which proves the stability of protein.
Affiliation(s)
- Amouda Venkatesan
- Centre for Bioinformatics, Pondicherry University, Kalapet, Pondicherry, India
7
Brasil CRS, Delbem ACB, da Silva FLB. Multiobjective evolutionary algorithm with many tables for purely ab initio protein structure prediction. J Comput Chem 2013; 34:1719-34. [DOI: 10.1002/jcc.23315] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 11/14/2012] [Revised: 02/26/2013] [Accepted: 04/07/2013] [Indexed: 11/10/2022]
8
Particle swarm optimization approach for protein structure prediction in the 3D HP model. Interdiscip Sci 2013; 4:190-200. [DOI: 10.1007/s12539-012-0131-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 04/14/2011] [Revised: 03/16/2012] [Accepted: 06/17/2012] [Indexed: 01/03/2023]
9
Ho HK, Zhang L, Ramamohanarao K, Martin S. A survey of machine learning methods for secondary and supersecondary protein structure prediction. Methods Mol Biol 2013; 932:87-106. [PMID: 22987348 DOI: 10.1007/978-1-62703-065-6_6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/01/2023]
Abstract
In this chapter we provide a survey of protein secondary and supersecondary structure prediction using methods from machine learning. Our focus is on machine learning methods applicable to β-hairpin and β-sheet prediction, but we also discuss methods for more general supersecondary structure prediction. We provide background on the secondary and supersecondary structures that we discuss, the features used to describe them, and the basic theory behind the machine learning methods used. We survey the machine learning methods available for secondary and supersecondary structure prediction and compare them where possible.
Affiliation(s)
- Hui Kian Ho
- Department of Computer Science and Software Engineering, University of Melbourne, National ICT Australia, Parkville, VIC, Australia
10
Blum B, Jordan MI, Baker D. Feature space resampling for protein conformational search. Proteins 2010; 78:1583-93. [PMID: 20131376 DOI: 10.1002/prot.22677] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
De novo protein structure prediction requires location of the lowest energy state of the polypeptide chain among a vast set of possible conformations. Powerful approaches include conformational space annealing, in which search progressively focuses on the most promising regions of conformational space, and genetic algorithms, in which features of the best conformations thus far identified are recombined. We describe a new approach that combines the strengths of these two approaches. Protein conformations are projected onto a discrete feature space which includes backbone torsion angles, secondary structure, and beta pairings. For each of these there is one "native" value: the one found in the native structure. We begin with a large number of conformations generated in independent Monte Carlo structure prediction trajectories from Rosetta. Native values for each feature are predicted from the frequencies of feature value occurrences and the energy distribution in conformations containing them. A second round of structure prediction trajectories are then guided by the predicted native feature distributions. We show that native features can be predicted at much higher than background rates, and that using the predicted feature distributions improves structure prediction in a benchmark of 28 proteins. The advantages of our approach are that features from many different input structures can be combined simultaneously without producing atomic clashes or otherwise physically inviable models, and that the features being recombined have a relatively high chance of being correct.
Affiliation(s)
- Ben Blum
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, 94720, USA.
11
Ruiz-Blanco Yasser B, García Y, Sotomayor-Torres C, Yovani MP. New set of 2D/3D thermodynamic indices for proteins. A formalism based on “Molten Globule” theory. 2010. [DOI: 10.1016/j.phpro.2010.10.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
12
Chen P, Liu C, Burge L, Mahmood M, Southerland W, Gloster C. Protein fold classification with genetic algorithms and feature selection. J Bioinform Comput Biol 2009; 7:773-88. [PMID: 19785045 DOI: 10.1142/s0219720009004321] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/04/2008] [Revised: 01/21/2009] [Accepted: 03/23/2009] [Indexed: 11/18/2022]
Abstract
Protein fold classification is a key step to predicting protein tertiary structures. This paper proposes a novel approach based on genetic algorithms and feature selection to classifying protein folds. Our dataset is divided into a training dataset and a test dataset. Each individual for the genetic algorithms represents a selection function of the feature vectors of the training dataset. A support vector machine is applied to each individual to evaluate the fitness value (fold classification rate) of each individual. The aim of the genetic algorithms is to search for the best individual that produces the highest fold classification rate. The best individual is then applied to the feature vectors of the test dataset and a support vector machine is built to classify protein folds based on selected features. Our experimental results on Ding and Dubchak's benchmark dataset of 27-class folds show that our approach achieves an accuracy of 71.28%, which outperforms current state-of-the-art protein fold predictors.
Affiliation(s)
- Peng Chen
- Department of Systems and Computer Science, Howard University, 2300 Sixth Street, NW, Washington, DC 20059, USA.
13
14
Djurdjevic DP, Biggs MJ. Ab initio protein fold prediction using evolutionary algorithms: influence of design and control parameters on performance. J Comput Chem 2007; 27:1177-95. [PMID: 16752367 DOI: 10.1002/jcc.20440] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
True ab initio prediction of protein 3D structure requires only the protein primary structure, a physicochemical free energy model, and a search method for identifying the free energy global minimum. Various characteristics of evolutionary algorithms (EAs) mean they are in principle well suited to the latter. Studies to date have been less than encouraging, however. This is because of the limited consideration given to EA design and control parameter issues. A comprehensive study of these issues was, therefore, undertaken for ab initio protein fold prediction using a full atomistic protein model. The performance and optimal control parameter settings of twelve EA designs were first established using a 15-residue polyalanine molecule; the design aspects varied include the encoding alphabet, crossover operator, and replacement strategy. It can be concluded that real encoding and multipoint crossover are superior, while both generational and steady-state replacement strategies have merits. The scaling between the optimal control parameter settings and polyalanine size was also identified for both generational and steady-state designs based on real encoding and multipoint crossover. Application of the steady-state design to met-enkephalin indicated that these scalings are potentially transferable to real proteins. Comparison of the performance of the steady-state design for met-enkephalin with other ab initio methods indicates that EAs can be competitive provided the correct design and control parameter values are used.
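To make "real encoding and multipoint crossover" concrete, here is a minimal generic sketch, not the authors' operator. It assumes a chromosome is a flat list of backbone torsion angles in degrees; the function name `multipoint_crossover`, the choice of three cut points, and the random parents are all hypothetical.

```python
import random


def multipoint_crossover(parent_a, parent_b, n_points, rng):
    """Exchange alternating segments between two equal-length real-valued chromosomes."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have equal length")
    if not 1 <= n_points < len(parent_a):
        raise ValueError("n_points must be between 1 and len(parent) - 1")
    # Pick distinct cut positions strictly inside the chromosome.
    cuts = sorted(rng.sample(range(1, len(parent_a)), n_points))
    child_a, child_b = [], []
    swap = False
    start = 0
    for cut in cuts + [len(parent_a)]:
        src_a, src_b = (parent_b, parent_a) if swap else (parent_a, parent_b)
        child_a.extend(src_a[start:cut])
        child_b.extend(src_b[start:cut])
        swap = not swap  # alternate the source parent after every cut
        start = cut
    return child_a, child_b


rng = random.Random(0)
# Hypothetical real encoding: 30 torsion angles (phi and psi for 15 residues).
parent_a = [rng.uniform(-180.0, 180.0) for _ in range(30)]
parent_b = [rng.uniform(-180.0, 180.0) for _ in range(30)]
child_a, child_b = multipoint_crossover(parent_a, parent_b, n_points=3, rng=rng)
```

With real encoding each gene is a continuous angle rather than a discretized symbol, so crossover only recombines existing angle values; new values would have to come from a separate mutation operator, which is omitted here.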
Affiliation(s)
- Dusan P Djurdjevic
- Institute for Materials and Processes, University of Edinburgh, King's Buildings, Mayfield Road, Edinburgh EH9 3JL, United Kingdom
15
Cutello V, Narzisi G, Nicosia G. A multi-objective evolutionary approach to the protein structure prediction problem. J R Soc Interface 2006; 3:139-51. [PMID: 16849226 PMCID: PMC1629082 DOI: 10.1098/rsif.2005.0083] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Indexed: 11/12/2022]
Abstract
The protein structure prediction (PSP) problem is concerned with the prediction of the folded, native, tertiary structure of a protein given its sequence of amino acids. It is a challenging and computationally open problem, as proven by the numerous methodological attempts and the research effort applied to it in the last few years. The potential energy functions used in the literature to evaluate the conformation of a protein are based on the calculations of two different interaction energies: local (bond atoms) and non-local (non-bond atoms). In this paper, we show experimentally that those types of interactions are in conflict, and do so by using the potential energy function Chemistry at HARvard Macromolecular Mechanics. A multi-objective formulation of the PSP problem is introduced and its applicability studied. We use a multi-objective evolutionary algorithm as a search procedure for exploring the conformational space of the PSP problem.
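When two energy terms conflict, a multi-objective search keeps the set of non-dominated conformations instead of a single minimum. The sketch below illustrates Pareto dominance for two minimized objectives; it is a generic illustration, not the algorithm from the paper, and the energy values and function names are made up for the example.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical (local_energy, nonlocal_energy) pairs for five conformations.
energies = [
    (10.0, -50.0),
    (12.0, -60.0),
    (11.0, -45.0),  # dominated by (10.0, -50.0)
    (15.0, -61.0),
    (9.0, -30.0),
]
front = pareto_front(energies)
print(front)
```

In this toy data, lowering the local energy tends to raise the non-local energy, so four of the five conformations are mutually non-dominated and only one is discarded.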
16
Tuffery P, Derreumaux P. Dependency between consecutive local conformations helps assemble protein structures from secondary structures using Go potential and greedy algorithm. Proteins 2006; 61:732-40. [PMID: 16231300 DOI: 10.1002/prot.20698] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
Abstract
Discretization of protein conformational space and fragment assembly methods simplify the search of native structures. These methods, mostly of Monte Carlo and genetic-type, do not exploit, however, the fact that short fragments describing consecutive parts of proteins are conformation-dependent. Yet, this information should be useful in improving ab initio and comparative protein structure modeling. In a preliminary study, we have assessed the possibility of using greedy algorithms for protein structure reconstruction based on the assembly of fragments of four-residue length. Greedy algorithms differ from Monte Carlo and genetic approaches in that they grow a polypeptide chain one fragment after another. Here, we move one step further in complexity, and provide strong evidence that the dependence between consecutive local conformations during assembly makes possible the reconstruction of protein structures from their secondary structures using a Go potential. Overall our procedure can reproduce 20 protein structures of 50-164 amino acids within 2.7 to 6.5 Å RMSd and is able to identify native topologies for all proteins, although some targets are stabilized by very long-range interactions.
Affiliation(s)
- Pierre Tuffery
- Equipe de Bioinformatique Génomique et Moléculaire, INSERM U726, Paris, France.
17
Cutello V, Narzisi G, Nicosia G. A Class of Pareto Archived Evolution Strategy Algorithms Using Immune Inspired Operators for Ab-Initio Protein Structure Prediction. Lecture Notes in Computer Science 2005. [DOI: 10.1007/978-3-540-32003-6_6] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]
18
De Sancho D, Prieto L, Rubio AM, Rey A. Evolutionary method for the assembly of rigid protein fragments. J Comput Chem 2004; 26:131-41. [PMID: 15584079 DOI: 10.1002/jcc.20150] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Abstract
Genetic algorithms constitute a powerful optimization method that has already been used in the study of the protein folding problem. However, they often suffer from a lack of convergence in a reasonably short time for complex fitness functions. Here, we propose an evolutionary strategy that can reproducibly find structures close to the minimum of a potential function for a simplified protein model in an efficient way. The model reduces the number of degrees of freedom of the system by treating the protein structure as composed of rigid fragments. The search incorporates a double encoding procedure and a merging operation from subpopulations that evolve independently of one another, both contributing to the good performance of the full algorithm. We have tested it with protein structures of different degrees of complexity, and present our conclusions related to its possible application as an efficient tool for the analysis of folding potentials.
Affiliation(s)
- David De Sancho
- Departamento de Química Física, Facultad de Ciencias Químicas, Universidad Complutense, E-28040 Madrid, Spain
19
Agostini L, Morosetti S. A simple procedure to weight empirical potentials in a fitness function so as to optimize its performance in ab initio protein-folding problem. Biophys Chem 2003; 105:105-18. [PMID: 12932583 DOI: 10.1016/s0301-4622(03)00130-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 10/27/2022]
Abstract
In an approach to the protein folding problem by a Genetic Algorithm, the fitness function plays a critical role. Empirical potentials are generally used to build the fitness function, and they must be weighted to obtain a valuable one. The weights are generally found by the comparison with a set of misfolded structures (decoys), but a dependence of the obtained fitness generally arises on the used decoys. Here we describe a general procedure to find out, from a given set of potentials, their better linear combination that could either identify the wild structure or prove their powerlessness. We use topological considerations over the hyperspace of the potentials, and a multiple linear inequalities solver. The iterated method flows through the following steps: it determines a direction in the hyperspace of the potentials, which identifies the native structure as a vertex among a set of misfolded decoys. A multiple linear inequalities solver obtains the direction. A Genetic Algorithm, tailored to the specific problem, uses the fitness function defined by this direction and generally reaches a new structure better than the experimental one, which is added to the ensemble. The decoys so generated are not dependent on a deterministic criterion. This iterative procedure can be stopped either by identifying an effective fitness function or by proving the impossibility of its achievement. In order to test the method under the hardest conditions, we choose numerous and heterogeneous quantities as components of the fitness function. This method could be a useful tool for the scientific community in order to test any fitness proposed and to recognize the most important components on which it is built.
Affiliation(s)
- Luigi Agostini
- Department of Chemistry, University of Rome La Sapienza, P.le A. Moro 5, Rome I-00185, Italy
20
Genetic algorithms in molecular modelling: a review. 2003. [DOI: 10.1016/s0922-3487(03)23004-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
21
Guvench O, Weiser J, Shenkin P, Kolossváry I, Still WC. Application of the frozen atom approximation to the GB/SA continuum model for solvation free energy. J Comput Chem 2002; 23:214-21. [PMID: 11924735 DOI: 10.1002/jcc.1167] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
The generalized Born/surface area (GB/SA) continuum model for solvation free energy is a fast and accurate alternative to using discrete water molecules in molecular simulations of solvated systems. However, computational studies of large solvated molecular systems such as enzyme-ligand complexes can still be computationally expensive even with continuum solvation methods simply because of the large number of atoms in the solute molecules. Because in such systems often only a relatively small portion of the system such as the ligand binding site is under study, it becomes less attractive to calculate energies and derivatives for all atoms in the system. To curtail computation while still maintaining high energetic accuracy, atoms distant from the site of interest are often frozen; that is, their coordinates are made invariant. Such frozen atoms do not require energetic and derivative updates during the course of a simulation. Herein we describe methodology and results for applying the frozen atom approach to both the generalized Born (GB) and the solvent accessible surface area (SASA) parts of the GB/SA continuum model for solvation free energy. For strictly pairwise energetic terms, such as the Coulombic and van-der-Waals energies, contributions from pairs of frozen atoms can be ignored. This leaves energetic differences unaffected for conformations that vary only in the positions of nonfrozen atoms. Due to the nonlocal nature of the GB analytical form, however, excluding such pairs from a GB calculation leads to unacceptable inaccuracies. To apply a frozen-atom scheme to GB calculations, a buffer region within the frozen-atom zone is generated based on a user-definable cutoff distance from the nonfrozen atoms. Certain pairwise interactions between frozen atoms in the buffer region are retained in the GB computation. 
This allows high accuracy in conformational GB comparisons to be maintained while achieving significant savings in computational time compared to the full (nonfrozen) calculation. A similar approach for using a buffer region of frozen atoms is taken for the SASA calculation. The SASA calculation is local in nature, and thus exact SASA energies are maintained. With a buffer region of 8 Å for the frozen-atom cases, excellent agreement in differences in energies for three different conformations of cytochrome P450 with a bound camphor ligand are obtained with respect to the nonfrozen cases. For various minimization protocols, simulations run 2 to 10.5 times faster and memory usage is reduced by a factor of 1.5 to 5. Application of the frozen atom method for GB/SA calculations thus can render computationally tractable biologically and medically important simulations such as those used to study ligand-receptor binding conformations and energies in a solvated environment.
22
de la Cruz X, Sillitoe I, Orengo C. Use of structure comparison methods for the refinement of protein structure predictions. I. Identifying the structural family of a protein from low-resolution models. Proteins 2002; 46:72-84. [PMID: 11746704 DOI: 10.1002/prot.10002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
Predicting the three-dimensional structure of proteins is still one of the most challenging problems in molecular biology. Despite its difficulty, several investigators have started to produce consistently low-resolution predictions for small proteins. However, in most of these cases, the prediction accuracy is still too low to make them useful. In the present article, we address the problem of obtaining better-quality predictions, starting from low-resolution models. To this end, we have devised a new procedure that uses these models, together with structure comparison methods, to identify the structural family of the target protein. This would allow, in a second step not described in the present work, to refine the predictions using conserved features of the identified family. In our approach, the structure database is investigated using predictions, at different accuracy levels, for a given protein. As query structures, we used both low-resolution versions of the native structures, as well as different sets of low accuracy predictions. In general, we found that for predictions with a resolution of ≥5-7 Å, structure comparison methods were able to identify the fold of a protein in the top positions.
Affiliation(s)
- Xavier de la Cruz
- Departmento de Bioquímica y Biología Molecular Facultad de Químicas; Universidad de Barcelona, Barcelona, Spain.
23
24
Abstract
A fast analytical formula (TDND) has been derived for the calculation of approximate atomic and molecular solvent-accessible surface areas (SASA), as well as the first and second derivatives of these quantities with respect to atomic coordinates. Extending the work of Stouten et al. (Molecular Simulation, 1993, Vol. 10, pp. 97-120), as well as our own (Journal of Computational Chemistry, 1999, Vol. 20, pp. 586-596), the method makes use of a Gaussian function to calculate the neighbor density in four tetrahedral directions in three-dimensional space, sometimes twice with different orientations. SASA and first derivatives of the 2366 heavy atoms of penicillopepsin are computed in 0.13 s on an SGI R10000/194 MHz processor. When second derivatives are computed as well, the total time is 0.23 s. This is considerably faster than timings reported previously for other algorithms. Based on a parameterization set of nineteen compounds of different size (11-4346 atoms) and class (organics, proteins, DNA, and various complexes) consisting of a total 23,197 atoms, the method exhibits relative errors in the range 0.2-12.6% for total molecular surface areas and average absolute atomic surface area deviations in the range 1.7 to 3.6 Å².
Affiliation(s)
- J Weiser
- Anterio Consult & Research GmbH, Augustaanlage 26, D-68165 Mannheim, Germany.
25
Huang ES, Samudrala R, Ponder JW. Ab initio fold prediction of small helical proteins using distance geometry and knowledge-based scoring functions. J Mol Biol 1999; 290:267-81. [PMID: 10388572 DOI: 10.1006/jmbi.1999.2861] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.7] [Indexed: 11/22/2022]
Abstract
The problem of protein tertiary structure prediction from primary sequence can be separated into two subproblems: generation of a library of possible folds and specification of a best fold given the library. A distance geometry procedure based on random pairwise metrization with good sampling properties was used to generate a library of 500 possible structures for each of 11 small helical proteins. The input to distance geometry consisted of sets of restraints to enforce predicted helical secondary structure and a generic range of 5 to 11 Å between predicted contact residues on all pairs of helices. For each of the 11 targets, the resulting library contained structures with low RMSD versus the native structure. Near-native sampling was enhanced by at least three orders of magnitude compared to a random sampling of compact folds. All library members were scored with a combination of an all-atom distance-dependent function, a residue pair-potential, and a hydrophobicity function. In six of the 11 cases, the best-ranking fold was considered to be near native. Each library was also reduced to a final ab initio prediction via consensus distance geometry performed over the 50 best-ranking structures from the full set of 500. The consensus results were of generally higher quality, yielding six predictions within 6.5 Å of the native fold. These favorable predictions corresponded to those for which the correlation between the RMSD and the scoring function were highest. The advantage of the reported methodology is its extreme simplicity and potential for including other types of structural restraints.
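The RMSD figures quoted in these abstracts measure how far a model's atoms lie from the native structure. As a minimal sketch, the function below computes the coordinate RMSD for two structures that are assumed to be already optimally superposed and to list equivalent atoms in the same order; the superposition step itself (for example the Kabsch algorithm) is deliberately left out, and the coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha traces in angstroms: the model differs in one atom by 2 Å.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 2.0, 0.0)]
value = rmsd(native, model)
print(value)
```

Because the deviation is squared before averaging, a single badly placed atom contributes disproportionately, which is one reason RMSD values from different proteins or atom selections are not directly comparable.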
Affiliation(s)
- E S Huang
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, Saint Louis, MO, 63110, USA
26
Optimization of Gaussian surface calculations and extension to solvent-accessible surface areas. J Comput Chem 1999; 20:688-703. [DOI: 10.1002/(sici)1096-987x(199905)20:7<688::aid-jcc4>3.0.co;2-f] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.6] [Received: 10/23/1998] [Accepted: 12/22/1998] [Indexed: 11/07/2022]
27
28
29
30
Application of Reduced Models to Protein Structure Prediction. 1999. [DOI: 10.1016/s1380-7323(99)80086-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]