1
|
Boumedine N, Bouroubi S. Protein folding in 3D lattice HP model using a combining cuckoo search with the Hill-Climbing algorithms. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108564] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
2
|
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Collapse
Affiliation(s)
- Bian Li
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| | - Michaela Fooksa
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
| | - Sten Heinze
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| |
Collapse
|
3
|
Yanev N, Traykov M, Milanov P, Yurukov B. Protein Folding Prediction in a Cubic Lattice in Hydrophobic-Polar Model. J Comput Biol 2016; 24:412-421. [PMID: 27901606 DOI: 10.1089/cmb.2016.0181] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
The tertiary structure of the proteins determines their functions. Therefore, the predicting of protein's tertiary structure, based on the primary amino acid sequence from long time, is the most important and challenging subject in biochemistry, molecular biology, and biophysics. One of the most popular protein structure prediction methods, called Hydrophobic-Polar (HP) model, is based on the observation that in polar environment hydrophobic amino acids are in the core of the molecule-in contact between them and more polar amino acids are in contact with the polar environment. In this study, we present a new mixed integer programming formulation, exact algorithm, and two heuristic algorithms to solve the protein folding problem stated as a combinatorial optimization problem in a simple cubic lattice. The results from computational runs on a set of benchmarks are favorably compared to known algorithms for solving the 3D lattice HP model as genetic algorithms, ant colony optimization algorithm, and Monte Carlo algorithm.
Collapse
Affiliation(s)
- Nicola Yanev
- 1 Institute of Mathematics and Informatics , Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Metodi Traykov
- 2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| | - Peter Milanov
- 1 Institute of Mathematics and Informatics , Bulgarian Academy of Sciences, Sofia, Bulgaria .,2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| | - Borislav Yurukov
- 2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| |
Collapse
|
4
|
Bošković B, Brest J. Genetic algorithm with advanced mechanisms applied to the protein structure prediction in a hydrophobic-polar model and cubic lattice. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|
5
|
Guo Y. The Noncompacted Folding of Proteins by Modified Elastic Net Algorithm. J Comput Biol 2015; 22:609-18. [PMID: 26161596 DOI: 10.1089/cmb.2012.0290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
In this article, a protein sequence of length n is embedded in a two-dimensional lattice with m (>n) points. Thus the obtained minimal energy configurations are expected to have flexible shapes in contrast to the compact rectangular conformations. To fulfill this extension, the elastic net algorithm is modified to deal with the difficulty brought by the unsymmetrical relationship between amino acids and lattice points. New set partition strategy in the embedding phase is introduced, and two local search methods are applied to overcome the multimapping phenomena. Several HP benchmark examples with up to 48 amino acids are tested to verify the effectiveness of the proposed approach.
Collapse
Affiliation(s)
- Yuzhen Guo
- Department of Mathematics, College of Science, Nanjing University of Aeronautics and Astronautics , Nanjing, China
| |
Collapse
|
6
|
Santos J, Villot P, Diéguez M. Emergent protein folding modeled with evolved neural cellular automata using the 3D HP model. J Comput Biol 2014; 21:823-45. [PMID: 25343217 DOI: 10.1089/cmb.2014.0077] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
We used cellular automata (CA) for the modeling of the temporal folding of proteins. Unlike the focus of the vast research already done on the direct prediction of the final folded conformations, we will model the temporal and dynamic folding process. To reduce the complexity of the interactions and the nature of the amino acid elements, lattice models like HP were used, a model that categorizes the amino acids regarding their hydrophobicity. Taking into account the restrictions of the lattice model, the CA model defines how the amino acids interact through time to obtain a folded conformation. We extended the classical CA models using artificial neural networks for their implementation (neural CA), and we used evolutionary computing to automatically obtain the models by means of Differential Evolution. As the iterative folding also provides the final folded conformation, we can compare the results with those from direct prediction methods of the final protein conformation. Finally, as the neural CA that provides the iterative folding process can be evolved using several protein sequences and used as operators in the folding of another protein with different length, this represents an advantage over the NP-hard complexity of the original problem of the direct prediction.
Collapse
Affiliation(s)
- José Santos
- Department of Computer Science, University of A Coruña , A Coruña, Spain
| | | | | |
Collapse
|
7
|
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
|
8
|
Particle swarm optimization approach for protein structure prediction in the 3D HP model. Interdiscip Sci 2013; 4:190-200. [DOI: 10.1007/s12539-012-0131-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2011] [Revised: 03/16/2012] [Accepted: 06/17/2012] [Indexed: 01/03/2023]
|
9
|
Liu J, Li G, Yu J, Yao Y. Heuristic energy landscape paving for protein folding problem in the three-dimensional HP lattice model. Comput Biol Chem 2012; 38:17-26. [PMID: 22551826 DOI: 10.1016/j.compbiolchem.2012.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2011] [Revised: 01/10/2012] [Accepted: 02/14/2012] [Indexed: 10/28/2022]
Abstract
The protein folding problem, i.e., the prediction of the tertiary structures of protein molecules from their amino acid sequences is one of the most important problems in computational biology and biochemistry. However, the extremely difficult optimization problem arising from energy function is a key challenge in protein folding simulation. The energy landscape paving (ELP) method has already been applied very successfully to off-lattice protein models and other optimization problems with complex energy landscape in continuous space. By improving the ELP method, and subsequently incorporating the neighborhood strategy with the pull-move set into the improved ELP method, a heuristic ELP algorithm is proposed to find low-energy conformations of 3D HP lattice model proteins in the discrete space. The algorithm is tested on three sets of 3D HP benchmark instances consisting 31 sequences. For eleven sequences with 27 monomers, the proposed method explores the conformation surfaces more efficiently than other methods, and finds new lower energies in several cases. For ten 48-monomer sequences, we find the lowest energies so far. With the achieved results, the algorithm converges rapidly and efficiently. For all ten 64-monomer sequences, the algorithm finds lower energies within comparable computation times than previous methods. Numeric results show that the heuristic ELP method is a competitive tool for protein folding simulation in 3D lattice model. To the best of our knowledge, this is the first application of ELP to the 3D discrete space.
Collapse
Affiliation(s)
- Jingfa Liu
- School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
| | | | | | | |
Collapse
|
10
|
Liu J, Li G, Yu J. Protein-folding simulations of the hydrophobic-hydrophilic model by combining pull moves with energy landscape paving. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:031934. [PMID: 22060430 DOI: 10.1103/physreve.84.031934] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/04/2011] [Revised: 07/29/2011] [Indexed: 05/31/2023]
Abstract
The energy landscape paving (ELP) method is a class of heuristic global optimization algorithms based on Monte Carlo sampling. By incorporating the generation of an initial conformation based on a greedy strategy, the conformation update mechanism based on pull moves, and some heuristic off-trap strategies into an improved ELP method, we propose an alternative version of the ELP method, called the ELP-pull move method. We test the ELP-pull move method on both two-dimensional (2D) and 3D hydrophobic-hydrophilic protein-folding models. For ten 2D benchmark sequences of length ranging from 20 to 100, the proposed algorithm finds the lowest energies so far. Within the achieved results, the algorithm converges more rapidly and efficiently than previous methods. For all ten 3D sequences with a length of 64, the ELP-pull move method finds lower energies within comparable computational times. The numerical results demonstrate that our algorithm is a powerful method to study the lattice protein-folding model.
Collapse
Affiliation(s)
- Jingfa Liu
- School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
| | | | | |
Collapse
|
11
|
Santos J, Diéguez M. Differential Evolution for Protein Structure Prediction Using the HP Model. LECTURE NOTES IN COMPUTER SCIENCE 2011. [DOI: 10.1007/978-3-642-21344-1_34] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
|
12
|
Huang C, Yang X, He Z. Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures. Comput Biol Chem 2010; 34:137-42. [PMID: 20627698 DOI: 10.1016/j.compbiolchem.2010.04.002] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2009] [Revised: 12/20/2009] [Accepted: 04/27/2010] [Indexed: 11/16/2022]
Abstract
In this paper, based on the evolutionary Monte Carlo (EMC) algorithm, we have made four points of ameliorations and propose a so-called genetic algorithm based on optimal secondary structure (GAOSS) method to predict efficiently the protein folding conformations in the two-dimensional hydrophobic-hydrophilic (2D HP) model. Nine benchmarks are tested to verify the effectiveness of the proposed approach and the results show that for the listed benchmarks GAOSS can find the best solutions so far. It means that reasonable, effective and compact secondary structures (SSs) can avoid blind searches and can reduce time consuming significantly. On the other hand, as examples, we discuss the diversity of protein GSC for the 24-mer and 85-mer sequences. Several GSCs have been found by GAOSS and some of the conformations are quite different from each other. It would be useful for the designing of protein molecules. GAOSS would be an efficient tool for the protein structure predictions (PSP).
Collapse
Affiliation(s)
- Chenhua Huang
- MOE Key Laboratory of Laser Life Science & Institute of Laser Life Science, South China Normal University, Zhongshan Road, Guangzhou 510631, China
| | | | | |
Collapse
|
13
|
Benítez CMV, Lopes HS. Protein structure prediction with the 3D-HP side-chain model using a master–slave parallel genetic algorithm. JOURNAL OF THE BRAZILIAN COMPUTER SOCIETY 2010. [DOI: 10.1007/s13173-010-0002-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Abstract
This work presents a master-slave parallel genetic algorithm for the protein folding problem, using the 3D-HP side-chain model (3D-HP-SC). This model is sparsely studied in the literature, although more expressive than other lattice models. The fitness function proposed includes information not only about the free-energy of the conformation, but also compactness of the side-chains. Since there is no benchmark available to date for this model, a set of 15 sequences was used, based on a simpler model. Results show that the parallel GA achieved a good level of efficiency and obtained biologically coherent results, suggesting the adequacy of the methodology. Future work will include new biologically-inspired genetic operators and more experiments to create new benchmarks.
Collapse
|
14
|
Lopes HS. Evolutionary Algorithms for the Protein Folding Problem: A Review and Current Trends. STUDIES IN COMPUTATIONAL INTELLIGENCE 2009. [DOI: 10.1007/978-3-540-70778-3_12] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
15
|
Advances on protein folding simulations based on the lattice HP models with natural computing. Appl Soft Comput 2008. [DOI: 10.1016/j.asoc.2007.03.012] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
16
|
Guo YZ, Feng EM. The simulation of the three-dimensional lattice hydrophobic-polar protein folding. J Chem Phys 2006; 125:234703. [PMID: 17190566 DOI: 10.1063/1.2402162] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
One of the most prominent problems in computational biology is to predict the natural conformation of a protein from its amino acid sequence. This paper focuses on the three-dimensional hydrophobic-polar (HP) lattice model of this problem. The modified elastic net (EN) algorithm is applied to solve this nonlinear programming hard problem. The lattice partition strategy and two local search methods (LS(1) and LS(2)) are proposed to improve the performance of the modified EN algorithm. The computation and analysis of 12 HP standard benchmark instances are also involved in this paper. The results indicate that the hybrid of modified EN algorithm, lattice partition strategy, and local search methods has a greater tendency to form a globular state than genetic algorithm does. The results of noncompact model are more natural in comparison with that of compact model.
Collapse
Affiliation(s)
- Yu-Zhen Guo
- Department of Applied Mathematics, Dalian University of Technology, Dalian, 116024, People's Republic of China.
| | | |
Collapse
|
17
|
Guo YZ, Feng EM, Wang Y. Exploration of two-dimensional hydrophobic-polar lattice model by combining local search with elastic net algorithm. J Chem Phys 2006; 125:154102. [PMID: 17059234 DOI: 10.1063/1.2357950] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Determining the functional conformation of a protein from its amino acid sequence remains a central problem in computational biology. In this paper, we establish the mathematical optimal model of protein folding problem (PFP) on two-dimensional space based on the minimal energy principle. A novel hybrid of elastic net algorithm and local search method (ENLS) is applied successfully to simulations of protein folding on two-dimensional hydrophobic-polar (HP) lattice model. Eight HP benchmark instances with up to 64 amino acids are tested to verify the effectiveness of proposed approach and model. In several cases, the ENLS method finds new lower energy states. The numerical results show that it is drastically superior to other methods in finding the ground state of a protein.
Collapse
Affiliation(s)
- Yu-Zhen Guo
- Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, PRC.
| | | | | |
Collapse
|