1
|
Chou HH, Hsu CT, Chen LH, Lin YC, Hsieh SY. A Novel Branch-and-Bound Algorithm for the Protein Folding Problem in the 3D HP Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:455-462. [PMID: 31403440 DOI: 10.1109/tcbb.2019.2934102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
The protein folding problem (PFP) is an important issue in bioinformatics and biochemical physics. One of the most widely studied models of protein folding is the hydrophobic-polar (HP) model introduced by Dill. The PFP in the three-dimensional (3D) lattice HP model has been shown to be NP-complete; the proposed algorithms for solving the problem can therefore only find near-optimal energy structures for most long benchmark sequences within acceptable time periods. In this paper, we propose a novel algorithm based on the branch-and-bound approach to solve the PFP in the 3D lattice HP model. For 10 48-monomer benchmark sequences, our proposed algorithm finds the lowest energies so far within comparable computation times than previous methods.
Collapse
|
2
|
Morshedian A, Razmara J, Lotfi S. A novel approach for protein structure prediction based on an estimation of distribution algorithm. Soft comput 2018. [DOI: 10.1007/s00500-018-3130-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
3
|
Yanev N, Traykov M, Milanov P, Yurukov B. Protein Folding Prediction in a Cubic Lattice in Hydrophobic-Polar Model. J Comput Biol 2016; 24:412-421. [PMID: 27901606 DOI: 10.1089/cmb.2016.0181] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
The tertiary structure of the proteins determines their functions. Therefore, the predicting of protein's tertiary structure, based on the primary amino acid sequence from long time, is the most important and challenging subject in biochemistry, molecular biology, and biophysics. One of the most popular protein structure prediction methods, called Hydrophobic-Polar (HP) model, is based on the observation that in polar environment hydrophobic amino acids are in the core of the molecule-in contact between them and more polar amino acids are in contact with the polar environment. In this study, we present a new mixed integer programming formulation, exact algorithm, and two heuristic algorithms to solve the protein folding problem stated as a combinatorial optimization problem in a simple cubic lattice. The results from computational runs on a set of benchmarks are favorably compared to known algorithms for solving the 3D lattice HP model as genetic algorithms, ant colony optimization algorithm, and Monte Carlo algorithm.
Collapse
Affiliation(s)
- Nicola Yanev
- 1 Institute of Mathematics and Informatics , Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Metodi Traykov
- 2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| | - Peter Milanov
- 1 Institute of Mathematics and Informatics , Bulgarian Academy of Sciences, Sofia, Bulgaria .,2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| | - Borislav Yurukov
- 2 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University "Neofit Rilski," Blagoevgrad, Bulgaria
| |
Collapse
|
4
|
Bošković B, Brest J. Genetic algorithm with advanced mechanisms applied to the protein structure prediction in a hydrophobic-polar model and cubic lattice. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.04.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|
5
|
Traykov M, Angelov S, Yanev N. A New Heuristic Algorithm for Protein Folding in the HP Model. J Comput Biol 2016; 23:662-8. [PMID: 27153764 DOI: 10.1089/cmb.2016.0015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
This article presents an efficient heuristic for protein folding. The protein folding problem is to predict the compact three-dimensional structure of a protein based on its amino acid sequence. The focus is on an original integer programming model derived from a platform used for Contact Map Overlap problem.
Collapse
Affiliation(s)
- Metodi Traykov
- 1 Department of Informatics, Center for Advanced Bioinformatics Research, South-West University Neofit Rilski , Blagoevgrad, Bulgaria
| | - Slav Angelov
- 2 Department of Informatics, New Bulgarian University , Sofia, Bulgaria
| | - Nicola Yanev
- 3 Institute of Mathematics and Informatics , Bulgarian Academy of Sciences, Sofia, Bulgaria
| |
Collapse
|
6
|
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
|
7
|
Particle swarm optimization approach for protein structure prediction in the 3D HP model. Interdiscip Sci 2013; 4:190-200. [DOI: 10.1007/s12539-012-0131-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2011] [Revised: 03/16/2012] [Accepted: 06/17/2012] [Indexed: 01/03/2023]
|
8
|
Narasimhan SL, Rajarajan AK, Vardharaj L. HP-sequence design for lattice proteins—An exact enumeration study on diamond as well as square lattice. J Chem Phys 2012; 137:115102. [DOI: 10.1063/1.4752479] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open
|
9
|
Wüst T, Landau DP. Optimized Wang-Landau sampling of lattice polymers: Ground state search and folding thermodynamics of HP model proteins. J Chem Phys 2012; 137:064903. [DOI: 10.1063/1.4742969] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open
|
10
|
Liu J, Li G, Yu J, Yao Y. Heuristic energy landscape paving for protein folding problem in the three-dimensional HP lattice model. Comput Biol Chem 2012; 38:17-26. [PMID: 22551826 DOI: 10.1016/j.compbiolchem.2012.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2011] [Revised: 01/10/2012] [Accepted: 02/14/2012] [Indexed: 10/28/2022]
Abstract
The protein folding problem, i.e., the prediction of the tertiary structures of protein molecules from their amino acid sequences is one of the most important problems in computational biology and biochemistry. However, the extremely difficult optimization problem arising from energy function is a key challenge in protein folding simulation. The energy landscape paving (ELP) method has already been applied very successfully to off-lattice protein models and other optimization problems with complex energy landscape in continuous space. By improving the ELP method, and subsequently incorporating the neighborhood strategy with the pull-move set into the improved ELP method, a heuristic ELP algorithm is proposed to find low-energy conformations of 3D HP lattice model proteins in the discrete space. The algorithm is tested on three sets of 3D HP benchmark instances consisting 31 sequences. For eleven sequences with 27 monomers, the proposed method explores the conformation surfaces more efficiently than other methods, and finds new lower energies in several cases. For ten 48-monomer sequences, we find the lowest energies so far. With the achieved results, the algorithm converges rapidly and efficiently. For all ten 64-monomer sequences, the algorithm finds lower energies within comparable computation times than previous methods. Numeric results show that the heuristic ELP method is a competitive tool for protein folding simulation in 3D lattice model. To the best of our knowledge, this is the first application of ELP to the 3D discrete space.
Collapse
Affiliation(s)
- Jingfa Liu
- School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
| | | | | | | |
Collapse
|
11
|
Abstract
Hard discrete optimization problems using randomized methods such as genetic algorithms require large numbers of samples from the solution space. Each candidate sample/solution must be evaluated using the target fitness/energy function being optimized. Such fitness computations are a bottleneck in sampling methods such as genetic algorithms. We observe that the caching of partial results from these fitness computations can reduce this bottleneck. We provide a rigorous analysis of the run-times of GAs with and without caching. By representing fitness functions as classic Divide and Conquer algorithms, we provide a formal model to predict the efficiency of caching GAs vs. non-caching GAs. Finally, we explore the domain of protein folding with GAs and demonstrate that caching can significantly reduce expected run-times through both theoretical and extensive empirical analyses.
Collapse
Affiliation(s)
- EUNICE E. SANTOS
- Department of Computer Science, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061, USA
| | - EUGENE SANTOS
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-3155, USA
| |
Collapse
|
12
|
Hoque MT, Chetty M, Lewis A, Sattar A. Twin removal in genetic algorithms for protein structure prediction using low-resolution model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:234-245. [PMID: 21071811 DOI: 10.1109/tcbb.2009.34] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
This paper presents the impact of twins and the measures for their removal from the population of genetic algorithm (GA) when applied to effective conformational searching. It is conclusively shown that a twin removal strategy for a GA provides considerably enhanced performance when investigating solutions to complex ab initio protein structure prediction (PSP) problems in low-resolution model. Without twin removal, GA crossover and mutation operations can become ineffectual as generations lose their ability to produce significant differences, which can lead to the solution stalling. The paper relaxes the definition of chromosomal twins in the removal strategy to not only encompass identical, but also highly correlated chromosomes within the GA population, with empirical results consistently exhibiting significant improvements solving PSP problems.
Collapse
Affiliation(s)
- Md Tamjidul Hoque
- Griffith University, Nathan campus, 170 Kessels Road, Nathan, Brisbane, Qld 4111, Australia.
| | | | | | | |
Collapse
|
13
|
Hu X, Beratan DN, Yang W. A gradient-directed Monte Carlo method for global optimization in a discrete space: application to protein sequence design and folding. J Chem Phys 2010; 131:154117. [PMID: 20568857 DOI: 10.1063/1.3236834] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
We apply the gradient-directed Monte Carlo (GDMC) method to select optimal members of a discrete space, the space of chemically viable proteins described by a model Hamiltonian. In contrast to conventional Monte Carlo approaches, our GDMC method uses local property gradients with respect to chemical variables that have discrete values in the actual systems, e.g., residue types in a protein sequence. The local property gradients are obtained from the interpolation of discrete property values, following the linear combination of atomic potentials scheme developed recently [M. Wang et al., J. Am. Chem. Soc. 128, 3228 (2006)]. The local property derivative information directs the search toward the global minima while the Metropolis criterion incorporated in the method overcomes barriers between local minima. Using the simple HP lattice model, we apply the GDMC method to protein sequence design and folding. The GDMC algorithm proves to be particularly efficient, suggesting that this strategy can be extended to other discrete optimization problems in addition to inverse molecular design.
Collapse
Affiliation(s)
- Xiangqian Hu
- Department of Chemistry, Duke University, Durham, North Carolina 27708-0354, USA
| | | | | |
Collapse
|
14
|
|
15
|
Hoque MT, Chetty M, Sattar A. Genetic Algorithm inAb Initio Protein Structure Prediction Using Low Resolution Model: A Review. BIOMEDICAL DATA AND APPLICATIONS 2009. [DOI: 10.1007/978-3-642-02193-0_14] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]
|
16
|
Thachuk C, Shmygelska A, Hoos HH. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinformatics 2007; 8:342. [PMID: 17875212 PMCID: PMC2071922 DOI: 10.1186/1471-2105-8-342] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2007] [Accepted: 09/17/2007] [Indexed: 12/04/2022] Open
Abstract
Background The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NP
MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFneVtcqqGqbauaaa@3961@-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models. Results We demonstrate that REMC is highly effective for solving instances of the square (2D) and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMC outperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and sequences with a provably unique ground-state structure, which is believed to be a characteristic of real proteins. We also present evidence that our REMC algorithm can fold sequences which exhibit significant interaction between termini in the hydrophobic core relatively easily. Conclusion We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices. This is particularly noteworthy, since so far, the state-of-the-art methods for 2D and 3D HP protein folding – in particular, the pruned-enriched Rosenbluth method (PERM) and, to some extent, Ant Colony Optimisation (ACO) – were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
Collapse
Affiliation(s)
- Chris Thachuk
- School of Computing Science, Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada
| | - Alena Shmygelska
- Department of Structural Biology, Stanford University, Stanford, CA, 94305, USA
| | - Holger H Hoos
- Department of Computer Science, University of British Columbia, B.C., V6T 1Z4, Canada
| |
Collapse
|
17
|
Zhang J, Kou SC, Liu JS. Biopolymer structure simulation and optimization via fragment regrowth Monte Carlo. J Chem Phys 2007; 126:225101. [PMID: 17581081 DOI: 10.1063/1.2736681] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
An efficient exploration of the configuration space of a biopolymer is essential for its structure modeling and prediction. In this study, the authors propose a new Monte Carlo method, fragment regrowth via energy-guided sequential sampling (FRESS), which incorporates the idea of multigrid Monte Carlo into the framework of configurational-bias Monte Carlo and is suitable for chain polymer simulations. As a by-product, the authors also found a novel extension of the Metropolis Monte Carlo framework applicable to all Monte Carlo computations. They tested FRESS on hydrophobic-hydrophilic (HP) protein folding models in both two and three dimensions. For the benchmark sequences, FRESS not only found all the minimum energies obtained by previous studies with substantially less computation time but also found new lower energies for all the three-dimensional HP models with sequence length longer than 80 residues.
Collapse
Affiliation(s)
- Jinfeng Zhang
- Department of Statistics, Harvard University, Science Center, Cambridge, Massachusetts 02138, USA
| | | | | |
Collapse
|
18
|
Bachmann M, Janke W. Substrate specificity of peptide adsorption: a model study. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2006; 73:020901. [PMID: 16605320 DOI: 10.1103/physreve.73.020901] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/19/2005] [Indexed: 05/08/2023]
Abstract
Applying the contact density chain-growth algorithm to lattice heteropolymers, we identify the conformational transitions of a nongrafted hydrophobic-polar heteropolymer with 103 residues in the vicinity of a polar, a hydrophobic, and a uniformly attractive substrate. Introducing only two system parameters, the numbers of surface contacts and intrinsic hydrophobic contacts, respectively, we obtain surprisingly complex temperature and solvent dependent, substrate-specific pseudophase diagrams.
Collapse
Affiliation(s)
- Michael Bachmann
- Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, D-04109 Leipzig, Germany.
| | | |
Collapse
|
19
|
Błazewicz J, Lukasiak P, Miłostan M. Application of tabu search strategy for finding low energy structure of protein. Artif Intell Med 2005; 35:135-45. [PMID: 16051476 DOI: 10.1016/j.artmed.2005.02.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2004] [Revised: 01/24/2005] [Accepted: 02/22/2005] [Indexed: 11/22/2022]
Abstract
OBJECTIVE Understanding protein functionality would mean understanding the basics of life. This functionality follows a three-dimensional structure of proteins. Unfortunately till now it is not possible to obtain these structures artificially. This article offers a survey on the use of meta-heuristic methods in context of simplified models of protein folding. METHODS Tabu search (TS) strategy is one of the most successful meta-heuristics that has been applied for large number of optimization problems. In the paper, the application of TS for finding low energy conformations of proteins in a simplified lattice model has been proposed. RESULTS The algorithm has been extensively tested and the tests showed its good performance. It compares well with the other heuristic approaches. CONCLUSIONS The approach presented is competitive as compared with other methods and due to its low computation time can be used as a complementary tool for an analysis of the three-dimensional protein structures.
Collapse
Affiliation(s)
- Jacek Błazewicz
- Institute of Computing Science, Poznań University of Technology, Piotrowo 3a, 60-965 Poznań, Poland
| | | | | |
Collapse
|
20
|
Huang W, Lü Z, Shi H. Growth algorithm for finding low energy configurations of simple lattice proteins. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2005; 72:016704. [PMID: 16090131 DOI: 10.1103/physreve.72.016704] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/20/2004] [Indexed: 05/03/2023]
Abstract
PERM and its new variant nPERMis have been developed to optimize the energy function of protein folding based on HP simple lattice model and were found to outperform all other previous fully blind general purpose algorithms. Using the concept of core-guiding and life-forecasting, we propose a new version of nPERMis, called nPERMh. A major difference with respect to nPERMis is that criteria for further growth of new residue are based on the species of current growing monomer and its position in the HP sequence. Seventeen sequences of length ranging from 46 to 124 residues were tested by nPERMh on the cubic lattice and our algorithm proved very efficient. It should be pointed out that our new version of nPERMis is exclusively designed for conformational search. We hope that similar methods will ultimately be useful for finding native states of more realistic protein models.
Collapse
Affiliation(s)
- Wenqi Huang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei Province, 430074, China
| | | | | |
Collapse
|
21
|
Schiemann R, Bachmann M, Janke W. Exact sequence analysis for three-dimensional hydrophobic-polar lattice proteins. J Chem Phys 2005; 122:114705. [PMID: 15836241 DOI: 10.1063/1.1814941] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
We have exactly enumerated all sequences and conformations of hydrophobic-polar (HP) proteins with chains of up to 19 monomers on the simple cubic lattice. For two variants of the HP model, where only two types of monomers are distinguished, we determined and statistically analyzed designing sequences, i.e., sequences that have a nondegenerate ground state. Furthermore we were interested in characteristic thermodynamic properties of HP proteins with designing sequences. In order to be able to perform these exact studies, we applied an efficient enumeration method based on contact sets.
Collapse
|
22
|
An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem. BMC Bioinformatics 2005; 6:30. [PMID: 15710037 PMCID: PMC555464 DOI: 10.1186/1471-2105-6-30] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2004] [Accepted: 02/14/2005] [Indexed: 11/24/2022] Open
Abstract
Background The protein folding problem is a fundamental problems in computational molecular biology and biochemical physics. Various optimisation methods have been applied to formulations of the ab-initio folding problem that are based on reduced models of protein structure, including Monte Carlo methods, Evolutionary Algorithms, Tabu Search and hybrid approaches. In our work, we have introduced an ant colony optimisation (ACO) algorithm to address the non-deterministic polynomial-time hard (NP-hard) combinatorial problem of predicting a protein's conformation from its amino acid sequence under a widely studied, conceptually simple model – the 2-dimensional (2D) and 3-dimensional (3D) hydrophobic-polar (HP) model. Results We present an improvement of our previous ACO algorithm for the 2D HP model and its extension to the 3D HP model. We show that this new algorithm, dubbed ACO-HPPFP-3, performs better than previous state-of-the-art algorithms on sequences whose native conformations do not contain structural nuclei (parts of the native fold that predominantly consist of local interactions) at the ends, but rather in the middle of the sequence, and that it generally finds a more diverse set of native conformations. Conclusions The application of ACO to this bioinformatics problem compares favourably with specialised, state-of-the-art methods for the 2D and 3D HP protein folding problem; our empirical results indicate that our rather simple ACO algorithm scales worse with sequence length but usually finds a more diverse ensemble of native states. Therefore the development of ACO algorithms for more complex and realistic models of protein structure holds significant promise.
Collapse
|
23
|
Abstract
We calculate thermodynamic quantities of hydrophobic-polar (HP) lattice proteins by means of a multicanonical chain-growth algorithm that connects the new variants of the Pruned-Enriched Rosenbluth Method and flat histogram sampling of the entire energy space. Since our method directly simulates the density of states, we obtain results for thermodynamic quantities of the system for all temperatures. In particular, this algorithm enables us to accurately simulate the usually difficult accessible low-temperature region. Therefore, it becomes possible to perform detailed analyses of the low-temperature transition between ground states and compact globules.
Collapse
Affiliation(s)
- Michael Bachmann
- Institut fur Theoretische Physik, Universitat Leipzig, Augustusplatz 10/11, D-04109 Leipzig, Germany.
| | | |
Collapse
|
24
|
Bachmann M, Janke W. Multicanonical chain-growth algorithm. PHYSICAL REVIEW LETTERS 2003; 91:208105. [PMID: 14683403 DOI: 10.1103/physrevlett.91.208105] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2003] [Indexed: 05/24/2023]
Abstract
We present a temperature-independent Monte Carlo method for the determination of the density of states of lattice proteins that combines the fast ground-state search strategy of the new pruned-enriched Rosenbluth chain-growth method and multicanonical reweighting for sampling the complete energy space. Since the density of states contains all energetic information of a statistical system, we can directly calculate the mean energy, specific heat, Helmholtz free energy, and entropy for all temperatures. We apply this method to lattice proteins consisting of hydrophobic and polar monomers, and for the examples of sequences considered, we identify the transitions between native, globule, and random coil states. Since no special properties of heteropolymers are involved in this algorithm, the method applies to polymer models as well.
Collapse
Affiliation(s)
- Michael Bachmann
- Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, D-04109 Leipzig, Germany.
| | | |
Collapse
|
25
|
Hsu HP, Mehra V, Nadler W, Grassberger P. Growth-based optimization algorithm for lattice heteropolymers. ACTA ACUST UNITED AC 2003; 68:021113. [PMID: 14524959 DOI: 10.1103/physreve.68.021113] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2002] [Indexed: 11/07/2022]
Abstract
An improved version of the pruned-enriched-Rosenbluth method (PERM) is proposed and tested on finding lowest energy states in simple models of lattice heteropolymers. It is found to outperform not only the previous version of PERM, but also all other fully blind general purpose stochastic algorithms which have been employed on this problem. In many cases, it found new lowest energy states missed in previous papers. Limitations are discussed.
Collapse
Affiliation(s)
- Hsiao-Ping Hsu
- John-von-Neumann Institute for Computing, Forschungszentrum Jülich, D-52425 Jülich, Germany
| | | | | | | |
Collapse
|
26
|
Chou CI, Han RS, Li SP, Lee TK. Guided simulated annealing method for optimization problems. ACTA ACUST UNITED AC 2003; 67:066704. [PMID: 16241377 DOI: 10.1103/physreve.67.066704] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2003] [Revised: 03/28/2003] [Indexed: 11/07/2022]
Abstract
Incorporating the concept of order parameter of the mean-field theory into the simulated annealing method, we present an optimization algorithm, the guided simulated annealing method. In this method mean-field order parameters are calculated to guide the configuration search for the global minimum. Allowing fluctuations and improvement of mean-field values iteratively, this method successfully identifies global minima for several difficult optimization problems. Application of this method to the HP lattice-protein model has found another lowest-energy state for an N=100 sequence that was not found by other methods before. Results for spin glass models are also presented which show improvement over the previous results.
Collapse
Affiliation(s)
- C I Chou
- Institute of Physics, Academia Sinica, Taipei, Taiwan
| | | | | | | |
Collapse
|
27
|
Hsu HP, Mehra V, Nadler W, Grassberger P. Growth algorithms for lattice heteropolymers at low temperatures. J Chem Phys 2003. [DOI: 10.1063/1.1522710] [Citation(s) in RCA: 116] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
28
|
Zhang JL, Liu JS. A new sequential importance sampling method and its application to the two-dimensional hydrophobic–hydrophilic model. J Chem Phys 2002. [DOI: 10.1063/1.1494415] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
29
|
Toma L, Toma S. A lattice study of multimolecular ensembles of protein models. Effect of sequence on the final state: globules, aggregates, dimers, fibrillae. Biomacromolecules 2002; 1:232-8. [PMID: 11710105 DOI: 10.1021/bm005506o] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Three sequences of simple protein models on the two-dimensional square lattice have been chosen for a study of the behavior of different classes of proteins as a function of temperature and concentration using multimolecular systems under periodic boundary conditions. The results of the dynamic simulations have shown the profound influence that the intermolecular contacts can have on the accessibility of the states the various sequences can reach; this makes possible the formation of different kinds of structures under the control not only of sequence and temperature but also of concentration that can become the main driving force for the formation of ordered lamellar structures for sequences properly designed.
Collapse
Affiliation(s)
- L Toma
- Dipartimento di Chimica Organica, Università di Pavia, Via Taramelli 10, 27100 Pavia, Italy.
| | | |
Collapse
|
30
|
Lee LW, Wang JS. Flat histogram simulation of lattice polymer systems. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2001; 64:056112. [PMID: 11736019 DOI: 10.1103/physreve.64.056112] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/25/2001] [Indexed: 05/23/2023]
Abstract
We demonstrate the use of an algorithm called the Flat Histogram sampling algorithm for the simulation of two-dimensional lattice polymer systems. Thermodynamic properties, such as the average energy or entropy and other physical quantities such as the end-to-end distance or radius of gyration can be easily calculated using this method. The ground-state energy can also be determined. We also explore the accuracy and limitations of this method.
Collapse
Affiliation(s)
- L W Lee
- Department of Computational Science, National University of Singapore, Singapore 119260, Republic of Singapore
| | | |
Collapse
|
31
|
Abstract
Single amino acid repeats are found in different kinds of proteins. Some of these repeats are pathogenic. It is striking that some amino acids are able to form such repeats, but other amino acids are not. We suggest an explanation for this fact based on the different tendency of each amino acid to form aggregates. Aggregation may be due to the formation of incipient lamellar crystals as they have been described in poly-alpha-amino acids and in most synthetic polymers.
Collapse
Affiliation(s)
- J A Subirana
- Department d'Enginyeria Química, ETSEIB, Universitat Politècnica de Catalunya, Barcelona, Spain.
| | | |
Collapse
|
32
|
Toma L, Toma S. Folding simulation of protein models on the structure-based cubo-octahedral lattice with the Contact Interactions algorithm. Protein Sci 1999; 8:196-202. [PMID: 10210197 PMCID: PMC2144108 DOI: 10.1110/ps.8.1.196] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Computer simulations of protein models on lattices have been widely used as an aid in the study of protein folding process. Following the suggestion of Raghunathan and Jernigan (1997, Protein Sci. 6:2072-2083) that the cubooctahedral lattice can allow a more realistic representation of proteins than other lattices, we propose here the use of a new set of internal coordinates theta for the description of a protein model on this lattice. An easy procedure for the conversion of the theta coordinates to the Cartesian coordinates is also described. When the Contact Interaction approach, already proposed by us for simulations on square or cubic lattices, was applied to the cubo-octahedral lattice, the system obeyed the correct thermodynamics derived from the definition of energy. Thus, lattice simulations of protein models, in which secondary structure elements such as alpha-helices or beta-strands can be easily identifiable, can be performed.
Collapse
Affiliation(s)
- L Toma
- Dipartimento di Chimica Organica, Università di Pavia, Italy.
| | | |
Collapse
|
33
|
Toma L, Toma S, Subirana JA. Simulation of Polymer Crystallization through a Dynamic Monte Carlo Lattice Model. Macromolecules 1998. [DOI: 10.1021/ma971232u] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Lucio Toma
- Dipartimento di Chimica Organica, Università di Pavia, Via Taramelli 10, 27100 Pavia, Italy, Pharmacia & Upjohn, Via per Pogliano, 20014 Nerviano (MI), Italy, and Departament d'Enginyeria Química, Universitat Politècnica de Catalunya, Av. Diagonal 647, 08028 Barcelona, Spain
| | - Salvatore Toma
- Dipartimento di Chimica Organica, Università di Pavia, Via Taramelli 10, 27100 Pavia, Italy, Pharmacia & Upjohn, Via per Pogliano, 20014 Nerviano (MI), Italy, and Departament d'Enginyeria Química, Universitat Politècnica de Catalunya, Av. Diagonal 647, 08028 Barcelona, Spain
| | - Juan A. Subirana
- Dipartimento di Chimica Organica, Università di Pavia, Via Taramelli 10, 27100 Pavia, Italy, Pharmacia & Upjohn, Via per Pogliano, 20014 Nerviano (MI), Italy, and Departament d'Enginyeria Química, Universitat Politècnica de Catalunya, Av. Diagonal 647, 08028 Barcelona, Spain
| |
Collapse
|
34
|
Beutler TC, Dill KA. A fast conformational search strategy for finding low energy structures of model proteins. Protein Sci 1996; 5:2037-43. [PMID: 8897604 PMCID: PMC2143263 DOI: 10.1002/pro.5560051010] [Citation(s) in RCA: 50] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]
Abstract
We describe a new computer algorithm for finding low-energy conformations of proteins. It is a chain-growth method that uses a heuristic bias function to help assemble a hydrophobic core. We call it the Core-directed chain Growth method (CG). We test the CG method on several well-known literature examples of HP lattice model proteins [in which proteins are modeled as sequences of hydrophobic (H) and polar (P) monomers], ranging from 20-64 monomers in two dimensions, and up to 88-mers in three dimensions. Previous nonexhaustive methods--Monte Carlo, a Genetic Algorithm, Hydrophobic Zippers, and Contact Interactions--have been tried on these same model sequences. CG is substantially better at finding the global optima, and avoiding local optima, and it does so in comparable or shorter times. CG finds the global minimum energy of the longest HP lattice model chain for which the global optimum is known, a 3D 88-mer that has only been reachable before by the CHCC complete search method. CG has the potential advantage that it should have nonexponential scaling with chain length. We believe this is a promising method for conformational searching in protein folding algorithms.
Collapse
Affiliation(s)
- T C Beutler
- Department of Pharmaceutical Chemistry, University of California, San Francisco 94143-1204, USA
| | | |
Collapse
|