1
Abdelhalim MB, Mabrouk MS, Sayed AY. HPS_PSP: High Performance System for Protein Structure Prediction. J Biol Syst 2019. [DOI: 10.1142/s0218339019500190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Predicting the lowest-energy conformation of a protein from its primary structure (chain of amino acids) is an optimization problem associated with a large, complex energy landscape. In this study, a simple 2D hydrophobic–hydrophilic (HP) model was used to model the protein sequence, which allows the fast and efficient design of a genetic-algorithm-based protein structure prediction approach. A neighborhood search strategy is integrated into the genetic operator and guides it toward regions of the search space that contain good solutions. To prevent convergence to local optima, the proposed method employs a crowding-based parent replacement strategy, which improves the performance of the algorithm and its ability to maintain multiple solutions. The proposed algorithm was tested on a standard benchmark of HP sequences, and comparative results demonstrate that the proposed system beats most of the evolutionary algorithms for seven sequences. It finds the best energy for sequences of length [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text] and [Formula: see text].
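As a rough illustration of the objective such an algorithm minimizes, the sketch below scores a conformation under the usual 2D HP convention: each pair of H residues that sit on adjacent lattice sites without being consecutive in the chain contributes -1. The move encoding (`U`, `D`, `L`, `R`) and the function names are assumptions made for this example; they are not taken from the paper.

```python
# Illustrative sketch only: the move alphabet and helper names are hypothetical.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Place residues on the square lattice by following absolute moves."""
    x, y = 0, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_energy(sequence, moves):
    """Return -(number of H-H contacts), or None if the walk overlaps itself."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold(moves)
    if len(set(coords)) != len(coords):
        return None  # not self-avoiding, so not a valid conformation
    index_at = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# A four-residue chain folded into a unit square: residues 0 and 3 touch.
square_energy = hp_energy("HPPH", "RUL")
```

A genetic algorithm of the kind described would use a function like `hp_energy` as its fitness measure, typically rejecting or repairing conformations for which it returns `None`.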
Affiliation(s)
- MOHAMED B. ABDELHALIM
  - College of Computing and Information Technology (CCIT), Arab Academy for Science, Technology and Maritime Transport (AASTMT), Cairo, Egypt
- MAI S. MABROUK
  - Biomedical Engineering Department, Misr University for Science and Technology, 6 October City, Giza, Egypt
- AHMED Y. SAYED
  - Physics and Engineering Mathematics Department, Faculty of Engineering at Mataria, Helwan University, Cairo, Egypt
2
Dubey SP, Balaji S, Kini NG, Sathish Kumar M. A Novel Framework for Ab Initio Coarse Protein Structure Prediction. Adv Bioinformatics 2018; 2018:7607384. [PMID: 30026759 PMCID: PMC6031167 DOI: 10.1155/2018/7607384] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/18/2018] [Revised: 04/26/2018] [Accepted: 05/27/2018] [Indexed: 02/07/2023]
Abstract
The Hydrophobic-Polar (HP) model is a simplified representation of the Protein Structure Prediction (PSP) problem. However, even with the HP model, the PSP problem remains NP-complete. This work proposes a systematic, problem-specific design for the operators of an evolutionary program hybridized with hill-climbing local search, to efficiently explore the search space of PSP and thereby obtain an optimum conformation. The proposed algorithm achieves this by incorporating the following novel features: (i) a new initialization method that generates only valid individuals with better (rather than random) fitness values; (ii) probability-based selection operators that limit local convergence; (iii) a secondary-structure-based mutation operator that brings the structure closer to the laboratory-determined structure; and (iv) integration of all the above features into a complete two-tier framework. The developed framework builds the protein conformation on the square and triangular lattices. Tests were performed using benchmark sequences, and a comparative evaluation was carried out against various state-of-the-art algorithms. In addition to hypothetical test sequences, protein sequences deposited in a protein database repository were tested. The proposed framework showed superior performance in accuracy (fitness value) and speed (number of generations needed to attain the final conformation). The concepts used to enhance the performance are generic and can be used with any other population-based search algorithm, such as genetic algorithms, ant colony optimization, and immune algorithms.
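To make the idea of an initialization that "generates only valid individuals" concrete, here is a minimal sketch that grows self-avoiding walks on the square lattice and restarts whenever the chain traps itself. It covers validity only; the fitness-biased part of the authors' method is not described in the abstract and is not reproduced here. All names and parameters (`random_valid_conformation`, `max_restarts`, `seed`) are assumptions for illustration.

```python
import random

# Illustrative sketch only: this is not the authors' initialization procedure.
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_valid_conformation(n, rng, max_restarts=1000):
    """Grow a self-avoiding walk of n residues; restart on a dead end."""
    for _ in range(max_restarts):
        coords = [(0, 0)]
        occupied = {(0, 0)}
        while len(coords) < n:
            x, y = coords[-1]
            free = [(x + dx, y + dy) for dx, dy in STEPS
                    if (x + dx, y + dy) not in occupied]
            if not free:
                break  # the chain has trapped itself, start over
            nxt = rng.choice(free)
            coords.append(nxt)
            occupied.add(nxt)
        if len(coords) == n:
            return coords
    raise RuntimeError("no valid conformation found within max_restarts")


def initial_population(n, size, seed=0):
    """Build a population in which every individual is a valid conformation."""
    rng = random.Random(seed)
    return [random_valid_conformation(n, rng) for _ in range(size)]


population = initial_population(n=20, size=5)
```

Because every individual is valid by construction, no fitness evaluations are spent on overlapping chains in the first generation.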
Affiliation(s)
- Sandhya Parasnath Dubey
  - Department of Computer Science & Eng., Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- S. Balaji
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- N. Gopalakrishna Kini
  - Department of Computer Science & Eng., Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- M. Sathish Kumar
  - Department of ECE, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
3
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Michaela Fooksa
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
  - Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
- Sten Heinze
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
4
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A. Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics. PLoS Comput Biol 2016; 12:e1004619. [PMID: 27124275 PMCID: PMC4849799 DOI: 10.1371/journal.pcbi.1004619] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Indexed: 01/08/2023]
Abstract
Investigation of macromolecular structure and dynamics is fundamental to understanding how macromolecules carry out their functions in the cell. Significant advances have been made toward this end in silico, with a growing number of computational methods proposed yearly to study and simulate various aspects of macromolecular structure and dynamics. This review aims to provide an overview of recent advances, focusing primarily on methods proposed for exploring the structure space of macromolecules in isolation and in assemblies for the purpose of characterizing equilibrium structure and dynamics. In addition to surveying recent applications that showcase current capabilities of computational methods, this review highlights state-of-the-art algorithmic techniques proposed to overcome challenges posed in silico by the disparate spatial and time scales accessed by dynamic macromolecules. This review is not meant to be exhaustive, as such an endeavor is impossible, but rather aims to balance breadth and depth of strategies for modeling macromolecular structure and dynamics for a broad audience of novices and experts.
Affiliation(s)
- Tatiana Maximova
  - Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Ryan Moffatt
  - Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Buyong Ma
  - Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Ruth Nussinov
  - Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Amarda Shehu
  - Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
  - Department of Bioengineering, George Mason University, Fairfax, Virginia, United States of America
  - School of Systems Biology, George Mason University, Manassas, Virginia, United States of America
5
Khor BY, Tye GJ, Lim TS, Choong YS. General overview on structure prediction of twilight-zone proteins. Theor Biol Med Model 2015; 12:15. [PMID: 26338054 PMCID: PMC4559291 DOI: 10.1186/s12976-015-0014-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 06/08/2015] [Accepted: 08/27/2015] [Indexed: 01/02/2023]
Abstract
Protein structure prediction from amino acid sequence has been one of the most challenging aspects of computational structural biology, despite the significant progress in recent years shown by the critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, predicted structures may serve as starting points to study a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when the target protein has low sequence similarity to available templates (a so-called twilight-zone protein, with sequence identity below 30%), structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins are predicted via threading or ab initio methods. Current trends indicate that combining different methods improves success in predicting twilight-zone proteins. This mini review discusses the methods, progress and challenges in the prediction of twilight-zone proteins.
Affiliation(s)
- Bee Yin Khor
  - Institute for Research in Molecular Medicine, Universiti Sains Malaysia, 11800, Minden, Penang, Malaysia.
- Gee Jun Tye
  - Institute for Research in Molecular Medicine, Universiti Sains Malaysia, 11800, Minden, Penang, Malaysia.
- Theam Soon Lim
  - Institute for Research in Molecular Medicine, Universiti Sains Malaysia, 11800, Minden, Penang, Malaysia.
- Yee Siew Choong
  - Institute for Research in Molecular Medicine, Universiti Sains Malaysia, 11800, Minden, Penang, Malaysia.
6
Guyeux C, Nicod JM, Philippe L, Bahi JM. The study of unfoldable self-avoiding walks — Application to protein structure prediction software. J Bioinform Comput Biol 2015; 13:1550009. [DOI: 10.1142/s0219720015500092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Self-avoiding walks (SAWs) are the source of very difficult problems in probability and enumerative combinatorics. They are of great interest because, for example, they are the basis of protein structure prediction (PSP) in bioinformatics. The authors of this paper have previously shown that, depending on the prediction algorithm, the sets of obtained walk conformations differ: for example, all the SAWs can be generated using stretching-based algorithms, whereas only the unfoldable SAWs can be obtained with methods that iteratively fold the straight line. A deeper study of (non-)unfoldable SAWs is presented in this paper. The contribution is first a survey of what is currently known about these sets. In particular, we provide clear definitions of various subsets of SAWs related to pivot moves (unfoldable and non-unfoldable SAWs, etc.) and the first results that we have obtained, theoretically or computationally, on these sets. Then a new theorem on the number of non-unfoldable SAWs is demonstrated. Finally, a list of open questions is provided and the consequences for the PSP problem are discussed.
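The notion of a pivot move can be sketched as follows, assuming the common square-lattice formulation: pick a residue as pivot, rotate the tail of the chain about it by a multiple of 90 degrees, and accept the result only if the walk stays self-avoiding. Walks reachable from the straight line through a sequence of such accepted moves correspond to what the abstract calls unfoldable SAWs. The function names and the exact move set are assumptions for this example and may differ from the paper's formal definitions.

```python
# Illustrative sketch only: helper names and conventions are hypothetical.

def rotate(point, pivot, quarter_turns):
    """Rotate point about pivot by quarter_turns * 90 degrees counterclockwise."""
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    for _ in range(quarter_turns % 4):
        dx, dy = -dy, dx
    return (pivot[0] + dx, pivot[1] + dy)


def pivot_move(coords, k, quarter_turns):
    """Rotate the residues after index k about residue k.

    Returns the new walk, or None if the rotated tail collides with the
    fixed part of the chain (the move is then rejected).
    """
    pivot = coords[k]
    new = coords[:k + 1] + [rotate(p, pivot, quarter_turns)
                            for p in coords[k + 1:]]
    if len(set(new)) != len(new):
        return None
    return new


# Fold a straight four-residue chain into a unit square in two pivot moves.
straight = [(i, 0) for i in range(4)]
bent = pivot_move(straight, 1, 1)
square = pivot_move(bent, 2, 1)
```

Rotating the tail by 180 degrees about residue 1 of the straight chain would place residue 2 on top of residue 0, so `pivot_move(straight, 1, 2)` returns `None`.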
Affiliation(s)
- Christophe Guyeux
  - FEMTO-ST Institute, Université de Franche-Comté/CNRS/ENSMM/UTBM, Besançon, France
- Jean-Marc Nicod
  - FEMTO-ST Institute, Université de Franche-Comté/CNRS/ENSMM/UTBM, Besançon, France
- Laurent Philippe
  - FEMTO-ST Institute, Université de Franche-Comté/CNRS/ENSMM/UTBM, Besançon, France
- Jacques M. Bahi
  - FEMTO-ST Institute, Université de Franche-Comté/CNRS/ENSMM/UTBM, Besançon, France
7
Shehu A. A Review of Evolutionary Algorithms for Computing Functional Conformations of Protein Molecules. Methods in Pharmacology and Toxicology 2015. [DOI: 10.1007/7653_2015_47] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/29/2022]
8
Guyeux C, Côté NML, Bahi JM, Bienia W. Is Protein Folding Problem Really a NP-Complete One? First Investigations. J Bioinform Comput Biol 2014; 12:1350017. [DOI: 10.1142/s0219720013500170] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
Determining the 3D conformation of proteins is necessary to understand their functions and their interactions with other molecules. It is commonly accepted that, when proteins fold from their primary linear structures to their final 3D conformations, they tend to choose the ones that minimize their free energy. To find the 3D conformation of a protein from its amino acid sequence, bioinformaticians use models of different resolutions and artificial intelligence tools, because the protein folding prediction problem is an NP-complete one. More precisely, determining the backbone structure of the protein with the low-resolution models (2D HP square and 3D HP cubic), by finding the conformation that minimizes free energy, is intractable to solve exactly. Both the proofs of NP-completeness and the 2D prediction methods assume that acceptable conformations satisfy a self-avoiding walk (SAW) requirement, as two different amino acids cannot occupy the same position in the lattice. This document shows that the SAW requirement considered when proving NP-completeness differs from the SAW requirement used in various prediction programs, and that both differ from the real biological requirement. Indeed, the proof of NP-completeness and the in silico predictions consider conformations that are not possible in practice. The consequences of this fact are investigated in this research work.
Affiliation(s)
- CHRISTOPHE GUYEUX
  - FEMTO-ST Institute, UMR 6174 CNRS, University of Franche-Comté, Besançon, France
- NATHALIE M.-L. CÔTÉ
  - Laboratoire de Biologie du Développement, UMR 7622, Université Pierre et Marie Curie, Paris, France
- JACQUES M. BAHI
  - FEMTO-ST Institute, UMR 6174 CNRS, University of Franche-Comté, Besançon, France
- WOJCIECH BIENIA
  - G-SCOP Laboratory, ENSIMAG, 46 Avenue Félix Viallet, F-38031 Grenoble Cedex 1, France
9
Venkatesan A, Gopal J, Candavelou M, Gollapalli S, Karthikeyan K. Computational approach for protein structure prediction. Healthc Inform Res 2013; 19:137-47. [PMID: 23882419 PMCID: PMC3717437 DOI: 10.4258/hir.2013.19.2.137] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/21/2012] [Revised: 03/30/2013] [Accepted: 04/01/2013] [Indexed: 11/23/2022]
Abstract
Objectives: To predict the structure of a protein, which dictates the function it performs, a new algorithm is developed that blends the concept of self-organization with the genetic algorithm.
Methods: Among many other approaches, the genetic algorithm is found to be a promising cooperative computational method for solving protein structure prediction in a reasonable time. To automate the choice of parameter values, self-organization is used to design a new genetic operator that optimizes the prediction process. Torsion angles, the local structural parameters that define the protein backbone, are used to encode the chromosome, which enhances the quality of the conformation. Newly designed self-configured genetic operators are used to develop a self-organizing genetic algorithm that facilitates accurate structure prediction.
Results: Peptides are used to gauge the validity of the proposed algorithm. The predicted structures show clear improvements in root mean square deviation when superimposed on the native structure, which indicates the overall performance of the algorithm. In addition, the Ramachandran plot results imply that the phi-psi angle conformations in the predicted structure compare favorably with the native structure and are free from steric hindrance.
Conclusions: The proposed algorithm is promising and contributes to the prediction of a native-like structure by removing the time and effort constraints. In addition, the energy of the predicted structure is minimized to a greater extent, which supports the stability of the protein.
Affiliation(s)
- Amouda Venkatesan
- Centre for Bioinformatics, Pondicherry University, Kalapet, Pondicherry, India
10
|
|