1
|
SSA: Subset sum approach to protein β-sheet structure prediction. Comput Biol Chem 2021; 94:107552. [PMID: 34390958 DOI: 10.1016/j.compbiolchem.2021.107552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2020] [Revised: 07/21/2021] [Accepted: 07/27/2021] [Indexed: 11/22/2022]
Abstract
The three-dimensional structures of proteins determine their functions, and incorrect folding of their β-strands can cause many diseases. There are two major approaches for determining protein structures: computational prediction and experimental methods that employ technologies such as cryo-electron microscopy. Because experimental methods are costly, slow, and often yield incomplete results, computational prediction is an attractive alternative. β-sheet structure prediction, the focus of the present paper, is a major part of overall protein structure prediction. Prediction of other substructures, such as α-helices, is simpler and has lower computational time complexity. Brute force methods are the most common approach, and dynamic programming is also used to generate all possible conformations. The current study introduces the Subset Sum Approach (SSA) for direct search space generation, which is shown to outperform the dynamic programming approach in terms of both time and space. For the first time, the present work calculates both the state space cardinality of the dynamic programming approach and the search space cardinality of the general brute force approaches. Under a set of pruning rules, SSA demonstrates higher efficiency in both time and accuracy than state-of-the-art methods.
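For readers unfamiliar with the problem that gives SSA its name, the classical subset sum problem can be sketched as below. This is only the textbook problem, under the assumption of non-negative integer inputs; the abstract does not say how β-sheet conformations are mapped onto it, so the function name `subset_with_sum` and the sample values are purely illustrative and are not the authors' method.

```python
from typing import Dict, List, Optional


def subset_with_sum(values: List[int], target: int) -> Optional[List[int]]:
    """Return one subset of `values` (non-negative ints) summing to `target`, or None."""
    # reachable[s] stores one subset whose elements add up to s.
    reachable: Dict[int, List[int]] = {0: []}
    for v in values:
        # Iterate over a snapshot so that each value is used at most once.
        for s, subset in list(reachable.items()):
            t = s + v
            if t <= target and t not in reachable:
                reachable[t] = subset + [v]
    return reachable.get(target)


# Hypothetical inputs, chosen only to exercise the function.
found = subset_with_sum([3, 4, 5, 2], 9)
missing = subset_with_sum([3, 4, 5], 10)
```

The table of reachable sums grows with the target value, which is why this formulation is pseudo-polynomial rather than polynomial in the input size.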
|
2
|
Dehghani T, Naghibzadeh M, Sadri J. Enhancement of Protein β-Sheet Topology Prediction Using Maximum Weight Disjoint Path Cover. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1936-1947. [PMID: 29994539 DOI: 10.1109/tcbb.2018.2837753] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
Predicting β-sheet topology (β-topology) is one of the most critical intermediate steps towards protein structure and function prediction. The β-topology prediction problem is defined as the determination of the optimal arrangement of β-strand interactions within protein β-sheets. Significant efforts have been made to predict β-topologies. However, due to the inaccurate determination of interactions among β-strands and the huge topological space of proteins with a large number of β-strands, more efficient methods are required to improve both the accuracy and speed of β-topology prediction. In order to attain higher accuracy, the current paper introduces a bidirectional strand-strand interaction graph and considers all possible orientations (parallel and antiparallel) and orders of β-strand pairwise interactions. For the first time, the β-topology prediction is transformed into a maximum weight disjoint path cover solution by conserving all potential topologies. Moreover, to manage the computation time, a set of candidate β-sheets is generated and an optimization process is applied to select a subset of maximum score disjoint β-sheets as a predicted β-topology. The proposed method is comprehensively compared with state-of-the-art methods. The experimental results on the BetaSheet916 and BetaSheet1452 datasets reveal that the current study's approach enhances performance measurements as well as reduces the runtime.
|
3
|
Dehghani T, Naghibzadeh M, Eghdami M. BetaDL: A protein beta-sheet predictor utilizing a deep learning model and independent set solution. Comput Biol Med 2019; 104:241-249. [PMID: 30530227 DOI: 10.1016/j.compbiomed.2018.11.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/27/2018] [Revised: 11/23/2018] [Accepted: 11/27/2018] [Indexed: 10/27/2022]
Abstract
Sequence-based predictions of beta-residue contacts and beta-sheet structures contain key information for protein structure prediction. However, the determination of beta-sheet structures poses numerous challenges due to long-range beta-residue interactions and the huge number of possible beta-sheet structures. The prediction of residue contacts with deep learning models has recently gained attention, and its results have improved protein structure prediction. In addition, to reduce the computational complexity of determining beta-sheet structures, it has been suggested that this problem be transformed into graph-based solutions. Consequently, the current work proposes BetaDL, a beta-sheet structure predictor that combines deep learning with a graph-based approach. BetaDL adopts deep learning models to capture beta-residue contacts and improve beta-sheet structure predictions. In addition, a graph-based approach is presented to model the beta-sheet conformational space, and a new score function is introduced to evaluate beta-sheets. Furthermore, the present study demonstrates that the beta-sheet structure can be predicted within an acceptable computational time by using a heuristic maximum weight independent set solution. When compared to state-of-the-art methods, experimental results on the BetaSheet916 and BetaSheet1452 datasets indicate that BetaDL improves the accuracy of beta-residue contact and beta-sheet structure prediction. Using BetaDL, beta-sheet structures are predicted with a 4% and 6% improvement in the F1-score at the residue and strand levels, respectively. BetaDL's source code and data are available at http://kerg.um.ac.ir/index.php/datasets/#BetaDL.
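To make the idea of a heuristic maximum weight independent set concrete, here is a minimal sketch of one common greedy heuristic. It assumes each candidate beta-sheet is a node with a score and that two candidates conflict when they cannot both appear in the same prediction. The function name `greedy_mwis`, the weight-to-degree ranking rule, and the sample data are assumptions for illustration; the abstract does not specify which heuristic BetaDL uses.

```python
from typing import Dict, Set


def greedy_mwis(weights: Dict[str, float], conflicts: Dict[str, Set[str]]) -> Set[str]:
    """Greedily pick mutually non-conflicting nodes, favouring high weight and few conflicts.

    `conflicts` must be symmetric: if b is in conflicts[a], then a is in conflicts[b].
    """
    remaining = set(weights)
    chosen: Set[str] = set()
    while remaining:
        # Rank by weight / (number of still-available conflicting nodes + 1).
        # Sorting first makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda n: weights[n] / (len(conflicts.get(n, set()) & remaining) + 1),
        )
        chosen.add(best)
        # Remove the chosen node and everything that conflicts with it.
        remaining -= conflicts.get(best, set()) | {best}
    return chosen


# Hypothetical candidates A-D with scores; conflicts form the chain A-B-C-D.
scores = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 1.0}
clashes = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
selected = greedy_mwis(scores, clashes)
```

A greedy rule like this runs quickly but carries no optimality guarantee in general, which is the usual trade-off accepted when an exact independent set search would be too slow.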
Affiliation(s)
- Toktam Dehghani
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahdie Eghdami
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
|
4
|
Sabzekar M, Naghibzadeh M, Eghdami M, Aydin Z. Protein β-sheet prediction using an efficient dynamic programming algorithm. Comput Biol Chem 2017; 70:142-155. [PMID: 28881217 DOI: 10.1016/j.compbiolchem.2017.08.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/05/2017] [Revised: 07/25/2017] [Accepted: 08/18/2017] [Indexed: 11/28/2022]
Abstract
Predicting the β-sheet structure of a protein is one of the most important intermediate steps towards the identification of its tertiary structure. However, it is regarded as the primary bottleneck due to the presence of non-local interactions between several discontinuous regions in β-sheets. To achieve reliable long-range interactions, a promising approach is to enumerate and rank all β-sheet conformations for a given protein and find the one with the highest score. The problem with this solution is that the search space grows exponentially with the number of β-strands. Additionally, brute-force calculation in this conformational space leads to a combinatorial explosion with intractable computational complexity. The main contribution of this paper is to generate and search the space of the problem efficiently to reduce the time complexity of the problem. To achieve this, two tree structures, called sheet-tree and grouping-tree, are proposed. They model the search space by breaking it into sub-problems. Then, an advanced dynamic programming algorithm is proposed that stores the intermediate results, avoids repeated calculation by reusing them efficiently in successive steps, and reduces the space of the problem by removing those intermediate results that will no longer be required in later steps. As a consequence, the following contributions have been made. Firstly, more accurate β-sheet structures are found by searching all possible conformations, and secondly, the time complexity of the problem is reduced by searching the space of the problem efficiently, which makes the proposed method applicable to predicting β-sheet structures with a high number of β-strands.
Experimental results on the BetaSheet916 dataset showed significant improvements of the proposed method in both execution time and prediction accuracy in comparison with the state-of-the-art β-sheet structure prediction methods. Moreover, we investigate the effect of different contact map predictors on the performance of the proposed method using the BetaSheet1452 dataset. The source code is available at http://www.conceptsgate.com/BetaTop.rar.
Affiliation(s)
- Mostafa Sabzekar
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahdie Eghdami
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Zafer Aydin
- Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey
|
5
|
Sabzekar M, Naghibzadeh M, Sadri J. Efficient dynamic programming algorithm with prior knowledge for protein β-strand alignment. J Theor Biol 2017; 417:43-50. [PMID: 28108305 DOI: 10.1016/j.jtbi.2017.01.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/17/2016] [Revised: 11/11/2016] [Accepted: 01/12/2017] [Indexed: 11/30/2022]
Abstract
One of the main tasks towards the prediction of protein β-sheet structure is to predict the native alignment of β-strands. The alignment of two β-strands defines similar regions that may reflect functional, structural, or evolutionary relationships between them. Therefore, any improvement in β-strand alignment not only reduces the computational search space but also improves β-sheet structure prediction accuracy. To define the alignment scores, previous studies utilized predicted residue-residue contacts (contact maps). However, two serious problems arise in using them. First, the precision of contact map prediction techniques, especially for long-range contacts (i.e., β-residues), is still not satisfactory. Second, the residue-residue contact predictors usually utilize general properties of amino acids and disregard the structural features of β-residues. In this paper, we consider β-structure information, which is estimated from protein β-sheet data sets, as alignment scores. The predicted contact maps are nevertheless used as prior knowledge about residues: they strengthen or weaken the alignment scores in our algorithm. Thus, we can utilize both β-residue and β-structure information in the alignment of β-strands. The structure of the dynamic programming in the alignment algorithm is changed in order to work with our prior knowledge. Moreover, the Four Russians method is applied to the proposed alignment algorithm in order to reduce the time complexity of the problem. To evaluate the proposed method, we applied it to the state-of-the-art β-sheet structure prediction methods. The experimental results on the BetaSheet916 data set showed significant improvements in the execution time, the accuracy of β-strand alignment, and consequently β-sheet structure prediction accuracy. The results are available at http://conceptsgate.com/BetaSheet.
Affiliation(s)
- Mostafa Sabzekar
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Javad Sadri
- Department of Computer Science & Software Engineering, Concordia University, Canada
|
6
|
Mahato DR, Fischer WB. Weak Selectivity Predicted for Modeled Bundles of Viral Channel-Forming Protein E5 of Human Papillomavirus-16. J Phys Chem B 2016; 120:13076-13085. [PMID: 27976908 DOI: 10.1021/acs.jpcb.6b10050] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Abstract
Protein E5 is a polytopic 83 amino acid membrane protein with three transmembrane domains (TMDs), encoded by high-risk human papillomavirus-16 (HPV-16). HPV-16 is found to be the causative agent for cervical cancer. Protein E5, among other proteins (e.g., E6, E7), is expressed at an "early" (E) stage when the cell turns malignant. It has been experimentally found that E5 forms hexameric assemblies, which show the characteristics of the class of so-called channel-forming proteins by rendering lipid membranes permeable to ions and small molecules. Structural models of protein E5 in assembled bundles are generated using a force field-based docking approach. Extended molecular dynamics simulations of selected bundles in fully hydrated lipid bilayers suggest the second TMD to be pore-lining, allowing for water columns to exist within the lumen of the pore. Full correlation analysis indicates asymmetric dynamics within the monomers of the bundle. Potential of mean force calculations on a snapshot structure of the putative open pore of the protein bundle suggest low selectivity.
Affiliation(s)
- Dhani Ram Mahato
- Institute of Biophotonics and Biophotonics & Molecular Imaging Research Center (BMIRC), School of Biomedical Science and Engineering, National Yang-Ming University, Taipei 112, Taiwan
- Wolfgang B Fischer
- Institute of Biophotonics and Biophotonics & Molecular Imaging Research Center (BMIRC), School of Biomedical Science and Engineering, National Yang-Ming University, Taipei 112, Taiwan
|
7
|
Kieslich CA, Smadbeck J, Khoury GA, Floudas CA. conSSert: Consensus SVM Model for Accurate Prediction of Ordered Secondary Structure. J Chem Inf Model 2016; 56:455-61. [DOI: 10.1021/acs.jcim.5b00566] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 11/28/2022]
Affiliation(s)
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- George A. Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
|
8
|
Joo H, Chavan AG, Fraga KJ, Tsai J. An amino acid code for irregular and mixed protein packing. Proteins 2015; 83:2147-61. [PMID: 26370334 DOI: 10.1002/prot.24929] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/04/2015] [Revised: 09/01/2015] [Accepted: 09/02/2015] [Indexed: 11/10/2022]
Abstract
To advance our understanding of protein tertiary structure, the development of the knob-socket model is completed with an analysis of packing within irregular coil and turn secondary structure as well as between mixed secondary structures. The knob-socket model simplifies packing based on repeated patterns of two motifs: a three-residue socket for packing within secondary (2°) structure and a four-residue knob-socket for tertiary (3°) packing. For coil and turn secondary structure, knob-sockets allow identification of a correlation between amino acid composition and tertiary arrangements in space. Coil contributes almost as much as α-helices to tertiary packing. In irregular sockets, Gly, Pro, Asp, and Ser are favored, while in irregular knobs, the preference order is Arg, Asp, Pro, Asn, Thr, Leu, and Gly. Cys, His, Met, and Trp are not favored in either. In mixed packing, the knob amino acid preferences are a function of the socket that they are packing into, whereas the amino acid composition of the sockets does not depend on the secondary structure of the knob. A unique motif of a coil knob with an XYZ β-sheet socket may potentially function to inhibit β-sheet extension. In addition, analysis of the preferred crossing angles for strands within a β-sheet and in mixed α-helix/β-sheet packing identifies canonical packing patterns useful in protein design. Lastly, the knob-socket model abstracts the complexity of protein tertiary structure into an intuitive packing surface topology map.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Archana G Chavan
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Keith J Fraga
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
- Jerry Tsai
- Department of Chemistry, University of the Pacific, Stockton, California, 95211
|
9
|
Krupa P, Mozolewska MA, Joo K, Lee J, Czaplewski C, Liwo A. Prediction of Protein Structure by Template-Based Modeling Combined with the UNRES Force Field. J Chem Inf Model 2015; 55:1271-81. [DOI: 10.1021/acs.jcim.5b00117] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/28/2022]
Affiliation(s)
- Paweł Krupa
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Cezary Czaplewski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
|
10
|
An improved capping unit for stabilizing the ends of associated β-strands. FEBS Lett 2014; 588:4749-53. [PMID: 25451230 DOI: 10.1016/j.febslet.2014.11.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/26/2014] [Revised: 10/14/2014] [Accepted: 11/06/2014] [Indexed: 12/25/2022]
Abstract
Understanding protein beta structures has been hindered by the challenge of designing small, well-folded β-sheet systems. A β-capping motif was previously designed to help solve this problem, but not without limitations, as the termini of this β-cap were not fully available for chain extension. By combining Coulombic side chain attractions with a Trp/Trp edge-to-face interaction, we produced a new capping motif that provided greater β-sheet stability. This stability was maintained even in systems lacking a turn locus with a high propensity for chain direction reversal. The Coulombic cap was shown to improve β-sheet stability in a number of difficult systems, hence providing an additional tool for protein structure and folding studies.
|
11
|
Khoury GA, Liwo A, Khatib F, Zhou H, Chopra G, Bacardit J, Bortot LO, Faccioli RA, Deng X, He Y, Krupa P, Li J, Mozolewska MA, Sieradzan AK, Smadbeck J, Wirecki T, Cooper S, Flatten J, Xu K, Baker D, Cheng J, Delbem ACB, Floudas CA, Keasar C, Levitt M, Popović Z, Scheraga HA, Skolnick J, Crivelli SN, Players F. WeFold: a coopetition for protein structure prediction. Proteins 2014; 82:1850-68. [PMID: 24677212 PMCID: PMC4249725 DOI: 10.1002/prot.24538] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 08/23/2013] [Revised: 01/25/2014] [Accepted: 02/08/2014] [Indexed: 12/19/2022]
Abstract
The protein structure prediction problem continues to elude scientists. Despite the introduction of many methods, only modest gains were made over the last decade for certain classes of prediction targets. To address this challenge, a social-media based worldwide collaborative effort, named WeFold, was undertaken by 13 labs. During the collaboration, the laboratories were simultaneously competing with each other. Here, we present the first attempt at "coopetition" in scientific research applied to the protein structure prediction and refinement problems. The coopetition was made possible by allowing the participating labs to contribute different components of their protein structure prediction pipelines and create new hybrid pipelines that they tested during CASP10. This manuscript describes both successes and areas needing improvement as identified throughout the first WeFold experiment and discusses the efforts that are underway to advance this initiative. A footprint of all contributions and structures is publicly accessible at http://www.wefold.org.
Affiliation(s)
- George A. Khoury
- Department of Chemical and Biological Engineering, Princeton University, USA
- Adam Liwo
- Faculty of Chemistry, University of Gdansk, Poland
- Firas Khatib
- Department of Biochemistry, University of Washington, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
- Gaurav Chopra
- Department of Structural Biology, School of Medicine, Stanford University, USA
- Diabetes Center, School of Medicine, University of California San Francisco (UCSF), USA
- Jaume Bacardit
- School of Computing Science, Newcastle University, United Kingdom
- Leandro O. Bortot
- Laboratory of Biological Physics, Faculty of Pharmaceutical Sciences at Ribeirão Preto, University of São Paulo, Brazil
- Rodrigo A. Faccioli
- Institute of Mathematical and Computer Sciences, University of São Paulo, Brazil
- Xin Deng
- Department of Computer Science, University of Missouri, USA
- Yi He
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Pawel Krupa
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jilong Li
- Department of Computer Science, University of Missouri, USA
- Magdalena A. Mozolewska
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, USA
- Tomasz Wirecki
- Faculty of Chemistry, University of Gdansk, Poland
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Seth Cooper
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Jeff Flatten
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Kefan Xu
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- David Baker
- Department of Biochemistry, University of Washington, USA
- Jianlin Cheng
- Department of Computer Science, University of Missouri, USA
- Chen Keasar
- Departments of Computer Science and Life Sciences, Ben Gurion University of the Negev, Israel
- Michael Levitt
- Department of Structural Biology, School of Medicine, Stanford University, USA
- Zoran Popović
- Center for Game Science, Department of Computer Science & Engineering, University of Washington, USA
- Harold A. Scheraga
- Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, USA
|
12
|
Joo H, Tsai J. An amino acid code for β-sheet packing structure. Proteins 2014; 82:2128-40. [PMID: 24668690 DOI: 10.1002/prot.24569] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 09/13/2013] [Revised: 03/17/2014] [Accepted: 03/19/2014] [Indexed: 11/09/2022]
Abstract
To understand the relationship between protein sequence and structure, this work extends the knob-socket model in an investigation of β-sheet packing. Over a comprehensive set of β-sheet folds, the contacts between residues were used to identify packing cliques: sets of residues that all contact each other. These packing cliques were then classified based on size and contact order. From this analysis, the two types of four-residue packing cliques necessary to describe β-sheet packing were characterized. Both occur between two adjacent hydrogen-bonded β-strands. First, defining the secondary structure packing within β-sheets, the combined socket or XY:HG pocket consists of four residues: i, i+2 on one strand and j, j+2 on the other. Second, characterizing the tertiary packing between β-sheets, the knob-socket XY:H+B consists of a three-residue XY:H socket (i, i+2 on one strand and j on the other) packed against a knob B residue (residue k distant in sequence). Depending on the packing depth of the knob B residue, two types of knob-sockets are found: side-chain and main-chain sockets. The amino acid composition of the pockets and knob-sockets reveals the sequence specificity of β-sheet packing. For β-sheet formation, the XY:HG pocket clearly shows sequence specificity of amino acids. For tertiary packing, the XY:H+B side-chain and main-chain sockets exhibit distinct amino acid preferences at each position. These relationships define an amino acid code for β-sheet structure and provide an intuitive topological mapping of β-sheet packing.
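The notion of a packing clique (a set of residues that all contact each other) can be illustrated with a short sketch that enumerates maximal cliques in a residue contact graph. The Bron-Kerbosch recursion used here is a standard graph algorithm chosen for illustration; the abstract does not state how the authors computed their cliques, and the function name `maximal_cliques` and the residue numbers are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def maximal_cliques(contacts: Iterable[Tuple[int, int]]) -> List[FrozenSet[int]]:
    """Enumerate maximal cliques of an undirected contact graph (basic Bron-Kerbosch)."""
    adj: Dict[int, Set[int]] = {}
    for a, b in contacts:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    found: List[FrozenSet[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates that can extend it, x: already explored.
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Hypothetical contacts: residues 10 and 12 (i, i+2) on one strand and 30 and 32
# (j, j+2) on the neighbouring strand all touch each other; 32 also touches 50.
example_contacts = [(10, 12), (10, 30), (10, 32), (12, 30), (12, 32), (30, 32), (32, 50)]
cliques = maximal_cliques(example_contacts)
```

In this toy input the four-residue clique has the i, i+2 / j, j+2 layout that the abstract describes for the XY:HG pocket, while the pair containing residue 50 is a separate two-residue clique.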
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, 95212
|
13
|
Khoury GA, Smadbeck J, Kieslich CA, Floudas CA. Protein folding and de novo protein design for biotechnological applications. Trends Biotechnol 2013; 32:99-109. [PMID: 24268901 DOI: 10.1016/j.tibtech.2013.10.008] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Received: 09/05/2013] [Revised: 10/10/2013] [Accepted: 10/18/2013] [Indexed: 11/19/2022]
Abstract
In the postgenomic era, the medical/biological fields are advancing faster than ever. However, before the power of full-genome sequencing can be fully realized, the connection between amino acid sequence and protein structure, known as the protein folding problem, needs to be elucidated. The protein folding problem remains elusive, with significant difficulties still arising when modeling amino acid sequences lacking an identifiable template. Understanding protein folding will allow for unforeseen advances in protein design, often referred to as the inverse protein folding problem. Despite challenges in protein folding, de novo protein design has recently demonstrated significant success via computational techniques. We review advances and challenges in protein structure prediction and de novo protein design, and highlight their interplay in successful biotechnological applications.
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Chris A Kieslich
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
- Christodoulos A Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544, USA
|
14
|
Smadbeck J, Peterson MB, Khoury GA, Taylor MS, Floudas CA. Protein WISDOM: a workbench for in silico de novo design of biomolecules. J Vis Exp 2013. [PMID: 23912941 DOI: 10.3791/50476] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 10/31/2022] Open
Abstract
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity. To disseminate these methods for broader use, we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Affiliation(s)
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, USA
|
15
|
Guilloux A, Caudron B, Jestin JL. A method to predict edge strands in beta-sheets from protein sequences. Comput Struct Biotechnol J 2013; 7:e201305001. [PMID: 24688737 PMCID: PMC3962219 DOI: 10.5936/csbj.201305001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/22/2013] [Revised: 05/27/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022] Open
Abstract
There is a need for rules allowing three-dimensional structure information to be derived from protein sequences. In this work, consideration of an elementary protein folding step allows protein sub-sequences which optimize folding to be derived for any given protein sequence. Classical mechanics applied to this system and the energy conservation law during the elementary folding step yields an equation whose solutions are taken over the field of rational numbers. This formalism is applied to beta-sheets containing two edge strands and at least two central strands. The number of protein sub-sequences optimized for folding per amino acid in beta-strands is shown in particular to predict edge strands from protein sequences. Topological information on beta-strands and loops connecting them is derived for protein sequences with a prediction accuracy of 75%. The statistical significance of the finding is given. Applications in protein structure prediction are envisioned such as for the quality assessment of protein structure models.
Affiliation(s)
- Antonin Guilloux
- Analyse algébrique, Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie, Paris VI, France
- Bernard Caudron
- Centre d'Informatique pour la Biologie, Institut Pasteur, Paris, France
|
16
|
Caudron B, Jestin J. Sequence criteria for the anti-parallel character of protein beta-strands. J Theor Biol 2012; 315:146-9. [DOI: 10.1016/j.jtbi.2012.09.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/12/2012] [Revised: 09/10/2012] [Accepted: 09/12/2012] [Indexed: 12/17/2022]
|