1
Das D, Ainavarapu SRK. Protein engineering using circular permutation - structure, function, stability, and applications. FEBS J 2024. [PMID: 38676939 DOI: 10.1111/febs.17146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 03/13/2024] [Accepted: 04/12/2024] [Indexed: 04/29/2024]
Abstract
Protein engineering is important for creating novel variants from natural proteins, enabling a wide range of applications. Approaches such as rational design and directed evolution are routinely used to make new protein variants. Computational tools like de novo design can introduce new protein folds. Expanding the amino acid repertoire to include unnatural amino acids with non-canonical side chains in vitro by native chemical ligation and in vivo via codon expansion methods broadens sequence and structural possibilities. Circular permutation (CP) is an invaluable approach to redesigning a protein by rearranging the amino acid sequence, where the connectivity of the secondary structural elements is altered without changing the overall structure of the protein. Artificial CP proteins (CPs) are employed in various applications such as biocatalysis, sensing of small molecules by fluorescence, genome editing, ligand-binding protein switches, and optogenetic engineering. Many studies have shown that CP can lead to either reduced or enhanced stability or catalytic efficiency. The effects of CP on a protein's energy landscape cannot be predicted a priori. Thus, it is important to understand how CP can affect the thermodynamic and kinetic stability of a protein. In this review, we discuss the discovery and advancement of techniques to create protein CP, and existing reviews on CP. We delve into the plethora of biological applications for designed CP proteins. We subsequently discuss the experimental and computational reports on the effects of CP on the thermodynamic and kinetic stabilities of proteins of various topologies. An understanding of the various aspects of CP will allow the reader to design robust CP proteins for their specific purposes.
Affiliation(s)
- Debanjana Das
- Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, India
2
Puri S, Liu CY, Hu IC, Lai CH, Hsu STD, Lyu PC. Elucidation of the folding pathway of a circular permutant of topologically knotted YbeA by tryptophan substitutions. Biochem Biophys Res Commun 2023; 672:81-88. [PMID: 37343318 DOI: 10.1016/j.bbrc.2023.06.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 06/05/2023] [Accepted: 06/06/2023] [Indexed: 06/23/2023]
Abstract
CP74 is an engineered circular permutant of a deep trefoil knotted SpoU-TrmD (SPOUT) RNA methyltransferase protein YbeA from E. coli. We have previously established that the circular permutation unties the knotted topology of YbeA and CP74 forms a domain-swapped dimer with a large dimeric interface of ca. 4600 Å². To understand the impact of domain-swapping and the newly formed hinge region joining the two folded domains on the folding and stability of CP74, the five equally spaced tryptophan residues were individually substituted with phenylalanine to monitor their conformational and stability changes by a battery of biophysical tools. Far-UV circular dichroism, intrinsic fluorescence, and small-angle X-ray scattering indicated minimal global conformational perturbations to the native structures in the tryptophan variants. The structures of the tryptophan variants also showed the conservation of the domain-swapped ternary structure with the exception that W72F exhibited significant asymmetry in α-helix 5. Comparative global thermal and chemical stability analyses indicated the pivotal role of W100 in the folding of CP74, followed by W19 and W72. Solution-state NMR spectroscopy and hydrogen-deuterium exchange mass spectrometry further revealed the accumulation of a native-like intermediate state in which the hinge region made important contributions to maintain the domain-swapped ternary structure of CP74.
Affiliation(s)
- Sarita Puri
- Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
- Cheng-Yu Liu
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, 30013, Taiwan
- I-Chen Hu
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, 30013, Taiwan
- Chih-Hsuan Lai
- Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan; Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, 30013, Taiwan
- Shang-Te Danny Hsu
- Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan; Institute of Biochemical Sciences, National Taiwan University, Taipei, 10617, Taiwan; International Institute for Sustainability with Knotted Chiral Meta Matter, Hiroshima University, Higashihiroshima, 739-8527, Japan
- Ping-Chiang Lyu
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, 30013, Taiwan; Department of Medical Science, National Tsing Hua University, Hsinchu, 30013, Taiwan
3
Shi Q, Abdel-Hamid AM, Sun Z, Cheng Y, Tu T, Cann I, Yao B, Zhu W. Carbohydrate-binding modules facilitate the enzymatic hydrolysis of lignocellulosic biomass: Releasing reducing sugars and dissociative lignin available for producing biofuels and chemicals. Biotechnol Adv 2023; 65:108126. [PMID: 36921877 DOI: 10.1016/j.biotechadv.2023.108126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 02/05/2023] [Accepted: 03/08/2023] [Indexed: 03/16/2023]
Abstract
The microbial decomposition and utilization of lignocellulosic biomass present in the plant tissues are driven by a series of carbohydrate active enzymes (CAZymes) acting in concert. As the non-catalytic domains widely found in the modular CAZymes, carbohydrate-binding modules (CBMs) are intimately associated with catalytic domains (CDs) that effect the diverse hydrolytic reactions. The CBMs function as auxiliary components for the recognition, adhesion, and depolymerization of the complex substrate mediated by the associated CDs. Therefore, CBMs are deemed as significant biotools available for enzyme engineering, especially to facilitate the enzymatic hydrolysis of dense and insoluble plant tissues to acquire more fermentable sugars. This review aims at presenting the taxonomies and biological properties of the CBMs currently curated in the CAZy database. The molecular mechanisms that CBMs use in assisting the enzymatic hydrolysis of plant polysaccharides and the regulatory factors of CBM-substrate interactions are outlined in detail. In addition, guidelines for the rational designs of CBM-fused CAZymes are proposed. Furthermore, the potential to harness CBMs for industrial applications, especially in enzymatic pretreatment of the recalcitrant lignocellulose, is evaluated. It is envisaged that the ideas outlined herein will aid in the engineering and production of novel CBM-fused enzymes to facilitate efficient degradation of lignocellulosic biomass to easily fermentable sugars for production of value-added products, including biofuels.
Affiliation(s)
- Qicheng Shi
- Laboratory of Gastrointestinal Microbiology, National Center for International Research on Animal Gut Nutrition, Nanjing Agricultural University, Nanjing 210095, China
- Ahmed M Abdel-Hamid
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL 61801, USA
- Zhanying Sun
- Laboratory of Gastrointestinal Microbiology, National Center for International Research on Animal Gut Nutrition, Nanjing Agricultural University, Nanjing 210095, China
- Yanfen Cheng
- Laboratory of Gastrointestinal Microbiology, National Center for International Research on Animal Gut Nutrition, Nanjing Agricultural University, Nanjing 210095, China
- Tao Tu
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Isaac Cann
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL 61801, USA; Department of Animal Science, University of Illinois at Urbana-Champaign, IL 61801, USA; Department of Microbiology, University of Illinois at Urbana-Champaign, IL 61801, USA; Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, IL 61801, USA; Center for East Asian and Pacific Studies, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Bin Yao
- State Key Laboratory of Animal Nutrition, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Weiyun Zhu
- Laboratory of Gastrointestinal Microbiology, National Center for International Research on Animal Gut Nutrition, Nanjing Agricultural University, Nanjing 210095, China
4
Li H, Schneider T, Tan Y, Zhang D. Ribonuclease T2 represents a distinct circularly permutated version of the BECR RNases. Protein Sci 2023; 32:e4531. [PMID: 36477982 PMCID: PMC9793965 DOI: 10.1002/pro.4531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 11/07/2022] [Accepted: 11/30/2022] [Indexed: 12/13/2022]
Abstract
Detection of homologous relationships among proteins and understanding their mechanisms of diversification are major topics in the fields of protein science, bioinformatics, and phylogenetics. Recent developments in sequence/profile-based and structural similarity-based methods have greatly facilitated the unification and classification of many protein families into superfamilies or folds, yet many proteins remain unclassified in current protein databases. As one of the three earliest identified RNases in biology, ribonuclease T2, also known as RNase I in Escherichia coli, RNase Rh in fungi, or S-RNase in plants, is thought to be an ancient RNase family due to its widespread distribution and distinct structure. In this study, we present evidence that RNase T2 represents a circularly permutated version of the BECR (Barnase-EndoU-Colicin E5/D-RelE) fold RNases. This subtle relationship cannot be detected by traditional methods such as sequence/profile-based comparisons, structure-similarity searches, and circular permutation detection. However, we were able to identify the structural similarity using rational reconstruction of a theoretical RNase T2 ancestor via a reverse circular permutation process, followed by structural modeling using AlphaFold2 and structural comparisons. This relationship is further supported by the fact that RNase T2 and other typical BECR RNases, namely Colicin D, RNase A, and BrnT, share similar catalytic site configurations, all involving an analogous set of conserved residues on the α0 helix and the β4 strand of the BECR fold. This study revealed a hidden root of RNase T2 in bacterial toxin systems and demonstrated that reconstruction and modeling of ancestral topology is an effective strategy to identify remote relationships between proteins.
Affiliation(s)
- Huan Li
- Department of Biology, College of Arts & Sciences, Saint Louis University, Saint Louis, Missouri, USA
- Theresa Schneider
- Department of Biology, College of Arts & Sciences, Saint Louis University, Saint Louis, Missouri, USA
- Yongjun Tan
- Department of Biology, College of Arts & Sciences, Saint Louis University, Saint Louis, Missouri, USA
- Dapeng Zhang
- Department of Biology, College of Arts & Sciences, Saint Louis University, Saint Louis, Missouri, USA
- Program of Bioinformatics and Computational Biology, School of Science and Engineering, Saint Louis University, Saint Louis, Missouri, USA
5
SeqCP: A sequence-based algorithm for searching circularly permuted proteins. Comput Struct Biotechnol J 2022; 21:185-201. [PMID: 36582435 PMCID: PMC9763678 DOI: 10.1016/j.csbj.2022.11.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/10/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022] Open
Abstract
Circular permutation (CP) is a protein sequence rearrangement in which the amino- and carboxyl-termini of a protein can be created in different positions along the imaginary circularized sequence. Circularly permutated proteins usually exhibit conserved three-dimensional structures and functions. By comparing the structures of circular permutants (CPMs), protein research and bioengineering applications can be approached in ways that are difficult to achieve by traditional mutagenesis. Most current CP detection algorithms depend on structural information. Because there is a vast number of proteins with unknown structures, many CP pairs may remain unidentified. An efficient sequence-based CP detector will help identify more CP pairs and advance many protein studies. For instance, some hypothetical proteins may have CPMs with known functions and structures that are informative for functional annotation, but existing structure-based CP search methods cannot be applied when those hypothetical proteins lack structural information. Despite the considerable potential for applications, sequence-based CP search methods have not been well developed. We present a sequence-based method, SeqCP, which analyzes normal and duplicated sequence alignments to identify CPMs and determine candidate CP sites for proteins. SeqCP was trained by data obtained from the Circular Permutation Database and tested with nonredundant datasets from the Protein Data Bank. It shows high reliability in CP identification and achieves an AUC of 0.9. SeqCP has been implemented into a web server available at: http://pcnas.life.nthu.edu.tw/SeqCP/.
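The idea of comparing a sequence against a duplicated copy of another can be illustrated with a toy sketch. It relies only on the fact that any circular permutant of a sequence occurs as a contiguous substring of that sequence concatenated with itself. The function name `find_cp_site` and the exact-match criterion are assumptions made for illustration; SeqCP itself works on alignments with a trained scoring scheme, which this sketch does not attempt to reproduce.

```python
from typing import Optional


def find_cp_site(native: str, candidate: str) -> Optional[int]:
    """Return the 0-based position in `native` at which `candidate` begins
    if `candidate` is an exact circular permutant of `native`, else None.

    Hypothetical helper: exact string matching only, no alignment or scoring.
    """
    if not native or len(native) != len(candidate):
        return None
    # Every rotation of `native` is a substring of native + native.
    doubled = native + native
    idx = doubled.find(candidate)
    return None if idx == -1 else idx


# Usage: the candidate starts at residue index 3 of the native sequence.
site = find_cp_site("ABCDEFG", "DEFGABC")
```

Real permutant pairs usually differ by substitutions, insertions, and linker residues, so a practical detector has to replace the exact substring search with an alignment against the duplicated sequence.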
Key Words
- AUC, area under the ROC curve
- CE, combinatorial extension
- CE-CP, CE with Circular Permutations
- CP, circular permutation
- CPDB, Circular Permutation Database
- CPMs, circular permutants
- CPSARST, Circular Permutation Search Aided by Ramachandran Sequential Transformation
- Circular permutants
- Circular permutation
- MCC, Matthews correlation coefficient
- Protein sequence analysis
- Protein structure modeling
- RMSD, root-mean-square distance
- ROC, receiver operating characteristic
6
Chen TR, Lin YC, Huang YW, Chen CC, Lo WC. CirPred, the first structure modeling and linker design system for circularly permuted proteins. BMC Bioinformatics 2021; 22:494. [PMID: 34641789 PMCID: PMC8513176 DOI: 10.1186/s12859-021-04403-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 11/16/2022] Open
Abstract
Background: This work aims to help develop new protein engineering techniques based on a structural rearrangement phenomenon called circular permutation (CP), equivalent to connecting the native termini of a protein followed by creating new termini at another site. Although CP has been applied in many fields, its implementation is still costly because of inevitable trial and error.
Results: Here we present CirPred, a structure modeling and termini linker design method for circularly permuted proteins. Compared with state-of-the-art protein structure modeling methods, CirPred is the only one fully capable of both circularly-permuted modeling and traditional co-linear modeling. CirPred performs well when the permutant shares low sequence identity with the native protein and even when the permutant adopts a different conformation from the native protein because of three-dimensional (3D) domain swapping. Linker redesign experiments demonstrated that the linker design algorithm of CirPred achieved subangstrom accuracy.
Conclusions: The CirPred system is capable of (1) predicting the structure of circular permutants, (2) designing termini linkers, (3) performing traditional co-linear protein structure modeling, and (4) identifying the CP-induced occurrence of 3D domain swapping. This method is expected to help broaden the application of CP, and its web server is available at http://10.life.nctu.edu.tw/CirPred/ and http://lo.life.nctu.edu.tw/CirPred/.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04403-1.
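The CP operation defined in this abstract (join the native termini, then open new termini elsewhere) can be written at the sequence level in a few lines. The sketch below is a minimal illustration under stated assumptions: `circular_permute`, the `site` convention, and the placeholder linker "GGS" are all hypothetical choices, and nothing here reflects CirPred's structure modeling or its linker design algorithm.

```python
def circular_permute(sequence: str, site: int, linker: str = "") -> str:
    """Build the sequence of a circular permutant.

    `site` is the 0-based index of the residue that becomes the new
    N-terminus; `linker` joins the native C-terminus to the native
    N-terminus. Both conventions are illustrative assumptions.
    """
    if not 0 < site < len(sequence):
        raise ValueError("site must fall strictly inside the sequence")
    # New chain: native residues from `site` to the end, then the linker,
    # then the native residues that preceded `site`.
    return sequence[site:] + linker + sequence[:site]


# Usage: open new termini before residue index 3 and bridge the
# native termini with a placeholder linker.
permutant = circular_permute("MKTAYIAK", 3, linker="GGS")
```

Choosing the linker length and composition is the hard part in practice, since the native termini must be bridged without straining the fold; that design problem is what the abstract says CirPred addresses.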
Affiliation(s)
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yen-Cheng Lin
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan; Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yu-Wei Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Chih-Chieh Chen
- Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan; Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan; Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; The Center for Bioinformatics Research, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
7
Lalwani Prakash D, Gosavi S. Understanding the Folding Mediated Assembly of the Bacteriophage MS2 Coat Protein Dimers. J Phys Chem B 2021; 125:8722-8732. [PMID: 34339197 DOI: 10.1021/acs.jpcb.1c03928] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Abstract
The capsids of RNA viruses such as MS2 are great models for studying protein self-assembly because they are made almost entirely of multiple copies of a single coat protein (CP). Although CP is the minimal repeating unit of the capsid, previous studies have shown that CP exists as a homodimer (CP2) even in an acid-disassembled system, indicating that CP2 is an obligate dimer. Here, we investigate the molecular basis of this obligate dimerization using coarse-grained structure-based models and molecular dynamics simulations. We find that, unlike monomeric proteins of similar size, CP populates a single partially folded ensemble whose "foldedness" is sensitive to denaturing conditions. In contrast, CP2 folds similarly to single-domain proteins populating only the folded and the unfolded ensembles, separated by a prominent folding free energy barrier. Several intramonomer contacts form early, but the CP2 folding barrier is crossed only when the intermonomer contacts are made. A dissection of the structure of CP2 through mutant folding simulations shows that the folding barrier arises both from the topology of CP and the interface contacts of CP2. Together, our results show that CP2 is an obligate dimer because of kinetic stability, that is, dimerization induces a folding barrier and that makes it difficult for proteins in the dimer minimum to partially unfold and access the monomeric state without completely unfolding. We discuss the advantages of this obligate dimerization in the context of dimer design and virus stability.
Affiliation(s)
- Digvijay Lalwani Prakash
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
- Shachi Gosavi
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
8
Chen TR, Juan SH, Huang YW, Lin YC, Lo WC. A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction. PLoS One 2021; 16:e0255076. [PMID: 34320027 PMCID: PMC8318245 DOI: 10.1371/journal.pone.0255076] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/21/2020] [Accepted: 07/11/2021] [Indexed: 11/18/2022] Open
Abstract
Protein secondary structure prediction (SSP) has a variety of applications; however, there has been relatively limited improvement in accuracy for years. With a vision of moving forward all related fields, we aimed to make a fundamental advance in SSP. There have been many admirable efforts made to improve the machine learning algorithm for SSP. This work thus took a step back by manipulating the input features. A secondary structure element-based position-specific scoring matrix (SSE-PSSM) is proposed, based on which a new set of machine learning features can be established. The feasibility of this new PSSM was evaluated by rigid independent tests with training and testing datasets sharing <25% sequence identities. In all experiments, the proposed PSSM outperformed the traditional amino acid PSSM. This new PSSM can be easily combined with the amino acid PSSM, and the improvement in accuracy was remarkable. Preliminary tests made by combining the SSE-PSSM and well-known SSP methods showed 2.0% and 5.2% average improvements in three- and eight-state SSP accuracies, respectively. If this PSSM can be integrated into state-of-the-art SSP methods, the overall accuracy of SSP may break the current restriction and eventually bring benefit to all research and applications where secondary structure prediction plays a vital role during development. To facilitate the application and integration of the SSE-PSSM with modern SSP methods, we have established a web server and standalone programs for generating SSE-PSSM available at http://10.life.nctu.edu.tw/SSE-PSSM.
Affiliation(s)
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Sheng-Hung Juan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yu-Wei Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Yen-Cheng Lin
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- The Center for Bioinformatics Research, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
9
Mushegian A, Sorokina I, Eroshkin A, Dlakić M. An ancient evolutionary connection between Ribonuclease A and EndoU families. RNA (New York, N.Y.) 2020; 26:803-813. [PMID: 32284351 PMCID: PMC7297114 DOI: 10.1261/rna.074385.119] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/17/2019] [Accepted: 04/06/2020] [Indexed: 06/11/2023]
Abstract
The ribonuclease A family of proteins is well studied from the biochemical and biophysical points of view, but its evolutionary origins are obscure, as no sequences homologous to this family have been reported outside of vertebrates. Recently, the spatial structure of the ribonuclease domain from a bacterial polymorphic toxin was shown to be closely similar to the structure of vertebrate ribonuclease A. The absence of sequence similarity between the two structures prompted a speculation of convergent evolution of bacterial and vertebrate ribonuclease A-like enzymes. We show that bacterial and homologous archaeal polymorphic toxin ribonucleases with a known or predicted ribonuclease A-like fold are distant homologs of the ribonucleases from the EndoU family, found in all domains of cellular life and in viruses. We also detected a homolog of vertebrate ribonucleases A in the transcriptome assembly of the sea urchin Mesocentrotus franciscanus. These observations argue for the common ancestry of prokaryotic ribonuclease A-like and ubiquitous EndoU-like ribonucleases, and suggest a better-grounded scenario for the origin of animal ribonucleases A, which could have emerged in the deuterostome lineage, either by an extensive modification of a copy of an EndoU gene, or, more likely, by a horizontal acquisition of a prokaryotic immunity-mediating ribonuclease gene.
Affiliation(s)
- Arcady Mushegian
- Division of Molecular and Cellular Biosciences, National Science Foundation, Alexandria, Virginia 22314, USA
- Mensur Dlakić
- Department of Microbiology and Immunology, Montana State University, Bozeman, Montana 59717, USA
10
Juan SH, Chen TR, Lo WC. A simple strategy to enhance the speed of protein secondary structure prediction without sacrificing accuracy. PLoS One 2020; 15:e0235153. [PMID: 32603341 PMCID: PMC7326220 DOI: 10.1371/journal.pone.0235153] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/17/2019] [Accepted: 06/09/2020] [Indexed: 01/06/2023] Open
Abstract
The secondary structure prediction of proteins is a classic topic of computational structural biology with a variety of applications. During the past decade, the accuracy of prediction achieved by state-of-the-art algorithms has been >80%; meanwhile, the time cost of prediction increased rapidly because of the exponential growth of fundamental protein sequence data. Based on literature studies and preliminary observations on the relationships between the size/homology of the fundamental protein dataset and the speed/accuracy of predictions, we raised two hypotheses that might help determine the main factors influencing the efficiency of secondary structure prediction. Experimental results of size and homology reductions of the fundamental protein dataset supported those hypotheses. They revealed that shrinking the size of the dataset could substantially cut the time cost of prediction with a slight decrease in accuracy, whereas homology reduction of the dataset could instead increase accuracy. Moreover, the Shannon information entropy could be applied to explain how accuracy was influenced by the size and homology of the dataset. Based on these findings, we proposed that a proper combination of size and homology reductions of the protein dataset could speed up the secondary structure prediction while preserving the high accuracy of state-of-the-art algorithms. When the proposed strategy was tested with the fundamental protein dataset of the year 2018 provided by the Universal Protein Resource, the speed of prediction was enhanced more than 20-fold while all accuracy measures remained equivalently high. These findings are expected to help improve the efficiency of research and applications that depend on the secondary structure prediction of proteins. To make future implementations of the proposed strategy easy, we have established a database of size and homology reduced protein datasets at http://10.life.nctu.edu.tw/UniRefNR.
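For reference, the Shannon entropy of a discrete distribution with probabilities p_i is H = -Σ p_i log2 p_i, measured in bits. The sketch below computes it from observed symbol frequencies, for example the residues in one alignment column. The abstract does not say how the authors applied entropy to the dataset, so the function name `shannon_entropy` and the per-column usage are illustrative assumptions only.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(symbols: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the empirical symbol distribution.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Adding 0.0 turns a possible -0.0 into 0.0 for fully conserved input.
    return entropy + 0.0


# Usage: a fully conserved column carries no information, while a column
# with two equally frequent residues carries one bit.
conserved = shannon_entropy("AAAA")
mixed = shannon_entropy("AAGG")
```

Under this measure, a set of highly similar sequences yields low-entropy columns, which is one intuitive way to read the abstract's link between dataset homology and information content.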
Affiliation(s)
- Sheng-Hung Juan
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Teng-Ruei Chen
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Wei-Cheng Lo
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
- The Center for Bioinformatics Research, National Chiao Tung University, Hsinchu, Taiwan
11
Atkinson JT, Jones AM, Zhou Q, Silberg JJ. Circular permutation profiling by deep sequencing libraries created using transposon mutagenesis. Nucleic Acids Res 2019; 46:e76. [PMID: 29912470 PMCID: PMC6061844 DOI: 10.1093/nar/gky255] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 12/04/2017] [Accepted: 03/28/2018] [Indexed: 12/17/2022] Open
Abstract
Deep mutational scanning has been used to create high-resolution DNA sequence maps that illustrate the functional consequences of large numbers of point mutations. However, this approach has not yet been applied to libraries of genes created by random circular permutation, an engineering strategy that is used to create open reading frames that express proteins with altered contact order. We describe a new method, termed circular permutation profiling with DNA sequencing (CPP-seq), which combines a one-step transposon mutagenesis protocol for creating libraries with a functional selection, deep sequencing and computational analysis to obtain unbiased insight into a protein's tolerance to circular permutation. Application of this method to an adenylate kinase revealed that CPP-seq creates two types of vectors encoding each circularly permuted gene, which differ in their ability to express proteins. Functional selection of this library revealed that >65% of the sampled vectors that express proteins are enriched relative to those that cannot translate proteins. Mapping enriched sequences onto structure revealed that the mobile AMP binding and rigid core domains display greater tolerance to backbone fragmentation than the mobile lid domain, illustrating how CPP-seq can be used to relate a protein's biophysical characteristics to the retention of activity upon permutation.
Affiliation(s)
- Joshua T Atkinson
- Systems, Synthetic, and Physical Biology Graduate Program, Rice University, 6100 Main MS-180, Houston, TX 77005, USA
- Alicia M Jones
- Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, TX 77005, USA
- Quan Zhou
- Department of Statistics, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Jonathan J Silberg
- Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, TX 77005, USA; Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
12
|
Bandyopadhyay B, Peleg Y. Facilitating circular permutation using Restriction Free (RF) cloning. Protein Eng Des Sel 2019; 31:65-68. [PMID: 29319799 DOI: 10.1093/protein/gzx061] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/23/2017] [Accepted: 11/14/2017] [Indexed: 02/02/2023]
Abstract
Circular permutation is a powerful tool to test the role of topology in protein folding and function. Previous methods for generating circular permutants were based on rearranging gene elements using restriction enzymes-based cloning. Here, we present a Restriction Free (RF) approach to achieve circular permutation which is faster and more cost-effective.
Affiliation(s)
- Yoav Peleg
  - The Israel Structural Proteomics Center (ISPC), Weizmann Institute of Science, Rehovot 7610001, Israel
|
13
|
Armenta S, Moreno-Mendieta S, Sánchez-Cuapio Z, Sánchez S, Rodríguez-Sanoja R. Advances in molecular engineering of carbohydrate-binding modules. Proteins 2017; 85:1602-1617. [PMID: 28547780 DOI: 10.1002/prot.25327] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 03/03/2017] [Revised: 05/04/2017] [Accepted: 05/20/2017] [Indexed: 11/06/2022]
Abstract
Carbohydrate-binding modules (CBMs) are non-catalytic domains that are generally appended to carbohydrate-active enzymes. CBMs have a broadly conserved structure that allows recognition of a notable variety of carbohydrates, in both their soluble and insoluble forms, as well as in their alpha and beta conformations and with different types of bonds or substitutions. This versatility suggests a high functional plasticity that is not yet clearly understood, in spite of the important number of studies relating protein structure and function. Several studies have explored the flexibility of these systems by changing or improving their specificity toward substrates of interest. In this review, we examine the molecular strategies used to identify CBMs with novel or improved characteristics. The impact of the spatial arrangement of the functional amino acids of CBMs is discussed in terms of unexpected new functions that are not related to the original biological roles of the enzymes. Proteins 2017; 85:1602-1617. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Silvia Armenta
  - Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Circuito Mario de la Cueva s/n Ciudad Universitaria, Ciudad de México, 04510, México
- Silvia Moreno-Mendieta
  - CONACYT, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Circuito Mario de la Cueva s/n Ciudad Universitaria, Ciudad de México, 04510, México
- Zaira Sánchez-Cuapio
  - Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Circuito Mario de la Cueva s/n Ciudad Universitaria, Ciudad de México, 04510, México
- Sergio Sánchez
  - Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Circuito Mario de la Cueva s/n Ciudad Universitaria, Ciudad de México, 04510, México
- Romina Rodríguez-Sanoja
  - Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Circuito Mario de la Cueva s/n Ciudad Universitaria, Ciudad de México, 04510, México
|
14
|
Abstract
Split inteins have emerged as a powerful tool in protein engineering. We describe a reliable in silico method to predict viable split sites for the design of new split inteins. A computational circular permutation (CP) prediction method facilitates the search for internal permissive sites to create artificial circular permutants. In this procedure, the original amino- and carboxyl-termini are connected and new termini are created. The identified new terminal sites are promising candidates for the generation of new split sites with the backbone opening being tolerated by the structural scaffold. Here we show how to integrate the online usage of the CP predictor, CPred, in the search of new split intein sites.
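The permutation step described in this abstract (connect the original termini, then open the backbone at a new site) can be sketched at the sequence level. This is a minimal illustration only, not the CPred method: the function name `circular_permute`, the 0-based `new_start` convention, and the optional `linker` argument are assumptions introduced here.

```python
def circular_permute(seq: str, new_start: int, linker: str = "") -> str:
    """Return a circular permutant of `seq`.

    new_start: 0-based index of the residue that becomes the new N-terminus
               (a hypothetical convention chosen for this sketch).
    linker:    optional peptide joining the original C- and N-termini.
    """
    if not 0 <= new_start < len(seq):
        raise ValueError("new_start must index a residue in seq")
    # The segment from new_start to the old C-terminus comes first, then the
    # linker, then the segment from the old N-terminus up to new_start.
    return seq[new_start:] + linker + seq[:new_start]


permutant = circular_permute("ABCDEF", 2, linker="GG")
print(permutant)
```

Whether a given `new_start` yields a folded, functional protein is exactly what a predictor such as CPred is meant to estimate; the sketch only rearranges the sequence.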
Affiliation(s)
- Yi-Zong Lee
  - Institute of Bioinformatics and Structural Biology, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, 30013, Hsinchu, Taiwan
- Wei-Cheng Lo
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
- Shih-Che Sue
  - Institute of Bioinformatics and Structural Biology, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, 30013, Hsinchu, Taiwan
  - Department of Life Science, National Tsing Hua University, Hsinchu, Taiwan
|
15
|
Lu J, Xu G, Zhang S, Lu B. An effective sequence-alignment-free superpositioning of pairwise or multiple structures with missing data. Algorithms Mol Biol 2016; 11:18. [PMID: 27330544 PMCID: PMC4915111 DOI: 10.1186/s13015-016-0079-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/22/2015] [Accepted: 05/18/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Superpositioning is an important problem in structural biology. Determining an optimal superposition requires a one-to-one correspondence between the atoms of two protein structures. However, in practice, some atoms are missing from their original structures. Current superposition implementations address the missing data crudely by ignoring such atoms. RESULTS: In this paper, we propose an effective method for superpositioning pairwise and multiple structures without sequence alignment. It is a two-stage procedure comprising data reduction and data registration. CONCLUSIONS: Numerical experiments demonstrated that our method is effective and efficient. The code package of the protein structure superposition method for addressing cases with missing data is implemented in MATLAB and is freely available from: http://sourceforge.net/projects/pssm123/files/?source=navbar.
Affiliation(s)
- Jianbo Lu
  - Human Genetics Resource Center, National Research Institute for Family Planning, Beijing, 100081 China
  - Graduate School of Peking Union Medical College, Beijing, 100730 China
- Guoliang Xu
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190 China
- Shihua Zhang
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190 China
- Benzhuo Lu
  - National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190 China
|
16
|
Jones AM, Mehta MM, Thomas EE, Atkinson JT, Segall-Shapiro TH, Liu S, Silberg JJ. The Structure of a Thermophilic Kinase Shapes Fitness upon Random Circular Permutation. ACS Synth Biol 2016; 5:415-25. [PMID: 26976658 DOI: 10.1021/acssynbio.5b00305] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
Proteins can be engineered for synthetic biology through circular permutation, a sequence rearrangement in which native protein termini become linked and new termini are created elsewhere through backbone fission. However, it remains challenging to anticipate a protein's functional tolerance to circular permutation. Here, we describe new transposons for creating libraries of randomly circularly permuted proteins that minimize peptide additions at their termini, and we use transposase mutagenesis to study the tolerance of a thermophilic adenylate kinase (AK) to circular permutation. We find that libraries expressing permuted AKs with either short or long peptides amended to their N-terminus yield distinct sets of active variants and present evidence that this trend arises because permuted protein expression varies across libraries. Mapping all sites that tolerate backbone cleavage onto AK structure reveals that the largest contiguous regions of sequence that lack cleavage sites are proximal to the phosphotransfer site. A comparison of our results with a range of structure-derived parameters further showed that retention of function correlates to the strongest extent with the distance to the phosphotransfer site, amino acid variability in an AK family sequence alignment, and residue-level deviations in superimposed AK structures. Our work illustrates how permuted protein libraries can be created with minimal peptide additions using transposase mutagenesis, and it reveals a challenge of maintaining consistent expression across permuted variants in a library that minimizes peptide additions. Furthermore, these findings provide a basis for interpreting responses of thermophilic phosphotransferases to circular permutation by calibrating how different structure-derived parameters relate to retention of function in a cellular selection.
Affiliation(s)
- Alicia M. Jones
  - Department of Biosciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- Manan M. Mehta
  - Medical Scientist Training Program, Northwestern University, 303 East Chicago Avenue, Morton 1-670, Chicago, Illinois 60611, United States
- Emily E. Thomas
  - Department of Biosciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- Joshua T. Atkinson
  - Systems, Synthetic, and Physical Biology Graduate Program, Rice University, 6100 Main MS-180, Houston, Texas 77005, United States
- Thomas H. Segall-Shapiro
  - Department of Biological Engineering, Synthetic Biology Center, Massachusetts Institute of Technology, 500 Technology Square, NE47-257, Cambridge, Massachusetts 02139, United States
- Shirley Liu
  - Department of Biosciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- Jonathan J. Silberg
  - Department of Biosciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
|
17
|
Adjeroh D, Jiang Y, Jiang BH, Lin J. Network analysis of circular permutations in multidomain proteins reveals functional linkages for uncharacterized proteins. Cancer Inform 2015; 13:109-24. [PMID: 25741177 PMCID: PMC4338801 DOI: 10.4137/cin.s14059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/08/2014] [Revised: 09/23/2014] [Accepted: 09/24/2014] [Indexed: 01/19/2023]
Abstract
Various studies have implicated different multidomain proteins in cancer. However, there has been little or no detailed study on the role of circular multidomain proteins in the general problem of cancer or on specific cancer types. This work represents an initial attempt at investigating the potential for predicting linkages between known cancer-associated proteins and uncharacterized or hypothetical multidomain proteins, based primarily on circular permutation (CP) relationships. First, we propose an efficient algorithm for rapid identification of both exact and approximate CPs in multidomain proteins. Using the circular relations identified, we construct networks between multidomain proteins, based on which we perform functional annotation of multidomain proteins. We then extend the method to construct subnetworks for selected cancer subtypes, and perform prediction of potential linkages between uncharacterized multidomain proteins and the selected cancer types. We include practical results showing the performance of the proposed methods.
Affiliation(s)
- Donald Adjeroh
  - Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, USA
- Yue Jiang
  - Faculty of Software, Fujian Normal University, Fuzhou, Fujian, China
- Bing-Hua Jiang
  - Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA, USA
- Jie Lin
  - Faculty of Software, Fujian Normal University, Fuzhou, Fujian, China
|
18
|
Tyurin A, Sadovskaya N, Nikiforova K, Mustafaev O, Komakhin R, Fadeev V, Goldenkova-Pavlova I. Clostridium thermocellum thermostable lichenase with circular permutations and modifications in the N-terminal region retains its activity and thermostability. Biochimica et Biophysica Acta - Proteins and Proteomics 2015; 1854:10-9. [DOI: 10.1016/j.bbapap.2014.10.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/26/2014] [Revised: 09/25/2014] [Accepted: 10/15/2014] [Indexed: 11/30/2022]
|
19
|
Bliven SE, Bourne PE, Prlić A. Detection of circular permutations within protein structures using CE-CP. Bioinformatics 2014; 31:1316-8. [PMID: 25505094 DOI: 10.1093/bioinformatics/btu823] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/14/2014] [Accepted: 12/08/2014] [Indexed: 12/19/2022]
Abstract
MOTIVATION: Circular permutation is an important type of protein rearrangement. Natural circular permutations have implications for protein function, stability and evolution. Artificial circular permutations have also been used for protein studies. However, such relationships are difficult to detect for many sequence and structure comparison algorithms and require special consideration. RESULTS: We developed a new algorithm, called Combinatorial Extension for Circular Permutations (CE-CP), which allows the structural comparison of circularly permuted proteins. CE-CP was designed to be user friendly and is integrated into the RCSB Protein Data Bank. It was tested on two collections of circularly permuted proteins. Pairwise alignments can be visualized either in a desktop application or on the web using Jmol, and exported to other programs in a variety of formats. AVAILABILITY AND IMPLEMENTATION: The CE-CP algorithm can be accessed through the RCSB website at http://www.rcsb.org/pdb/workbench/workbench.do. Source code is available under the LGPL 2.1 as part of BioJava 3 (http://biojava.org; http://github.com/biojava/biojava). CONTACT: sbliven@ucsd.edu or info@rcsb.org.
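Why permuted proteins defeat ordinary sequential comparison can be illustrated with an exact-match toy at the sequence level. The sketch below is not CE-CP, which compares three-dimensional structures and tolerates divergence; it only shows the classic "search in a doubled copy" idea for detecting a rotation. The function name `find_cp_offset` is an assumption made for this example.

```python
from typing import Optional


def find_cp_offset(a: str, b: str) -> Optional[int]:
    """Return i such that b == a[i:] + a[:i], or None if b is not a rotation of a."""
    if len(a) != len(b):
        return None
    # Every rotation of `a` occurs as a contiguous substring of `a + a`.
    i = (a + a).find(b)
    return i if i != -1 else None


print(find_cp_offset("ABCDEF", "DEFABC"))
print(find_cp_offset("ABCDEF", "ABCFED"))
```

Real permutants usually carry linkers, truncations, and substitutions, so an exact substring test like this one fails on them; that gap is what dedicated alignment methods address.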
Affiliation(s)
- Spencer E Bliven
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and RCSB Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Philip E Bourne
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and RCSB Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Andreas Prlić
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and RCSB Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
|
20
|
Engineering of Yarrowia lipolytica lipase Lip8p by circular permutation to alter substrate and temperature characteristics. J Ind Microbiol Biotechnol 2014; 41:757-62. [DOI: 10.1007/s10295-014-1428-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/22/2013] [Accepted: 02/25/2014] [Indexed: 10/25/2022]
Abstract
Applications of lipases are mainly based on their catalytic efficiency and substrate specificity. In this study, circular permutation (CP), an unconventional protein engineering technique, was employed to acquire active mutants of Yarrowia lipolytica lipase Lip8p. A total of 21 mutant lipases exhibited significant shifts in substrate specificity. Cp128, the most active enzyme mutant, showed higher catalytic activity (14.5-fold) and higher affinity (4.6-fold; decreased Km) toward p-nitrophenyl-myristate (pNP-C14) than wild type (WT). Based on the three-dimensional (3D) structure model of Lip8p, we found that most of the functional mutations occurred in the surface-exposed loop region in close proximity to the lid domain (S112–F122), which implies a steric effect of the lid on lipase activity and substrate specificity. The temperature properties of Cp128 were also investigated. In contrast to the optimal temperature of 45 °C for the WT enzyme, Cp128 exhibited maximal activity at 37 °C. Notably, there was no change in thermostability.
|
21
|
Circular permutation prediction reveals a viable backbone disconnection for split proteins: an approach in identifying a new functional split intein. PLoS One 2012; 7:e43820. [PMID: 22937103 PMCID: PMC3427171 DOI: 10.1371/journal.pone.0043820] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 04/11/2012] [Accepted: 07/26/2012] [Indexed: 01/30/2023]
Abstract
Split-protein systems have emerged as a powerful tool for detecting biomolecular interactions and reporting biological reactions. However, reliable methods for identifying viable split sites are still unavailable. In this study, we demonstrated the feasibility that valid circular permutation (CP) sites in proteins have the potential to act as split sites and that CP prediction can be used to search for internal permissive sites for creating new split proteins. Using a protein ligase, intein, as a model, CP predictor facilitated the creation of circular permutants in which backbone opening imposes the least detrimental effects on intein folding. We screened a series of predicted intein CPs and identified stable and native-fold CPs. When the valid CP sites were introduced as split sites, there was a reduction in folding enthalpy caused by the new backbone opening; however, the coincident loss in entropy was sufficient to be compensated, yielding a favorable free energy for self-association. Since split intein is exploited in protein semi-synthesis, we tested the related protein trans-splicing (PTS) activities of the corresponding split inteins. Notably, a novel functional split intein composed of the N-terminal 36 residues combined with the remaining C-terminal fragment was identified. Its PTS activity was shown to be better than current reported two-piece intein with a short N-terminal segment. Thus, the incorporation of in silico CP prediction facilitated the design of split intein as well as circular permutants.
|
22
|
Lo WC, Wang LF, Liu YY, Dai T, Hwang JK, Lyu PC. CPred: a web server for predicting viable circular permutations in proteins. Nucleic Acids Res 2012; 40:W232-7. [PMID: 22693212 PMCID: PMC3394280 DOI: 10.1093/nar/gks529] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/03/2023]
Abstract
Circular permutation (CP) is a protein structural rearrangement phenomenon, through which nature allows structural homologs to have different locations of termini and thus varied activities, stabilities and functional properties. It can be applied in many fields of protein research and bioengineering. The limitation of applying CP lies in its technical complexity, high cost and uncertainty of the viability of the resulting protein variants. Not every position in a protein can be used to create a viable circular permutant, but there is still a lack of practical computational tools for evaluating the positional feasibility of CP before costly experiments are carried out. We have previously designed a comprehensive method for predicting viable CP cleavage sites in proteins. In this work, we implement that method as an efficient and user-friendly web server named CPred (CP site predictor), which is intended to promote fundamental research and biotechnological applications of CP. CPred is accessible at http://sarst.life.nthu.edu.tw/CPred.
Affiliation(s)
- Wei-Cheng Lo
  - Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu 30013, Taiwan
|
23
|
Yang Y, Zhan J, Zhao H, Zhou Y. A new size-independent score for pairwise protein structure alignment and its application to structure classification and nucleic-acid binding prediction. Proteins 2012; 80:2080-8. [PMID: 22522696 DOI: 10.1002/prot.24100] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/06/2012] [Revised: 04/13/2012] [Accepted: 04/17/2012] [Indexed: 11/12/2022]
Abstract
A structure alignment program aligns two structures by optimizing a scoring function that measures structural similarity. It is highly desirable that such a scoring function is independent of the sizes of the proteins being compared, so that the significance of alignments across protein regions of different sizes is comparable. Here, we developed a new score called SP-score that fixes the cutoff distance at 4 Å and removes the size dependence using a normalization prefactor. We further built a program called SPalign that optimizes SP-score for structure alignment. SPalign was applied to recognize proteins within the same structure fold and having the same function of DNA or RNA binding. For fold discrimination, SPalign improves sensitivity over TMalign for the chain-level comparison by 12% and over DALI for the domain-level comparison by 13% at the same specificity of 99.6%. The difference between TMalign and SPalign at the chain level is due to the inability of TMalign to detect single domain similarity between multidomain proteins. For recognizing nucleic acid binding proteins, SPalign consistently improves over TMalign by 12% and DALI by 31% in average value of Matthews correlation coefficients for four datasets. SPalign with default setting is 14% faster than TMalign. SPalign is expected to be useful for function prediction and comparing structures with or without domains defined. The source code for SPalign and the server are available at http://sparks.informatics.iupui.edu.
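The Matthews correlation coefficient used above as the evaluation metric has a standard closed form over the binary confusion matrix. The sketch below is a generic implementation of that textbook formula, not code from SPalign; the function name and the choice to return 0.0 when the denominator vanishes are assumptions made here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero,
    # since the coefficient is undefined in that case.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


print(matthews_corrcoef(tp=5, tn=5, fp=0, fn=0))
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it usable on datasets with unbalanced class sizes.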
Affiliation(s)
- Yuedong Yang
  - Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
|
24
|
Affiliation(s)
- Spencer Bliven
  - Bioinformatics Program, University of California, San Diego, La Jolla, California, United States of America
- Andreas Prlić
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, California, United States of America
|
25
|
Deciphering the preference and predicting the viability of circular permutations in proteins. PLoS One 2012; 7:e31791. [PMID: 22359629 PMCID: PMC3281007 DOI: 10.1371/journal.pone.0031791] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 05/17/2011] [Accepted: 01/19/2012] [Indexed: 01/21/2023]
Abstract
Circular permutation (CP) refers to situations in which the termini of a protein are relocated to other positions in the structure. CP occurs naturally and has been artificially created to study protein function, stability and folding. Recently, CP has been increasingly applied to engineer enzyme structure and function, and to create bifunctional fusion proteins unachievable by tandem fusion. CP is a complicated and expensive technique. An intrinsic difficulty in its application lies in the fact that not every position in a protein is amenable to creating a viable permutant. To examine the preferences of CP and develop CP viability prediction methods, we carried out comprehensive analyses of the sequence, structural, and dynamical properties of known CP sites using a variety of statistics and simulation methods, such as bootstrap aggregating, permutation tests and molecular dynamics simulations. CP particularly favors Gly, Pro, Asp and Asn. Positions preferred by CP lie within coils, loops, turns, and at residues that are exposed to solvent, weakly hydrogen-bonded, environmentally unpacked, or flexible. Disfavored positions include Cys, bulky hydrophobic residues, and residues located within helices or near the protein's core. These results fostered the development of an effective viable CP site prediction system, which combined four machine learning methods, namely artificial neural networks, the support vector machine, a random forest, and a hierarchical feature integration procedure developed in this work. As assessed by using the dihydrofolate reductase dataset as the independent evaluation dataset, this prediction system achieved an AUC of 0.9. Large-scale predictions have been performed for nine thousand representative protein structures; several new potential applications of CP were thus identified. Many unreported preferences of CP are revealed in this study. The developed system is the best CP viability prediction method currently available. This work will facilitate the application of CP in research and biotechnology.
|
26
|
Stephen P, Tseng KL, Liu YN, Lyu PC. Circular permutation of the starch-binding domain: inversion of ligand selectivity with increased affinity. Chem Commun (Camb) 2012; 48:2612-4. [PMID: 22294161 DOI: 10.1039/c2cc17376j] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 02/01/2023]
Abstract
Proteins containing starch-binding domains (SBDs) are used in a variety of scientific and technological applications. A circularly permutated SBD (CP90) with improved affinity and selectivity toward longer-chain carbohydrates was synthesized, suggesting that a new starch-binding protein may be developed for specific scientific and industrial applications.
Affiliation(s)
- Preyesh Stephen
  - Institute of Bioinformatics and Structural Biology, National Tsing Hua University, No. 101, Sec. 2, Kuang Fu Rd, Hsinchu, 30013, Taiwan ROC
|
27
|
Szilágyi A, Zhang Y, Závodszky P. Intra-chain 3D segment swapping spawns the evolution of new multidomain protein architectures. J Mol Biol 2011; 415:221-35. [PMID: 22079367 DOI: 10.1016/j.jmb.2011.10.045] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 08/05/2011] [Revised: 10/07/2011] [Accepted: 10/27/2011] [Indexed: 10/15/2022]
Abstract
Multidomain proteins form in evolution through the concatenation of domains, but structural domains may comprise multiple segments of the chain. In this work, we demonstrate that new multidomain architectures can evolve by an apparent three-dimensional swap of segments between structurally similar domains within a single-chain monomer. By a comprehensive structural search of the current Protein Data Bank (PDB), we identified 32 well-defined segment-swapped proteins (SSPs) belonging to 18 structural families. Nearly 13% of all multidomain proteins in the PDB may have a segment-swapped evolutionary precursor as estimated by more permissive searching criteria. The formation of SSPs can be explained by two principal evolutionary mechanisms: (i) domain swapping and fusion (DSF) and (ii) circular permutation (CP). By large-scale comparative analyses using structural alignment and hidden Markov model methods, it was found that the majority of SSPs have evolved via the DSF mechanism, and a much smaller fraction, via CP. Functional analyses further revealed that segment swapping, which results in two linkers connecting the domains, may impart directed flexibility to multidomain proteins and contributes to the development of new functions. Thus, inter-domain segment swapping represents a novel general mechanism by which new protein folds and multidomain architectures arise in evolution, and SSPs have structural and functional properties that make them worth defining as a separate group.
Affiliation(s)
- András Szilágyi
  - Institute of Enzymology, Hungarian Academy of Sciences, Karolina út 29, H-1113 Budapest, Hungary
|
28
|
Yu Y, Lutz S. Circular permutation: a different way to engineer enzyme structure and function. Trends Biotechnol 2011; 29:18-25. [DOI: 10.1016/j.tibtech.2010.10.004] [Citation(s) in RCA: 116] [Impact Index Per Article: 8.9] [Received: 08/02/2010] [Revised: 10/11/2010] [Accepted: 10/18/2010] [Indexed: 12/15/2022]
|
29
|
Chu CH, Lo WC, Wang HW, Hsu YC, Hwang JK, Lyu PC, Pai TW, Tang CY. Detection and alignment of 3D domain swapping proteins using angle-distance image-based secondary structural matching techniques. PLoS One 2010; 5:e13361. [PMID: 20976204 PMCID: PMC2955075 DOI: 10.1371/journal.pone.0013361] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 07/10/2010] [Accepted: 09/13/2010] [Indexed: 11/18/2022]
Abstract
This work presents a novel detection method for three-dimensional domain swapping (DS), a mechanism for forming protein quaternary structures that can be visualized as if monomers had “opened” their “closed” structures and exchanged the opened portion to form intertwined oligomers. Since the first report of DS in the mid-1990s, an increasing number of identified cases has led to the postulation that DS might occur in a protein with an unconstrained terminus under appropriate conditions. DS may play important roles in the molecular evolution and functional regulation of proteins and the formation of depositions in Alzheimer's and prion diseases. Moreover, it is promising for designing auto-assembling biomaterials. Despite the increasing interest in DS, related bioinformatics methods are rarely available. Owing to a dramatic conformational difference between the monomeric/closed and oligomeric/open forms, conventional structural comparison methods are inadequate for detecting DS. Hence, there is also a lack of comprehensive datasets for studying DS. Based on angle-distance (A-D) image transformations of secondary structural elements (SSEs), specific patterns within A-D images can be recognized and classified for structural similarities. In this work, a matching algorithm to extract corresponding SSE pairs from A-D images and a novel DS score have been designed and demonstrated to be applicable to the detection of DS relationships. The Matthews correlation coefficient (MCC) and sensitivity of the proposed DS-detecting method were higher than 0.81 even when the sequence identities of the proteins examined were lower than 10%. On average, the alignment percentage and root-mean-square distance (RMSD) computed by the proposed method were 90% and 1.8 Å for a set of 1,211 DS-related pairs of proteins. The performance of the structural alignments remains high and stable for DS-related homologs with less than 10% sequence identity. In addition, the quality of its hinge loop determination is comparable to that of manual inspection. This method has been implemented as a web-based tool, which requires two protein structures as the input; the type and/or existence of DS relationships between the input structures is then determined according to the A-D image-based structural alignments and the DS score. The proposed method is expected to trigger large-scale studies of this interesting structural phenomenon and facilitate related applications.
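The Matthews correlation coefficient and sensitivity quoted in this abstract are standard confusion-matrix metrics. As a minimal sketch of how such figures are typically computed (the function names and the convention of returning 0 for an empty denominator are assumptions for illustration, not taken from the paper):

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here; the paper does not state how it handles this case).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives (e.g. real DS pairs) that were detected."""
    total = tp + fn
    return tp / total if total else 0.0
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect), which makes it more informative than accuracy when DS-related pairs are rare relative to unrelated pairs.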
Affiliation(s)
- Chia-Han Chu
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Wei-Cheng Lo
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Hsin-Wei Wang
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- Yen-Chu Hsu
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- Jenn-Kang Hwang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Ping-Chiang Lyu
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Tun-Wen Pai
- Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan, Republic of China
- * E-mail: (T-WP); (CYT)
- Chuan Yi Tang
- Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
- Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan, Republic of China
- * E-mail: (T-WP); (CYT)
|
30
|
Wang L, Wu LY, Wang Y, Zhang XS, Chen L. SANA: an algorithm for sequential and non-sequential protein structure alignment. Amino Acids 2010; 39:417-25. [DOI: 10.1007/s00726-009-0457-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/23/2008] [Accepted: 12/19/2009] [Indexed: 11/30/2022]
|
31
|
Abstract
BACKGROUND Protein structure comparison is a fundamental task in structural biology. While the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures is still relatively slow using existing methods. There is a need for new techniques which can rapidly compare protein structures, whilst maintaining high matching accuracy. RESULTS We have developed IR Tableau, a fast protein comparison algorithm, which leverages the tableau representation to compare protein tertiary structures. IR Tableau compares tableaux using information retrieval style feature indexing techniques. Experimental analysis on the ASTRAL SCOP protein structural domain database demonstrates that IR Tableau achieves two orders of magnitude speedup over the search times of existing methods, while producing search results of comparable accuracy. CONCLUSION We show that it is possible to obtain very significant speedups for the protein structure comparison problem by employing an information retrieval style approach for indexing proteins. The comparison accuracy achieved is also strong, thus opening the way for large-scale processing of very large protein structure databases.
|
32
|
Schmidt-Goenner T, Guerler A, Kolbeck B, Knapp EW. Circular permuted proteins in the universe of protein folds. Proteins 2009; 78:1618-30. [DOI: 10.1002/prot.22678] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
|
33
|
Lo WC, Lee CY, Lee CC, Lyu PC. iSARST: an integrated SARST web server for rapid protein structural similarity searches. Nucleic Acids Res 2009; 37:W545-51. [PMID: 19420060 PMCID: PMC2703971 DOI: 10.1093/nar/gkp291] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 12/05/2022] Open
Abstract
iSARST is a web server for efficient protein structural similarity searches. It is a multi-processor, batch-processing and integrated implementation of several structural comparison tools and two database searching methods: SARST for common structural homologs and CPSARST for homologs with circular permutations. iSARST allows users to submit multiple PDB/SCOP entry IDs or an archive file containing many structures. After the target database is scanned using SARST/CPSARST, the ordering of hits is refined with conventional structure alignment tools such as FAST, TM-align and SAMO, which are run on a PC cluster. In this way, iSARST achieves a high running speed while preserving the high precision of refinement engines. The final outputs include tables listing co-linear or circularly permuted homologs of the query proteins and a functional summary of the best hits. Superimposed structures can be examined through an interactive and informative visualization tool. iSARST provides the first batch mode structural comparison web service for both co-linear homologs and circular permutants. It can serve as a rapid annotation system for functionally unknown or hypothetical proteins, which are increasing rapidly in this post-genomics era. The server can be accessed at http://sarst.life.nthu.edu.tw/iSARST/.
Affiliation(s)
- Wei-Cheng Lo
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan
|
34
|
Abstract
Circular permutation (CP) in a protein can be considered as if its sequence were circularized, followed by the creation of termini at a new location. Since the first observation of CP in 1979, a substantial number of studies have concluded that circular permutants (CPs) usually retain native structures and functions, sometimes with increased stability or functional diversity. Although this interesting property has made CP useful in much protein engineering and folding research, large-scale collections of CP-related information were not available until this study. Here we describe CPDB, the first CP DataBase. The organizational principle of CPDB is a hierarchical categorization in which pairs of circular permutants are grouped into CP clusters, which are further grouped into folds and in turn classes. Additions to CPDB include a useful set of tools and resources for the identification, characterization, comparison and visualization of CP. In addition, several viable CP site prediction methods are implemented and assessed in CPDB. This database can be useful in protein folding and evolution studies, the discovery of novel protein structural and functional relationships, and facilitating the production of new CPs with unique biotechnical or industrial interests. The CPDB database can be accessed at http://sarst.life.nthu.edu.tw/cpdb
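The definition of CP given in this abstract (circularize the sequence, then create termini at a new location) can be illustrated at the sequence level. The following is a minimal sketch only: the function names, the 0-based `new_start` index, and the optional `linker` parameter are assumptions for illustration and are not part of CPDB or its tools.

```python
def circular_permutant(sequence: str, new_start: int, linker: str = "") -> str:
    """Return the circular permutant whose new N-terminus is residue `new_start`.

    The original C- and N-termini are joined (optionally through `linker`,
    a hypothetical parameter) and the chain is reopened just before the
    residue at 0-based position `new_start`.
    """
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must index a residue in the sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


def is_circular_permutation(a: str, b: str) -> bool:
    """Check whether `b` is a linker-free circular permutation of `a`.

    Any rotation of `a` appears as a substring of `a + a`.
    """
    return len(a) == len(b) and b in a + a
```

For example, `circular_permutant("ABCDEFGH", 3)` yields `"DEFGHABC"`. This only captures exact sequence rotation; detecting CP between real homologs requires tolerance for mutations and is normally done with structure- or alignment-based methods.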
Affiliation(s)
- Wei-Cheng Lo
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu 30013, Taiwan
|