1
Zhou B, Zheng L, Wu B, Tan Y, Lv O, Yi K, Fan G, Hong L. Protein Engineering with Lightweight Graph Denoising Neural Networks. J Chem Inf Model 2024; 64:3650-3661. [PMID: 38630581; DOI: 10.1021/acs.jcim.4c00036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
Protein engineering faces challenges in finding optimal mutants from a massive pool of candidate mutants. In this study, we introduce a deep-learning-based data-efficient fitness prediction tool to steer protein engineering. Our methodology establishes a lightweight graph neural network scheme for protein structures, which efficiently analyzes the microenvironment of amino acids in wild-type proteins and reconstructs the distribution of the amino acid sequences that are more likely to pass natural selection. This distribution serves as a general guidance for scoring proteins toward arbitrary properties on any order of mutations. Our proposed solution undergoes extensive wet-lab experimental validation spanning diverse physicochemical properties of various proteins, including fluorescence intensity, antigen-antibody affinity, thermostability, and DNA cleavage activity. More than 40% of ProtLGN-designed single-site mutants outperform their wild-type counterparts across all studied proteins and targeted properties. More importantly, our model can bypass the negative epistatic effect to combine single mutation sites and form deep mutants with up to seven mutation sites in a single round, whose physicochemical properties are significantly improved. This observation provides compelling evidence of the structure-based model's potential to guide deep mutations in protein engineering. Overall, our approach emerges as a versatile tool for protein engineering, benefiting both the computational and bioengineering communities.
Affiliation(s)
- Bingxin Zhou
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Shanghai National Center for Applied Mathematics (SJTU Center), Shanghai 200240, China
- Lirong Zheng
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
- Banghao Wu
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yang Tan
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Outongyi Lv
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
- Kai Yi
  - School of Mathematics and Statistics, University of New South Wales, Sydney 2052, Australia
- Guisheng Fan
  - School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Liang Hong
  - Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
  - Shanghai National Center for Applied Mathematics (SJTU Center), Shanghai 200240, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
  - Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, Shanghai 201203, China
2
Mardikoraem M, Woldring D. Protein Fitness Prediction Is Impacted by the Interplay of Language Models, Ensemble Learning, and Sampling Methods. Pharmaceutics 2023; 15:1337. [PMID: 37242577; PMCID: PMC10224321; DOI: 10.3390/pharmaceutics15051337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 04/19/2023] [Accepted: 04/21/2023] [Indexed: 05/28/2023] [Open Access]
Abstract
Advances in machine learning (ML) and the availability of protein sequences via high-throughput sequencing techniques have transformed the ability to design novel diagnostic and therapeutic proteins. ML allows protein engineers to capture complex trends hidden within protein sequences that would otherwise be difficult to identify in the context of the immense and rugged protein fitness landscape. Despite this potential, there persists a need for guidance during the training and evaluation of ML methods over sequencing data. Two key challenges for training discriminative models and evaluating their performance include handling severely imbalanced datasets (e.g., few high-fitness proteins among an abundance of non-functional proteins) and selecting appropriate protein sequence representations (numerical encodings). Here, we present a framework for applying ML over assay-labeled datasets to elucidate the capacity of sampling techniques and protein encoding methods to improve binding affinity and thermal stability prediction tasks. For protein sequence representations, we incorporate two widely used methods (One-Hot encoding and physiochemical encoding) and two language-based methods (next-token prediction, UniRep; masked-token prediction, ESM). Elaboration on performance is provided over protein fitness, protein size, and sampling techniques. In addition, an ensemble of protein representation methods is generated to discover the contribution of distinct representations and improve the final prediction score. We then implement multiple criteria decision analysis (MCDA; TOPSIS with entropy weighting), using multiple metrics well-suited for imbalanced data, to ensure statistical rigor in ranking our methods. Within the context of these datasets, the synthetic minority oversampling technique (SMOTE) outperformed undersampling while encoding sequences with One-Hot, UniRep, and ESM representations. 
Moreover, ensemble learning increased the predictive performance of the affinity-based dataset by 4% compared to the best single-encoding candidate (F1-score = 97%), while ESM alone was rigorous enough in stability prediction (F1-score = 92%).
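The ranking step named in this abstract (TOPSIS with entropy weighting) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the two-metric score table are hypothetical, all criteria are assumed to be "higher is better", and vector normalization is one common TOPSIS choice among several.

```python
import math


def entropy_weights(matrix):
    """Return one weight per criterion (column) using the entropy method.

    Assumes every value is positive and there are at least two alternatives.
    """
    m, n = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Shannon entropy of the column's proportions, scaled to [0, 1].
        entropy = -sum((x / total) * math.log(x / total) for x in column)
        entropy /= math.log(m)
        # Criteria that separate the alternatives more carry more weight.
        diversities.append(max(0.0, 1.0 - entropy))
    spread = sum(diversities)
    if spread == 0:  # every criterion is constant: fall back to equal weights
        return [1.0 / n] * n
    return [d / spread for d in diversities]


def topsis(matrix, weights):
    """Return a closeness score in [0, 1] per alternative (row); higher is better."""
    n = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(n)]
    worst = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(n)))
        d_worst = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(n)))
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 0.5)
    return scores


# Hypothetical scores (rows: three encodings; columns: two metrics). Made-up numbers.
example_matrix = [[0.90, 0.85], [0.80, 0.70], [0.95, 0.92]]
example_scores = topsis(example_matrix, entropy_weights(example_matrix))
```

In this toy table the third row is best on both metrics and the second row is worst on both, so they receive the highest and lowest closeness scores respectively.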
Affiliation(s)
- Mehrsa Mardikoraem
  - Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI 48824, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Daniel Woldring
  - Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI 48824, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
3
Seo K, Hagino K, Ichihashi N. Progresses in Cell-Free In Vitro Evolution. Advances in Biochemical Engineering/Biotechnology 2023; 186:121-140. [PMID: 37306699; DOI: 10.1007/10_2023_219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Biopolymers, such as proteins and RNA, are integral components of living organisms and have evolved through a process of repeated mutation and selection. The technique of "cell-free in vitro evolution" is a powerful experimental approach for developing biopolymers with desired functions and structural properties. Since Spiegelman's pioneering work over 50 years ago, biopolymers with a wide range of functions have been developed using in vitro evolution in cell-free systems. The use of cell-free systems offers several advantages, including the ability to synthesize a wider range of proteins without the limitations imposed by cytotoxicity, and the capacity for higher throughput and larger library sizes than cell-based evolutionary experiments. In this chapter, we provide a comprehensive overview of the progress made in the field of cell-free in vitro evolution by categorizing evolution into directed and undirected. The biopolymers produced by these methods are valuable assets in medicine and industry, and as a means of exploring the potential of biopolymers.
Affiliation(s)
- Kaito Seo
  - Department of Life Science, Graduate School of Arts and Science, The University of Tokyo, Tokyo, Japan
- Katsumi Hagino
  - Department of Life Science, Graduate School of Arts and Science, The University of Tokyo, Tokyo, Japan
- Norikazu Ichihashi
  - Department of Life Science, Graduate School of Arts and Science, The University of Tokyo, Tokyo, Japan
  - Komaba Institute for Science, The University of Tokyo, Tokyo, Japan
  - Universal Biology Institute, The University of Tokyo, Tokyo, Japan
4
E C, Dai L, Yu J. Switching promotor recognition of phage RNA polymerase in silico along lab-directed evolution path. Biophys J 2022; 121:582-595. [PMID: 35031277; PMCID: PMC8874028; DOI: 10.1016/j.bpj.2022.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Revised: 12/01/2021] [Accepted: 01/10/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
In this work, we computationally investigated how a viral RNA polymerase (RNAP) from bacteriophage T7 evolves into RNAP variants under lab-directed evolution to switch recognition from T7 promoter to T3 promoter in transcription initiation. We first constructed a closed initiation complex for the wild-type T7 RNAP and then for six mutant RNAPs discovered from phage-assisted continuous evolution experiments. All-atom molecular dynamics simulations up to 1 μs each were conducted on these RNAPs in a complex with the T7 and T3 promoters. Our simulations show notably that protein-DNA electrostatic interactions or stabilities at the RNAP-DNA promoter interface well dictate the promoter recognition preference of the RNAP and variants. Key residues and structural elements that contribute significantly to switching the promoter recognition were identified. Followed by a first point mutation N748D on the specificity loop to slightly disengage the RNAP from the promoter to hinder the original recognition, we found an auxiliary helix (206-225) that takes over switching the promoter recognition upon further mutations (E222K and E207K) by forming additional charge interactions with the promoter DNA and reorientating differently on the T7 and T3 promoters. Further mutations on the AT-rich loop and the specificity loop can fully switch the RNAP-promoter recognition to the T3 promoter. Overall, our studies reveal energetics and structural dynamics details along an exemplary directed evolutionary path of the phage RNAP variants for a rewired promoter recognition function. The findings demonstrate underlying physical mechanisms and are expected to assist knowledge and data learning or rational redesign of the protein enzyme structure function.
Affiliation(s)
- Chao E
  - Beijing Computational Science Research Center, Beijing, China
- Liqiang Dai
  - Beijing Computational Science Research Center, Beijing, China
  - Shenzhen JL Computational Science and Applied Research Institute, Shenzhen, Guangdong, China
- Jin Yu
  - Department of Physics and Astronomy, Department of Chemistry, NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, California
5
George A, Ravi R, Tiwari PB, Srivastava SR, Jain V, Mahalakshmi R. Engineering a Hyperstable Yersinia pestis Outer Membrane Protein Ail Using Thermodynamic Design. J Am Chem Soc 2022; 144:1545-1555. [DOI: 10.1021/jacs.1c05964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Anjana George
  - Molecular Biophysics Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
- Roshika Ravi
  - Molecular Biophysics Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
- Pankaj Bharat Tiwari
  - Molecular Biophysics Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
- Shashank Ranjan Srivastava
  - Molecular Biophysics Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
- Vikas Jain
  - Microbiology and Molecular Biology Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
- Radhakrishnan Mahalakshmi
  - Molecular Biophysics Laboratory, Department of Biological Sciences, Indian Institute of Science Education and Research, Bhopal - 462066, India
6
Siedhoff NE, Illig AM, Schwaneberg U, Davari MD. PyPEF-An Integrated Framework for Data-Driven Protein Engineering. J Chem Inf Model 2021; 61:3463-3476. [PMID: 34260225; DOI: 10.1021/acs.jcim.1c00099] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
Abstract
Data-driven strategies are gaining increased attention in protein engineering due to recent advances in access to large experimental databanks of proteins, next-generation sequencing (NGS), high-throughput screening (HTS) methods, and the development of artificial intelligence algorithms. However, the reliable prediction of beneficial amino acid substitutions, their combination, and the effect on functional properties remain the most significant challenges in protein engineering, which is applied to develop proteins and enzymes for biocatalysis, biomedicine, and life sciences. Here, we present a general-purpose framework (PyPEF: pythonic protein engineering framework) for performing data-driven protein engineering using machine learning methods combined with techniques from signal processing and statistical physics. PyPEF guides the identification and selection of beneficial proteins of a defined sequence space by systematically or randomly exploring the fitness of variants and by sampling random evolution pathways. The performance of PyPEF was evaluated concerning its predictive accuracy and throughput on four public protein and enzyme data sets using common regression models. It was proved that the program could efficiently predict the fitness of protein sequences for different target properties (predictive models with coefficient of determination values ranging from 0.58 to 0.92). By combining machine learning and protein evolution, PyPEF enabled the screening of proteins with various functions, reaching a screening capacity of more than 500,000 protein sequence variants in the timeframe of only a few minutes on a personal computer. 
PyPEF displayed significant accuracies on four public data sets (different proteins and properties) and underlined the potential of integrating data-driven technologies for covering different philosophies by either predicting the fitness of the variants to the highest accuracy accounting for epistatic effects or capturing the general trend of introduced mutations on the fitness in directed protein evolution campaigns. In essence, PyPEF can provide a powerful solution to current sequence exploration and combinatorial problems faced in protein engineering through exhaustive in silico screening of the sequence space.
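The accuracy figure quoted in this abstract is the coefficient of determination. As a reminder of what that metric measures, here is a minimal sketch; the function name is hypothetical and this is not PyPEF's own implementation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predicted fitness values match the measured ones exactly, while 0 means the model does no better than predicting the mean fitness for every variant.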
Affiliation(s)
- Niklas E Siedhoff
  - Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
- Ulrich Schwaneberg
  - Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
  - DWI-Leibniz Institute for Interactive Materials, Forckenbeckstraße 50, 52074 Aachen, Germany
- Mehdi D Davari
  - Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
7
Ngo K, Bruno da Silva F, Leite VBP, Contessoto VG, Onuchic JN. Improving the Thermostability of Xylanase A from Bacillus subtilis by Combining Bioinformatics and Electrostatic Interactions Optimization. J Phys Chem B 2021; 125:4359-4367. [PMID: 33887137; DOI: 10.1021/acs.jpcb.1c01253] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
The rational improvement of the enzyme catalytic activity is one of the most significant challenges in biotechnology. Most conventional strategies used to engineer enzymes involve selecting mutations to increase their thermostability. Determining good criteria for choosing these substitutions continues to be a challenge. In this work, we combine bioinformatics, electrostatic analysis, and molecular dynamics to predict beneficial mutations that may improve the thermostability of XynA from Bacillus subtilis. First, the Tanford-Kirkwood surface accessibility method is used to characterize each ionizable residue contribution to the protein native state stability. Residues identified to be destabilizing were mutated with the corresponding residues determined by the consensus or ancestral sequences at the same locations. Five mutants (K99T/N151D, K99T, S31R, N151D, and K154A) were investigated and compared with 12 control mutants derived from experimental approaches from the literature. Molecular dynamics results show that the mutants exhibited folding temperatures in the order K99T > K99T/N151D > S31R > N151D > WT > K154A. The combined approaches employed provide an effective strategy for low-cost enzyme optimization needed for large-scale biotechnological and medical applications.
Affiliation(s)
- Khoa Ngo
  - Department of Physics, University of Houston, Houston, Texas 77004, United States
- Fernando Bruno da Silva
  - Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas UNESP - Univ. Estadual Paulista, São José do Rio Preto, SP, Brazil
- Vitor B P Leite
  - Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas UNESP - Univ. Estadual Paulista, São José do Rio Preto, SP, Brazil
- Vinícius G Contessoto
  - Departamento de Física, Instituto de Biociências, Letras e Ciências Exatas UNESP - Univ. Estadual Paulista, São José do Rio Preto, SP, Brazil
8
Pannecoucke E, Van Trimpont M, Desmet J, Pieters T, Reunes L, Demoen L, Vuylsteke M, Loverix S, Vandenbroucke K, Alard P, Henderikx P, Deroo S, Baatz F, Lorent E, Thiolloy S, Somers K, McGrath Y, Van Vlierberghe P, Lasters I, Savvides SN. Cell-penetrating Alphabody protein scaffolds for intracellular drug targeting. Science Advances 2021; 7(13):eabe1682. [PMID: 33771865; PMCID: PMC7997521; DOI: 10.1126/sciadv.abe1682] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/04/2020] [Accepted: 02/05/2021] [Indexed: 05/02/2023]
Abstract
The therapeutic scope of antibody and nonantibody protein scaffolds is still prohibitively limited against intracellular drug targets. Here, we demonstrate that the Alphabody scaffold can be engineered into a cell-penetrating protein antagonist against induced myeloid leukemia cell differentiation protein MCL-1, an intracellular target in cancer, by grafting the critical B-cell lymphoma 2 homology 3 helix of MCL-1 onto the Alphabody and tagging the scaffold's termini with designed cell-penetration polypeptides. Introduction of an albumin-binding moiety extended the serum half-life of the engineered Alphabody to therapeutically relevant levels, and administration thereof in mouse tumor xenografts based on myeloma cell lines reduced tumor burden. Crystal structures of such a designed Alphabody in complex with MCL-1 and serum albumin provided the structural blueprint of the applied design principles. Collectively, we provide proof of concept for the use of Alphabodies against intracellular disease mediators, which, to date, have remained in the realm of small-molecule therapeutics.
Affiliation(s)
- Erwin Pannecoucke
  - VIB Center for Inflammation Research, 9052 Ghent, Belgium
  - Unit for Structural Biology, Department of Biochemistry and Microbiology, Ghent University, 9052 Ghent, Belgium
- Maaike Van Trimpont
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Tim Pieters
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Lindy Reunes
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Lisa Demoen
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Pieter Van Vlierberghe
  - Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Savvas N Savvides
  - VIB Center for Inflammation Research, 9052 Ghent, Belgium
  - Unit for Structural Biology, Department of Biochemistry and Microbiology, Ghent University, 9052 Ghent, Belgium
9
Phylogeny and Structure of Fatty Acid Photodecarboxylases and Glucose-Methanol-Choline Oxidoreductases. Catalysts 2020. [DOI: 10.3390/catal10091072] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/18/2022] [Open Access]
Abstract
Glucose-methanol-choline (GMC) oxidoreductases are a large and diverse family of flavin-binding enzymes found in all kingdoms of life. Recently, a new related family of proteins has been discovered in algae named fatty acid photodecarboxylases (FAPs). These enzymes use the energy of light to convert fatty acids to the corresponding Cn-1 alkanes or alkenes, and hold great potential for biotechnological application. In this work, we aimed at uncovering the natural diversity of FAPs and their relations with other GMC oxidoreductases. We reviewed the available GMC structures, assembled a large dataset of GMC sequences, and found that one active site amino acid, a histidine, is extremely well conserved among the GMC proteins but not among FAPs, where it is replaced with alanine. Using this criterion, we found several new potential FAP genes, both in genomic and metagenomic databases, and showed that related bacterial, archaeal and fungal genes are unlikely to be FAPs. We also identified several uncharacterized clusters of GMC-like proteins as well as subfamilies of proteins that lack the conserved histidine but are not FAPs. Finally, the analysis of the collected dataset of potential photodecarboxylase sequences revealed the key active site residues that are strictly conserved, whereas other residues in the vicinity of the flavin adenine dinucleotide (FAD) cofactor and in the fatty acid-binding pocket are more variable. The identified variants may have different FAP activity and selectivity and consequently may prove useful for new biotechnological applications, thereby fostering the transition from a fossil carbon-based economy to a bio-economy by enabling the sustainable production of hydrocarbon fuels.
10
Halder R, Jana B. Exploring the role of hydrophilic amino acids in unfolding of protein in aqueous ethanol solution. Proteins 2020; 89:116-125. [PMID: 32860277; DOI: 10.1002/prot.25999] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/21/2019] [Revised: 08/07/2020] [Accepted: 08/25/2020] [Indexed: 12/14/2022]
Abstract
Hydrophobic association is the key contributor to the formation of the well-packed core of a protein, which is often believed to be an important step in folding from an unfolded chain to its compact functional form. While most protein folding/unfolding studies have evaluated the changes in hydrophobic interactions during chemical denaturation, the role of hydrophilic amino acids in such processes has not been discussed in detail. Here we report the role of the hydrophilic amino acids in ethanol-induced unfolding of a protein. Using free energy simulations, we show that the chicken villin head piece (HP-36) protein unfolds gradually in the presence of a water-ethanol binary mixture with increasing ethanol content. However, upon mutation of the hydrophilic amino acids to glycine while keeping the hydrophobic amino acids intact, the compact state of the protein is found to be stable at all compositions, with gradual flattening of the free energy landscape as the ethanol content increases. The local environment around the protein, in terms of ethanol/water number, differs significantly between the wild-type protein and the mutated protein. The calculated Wyman-Tanford preferential binding coefficient of ethanol for the wild-type protein reveals that a greater number of cosolutes (here ethanol) bind to the unfolded state than to the folded state. However, no significant increase in the binding coefficient of ethanol at the unfolded state is found for the mutated protein. Local-bulk partition coefficient calculations suggest a similar picture. Our results reveal that the weakening of hydrophobic interactions in aqueous ethanol solution, along with larger preferential binding of ethanol to the unfolded state mediated by hydrophilic amino acids, together drive the unfolding of the protein in aqueous ethanol solution.
Affiliation(s)
- Ritaban Halder
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata, West Bengal, India
- Biman Jana
  - School of Chemical Sciences, Indian Association for the Cultivation of Science, Jadavpur, Kolkata, West Bengal, India
11
Zhang L, Xiao WH, Wang Y, Yao MD, Jiang GZ, Zeng BX, Zhang RS, Yuan YJ. Chassis and key enzymes engineering for monoterpenes production. Biotechnol Adv 2017; 35:1022-1031. [DOI: 10.1016/j.biotechadv.2017.09.002] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/30/2017] [Revised: 09/02/2017] [Accepted: 09/04/2017] [Indexed: 02/07/2023]
12
Application of conventional molecular dynamics simulation in evaluating the stability of apomyoglobin in urea solution. Sci Rep 2017; 7:44651. [PMID: 28300210; PMCID: PMC5353640; DOI: 10.1038/srep44651] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 10/18/2016] [Accepted: 02/09/2017] [Indexed: 01/02/2023] [Open Access]
Abstract
In this study, we exploited advances in computer technology to determine the stability of four apomyoglobin variants, namely wild type, E109A, E109G, and G65A/G73A, by conducting conventional molecular dynamics simulations in explicit urea solution. Variations in RMSD, native contacts, and solvent-accessible surface area of the apomyoglobin variants during the simulation were calculated to probe the effect of mutation on the overall conformation of the protein. Subsequently, the mechanism leading to the destabilization of the apoMb variants was studied through the calculation of the correlation matrix, principal component analyses, hydrogen bond analyses, and RMSF. The results obtained here correlate well with the study conducted by Baldwin and Luo, which showed improved stability of apomyoglobin with the E109A mutation and the opposite for the E109G and G65A/G73A mutations. These positive observations showcase the feasibility of exploiting MD simulation to determine protein stability prior to protein expression.
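Two of the quantities named in this abstract, RMSD and RMSF, have simple definitions that can be sketched directly. The functions below are a hypothetical illustration and not the study's analysis pipeline: they assume the structures are already superimposed (no Kabsch alignment is performed) and take coordinates as plain lists of (x, y, z) tuples.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumes both structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsf(frames):
    """Per-atom root-mean-square fluctuation about the time-averaged position.

    `frames` is a list of frames, each a list of (x, y, z) tuples, one per atom.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations
```

RMSD summarizes how far a whole structure has drifted from a reference, whereas RMSF reports which individual atoms or residues are most mobile over a trajectory.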
13
Childers MC, Daggett V. Insights from molecular dynamics simulations for computational protein design. Molecular Systems Design & Engineering 2017; 2:9-33. [PMID: 28239489; PMCID: PMC5321087; DOI: 10.1039/c6me00083e] [Citation(s) in RCA: 127] [Impact Index Per Article: 18.1] [Indexed: 06/06/2023]
Abstract
A grand challenge in the field of structural biology is to design and engineer proteins that exhibit targeted functions. Although much success on this front has been achieved, design success rates remain low, an ever-present reminder of our limited understanding of the relationship between amino acid sequences and the structures they adopt. In addition to experimental techniques and rational design strategies, computational methods have been employed to aid in the design and engineering of proteins. Molecular dynamics (MD) is one such method that simulates the motions of proteins according to classical dynamics. Here, we review how insights into protein dynamics derived from MD simulations have influenced the design of proteins. One of the greatest strengths of MD is its capacity to reveal information beyond what is available in the static structures deposited in the Protein Data Bank. In this regard, simulations can be used to directly guide protein design by providing atomistic details of the dynamic molecular interactions contributing to protein stability and function. MD simulations can also be used as a virtual screening tool to rank, select, identify, and assess potential designs. MD is uniquely poised to inform protein design efforts where the application requires realistic models of protein dynamics and atomic-level descriptions of the relationship between dynamics and function. Here, we review cases where MD simulations were used to modulate protein stability and protein function by providing information regarding the conformation(s), conformational transitions, interactions, and dynamics that govern stability and function. In addition, we discuss cases where conformations from protein folding/unfolding simulations have been exploited for protein design, yielding novel outcomes that could not be obtained from static structures.
Affiliation(s)
- Valerie Daggett
  - Corresponding author. Phone: 1.206.685.7420, Fax: 1.206.685.3300
14
Heinemann J, Deng K, Shih SCC, Gao J, Adams PD, Singh AK, Northen TR. On-chip integration of droplet microfluidics and nanostructure-initiator mass spectrometry for enzyme screening. Lab on a Chip 2017; 17:323-331. [PMID: 27957569; DOI: 10.1039/c6lc01182a] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 05/09/2023]
Abstract
Biological assays often require expensive reagents and tedious manipulations. These shortcomings can be overcome using digitally operated microfluidic devices that require reduced sample volumes to automate assays. One particular challenge is integrating bioassays with mass spectrometry based analysis. Towards this goal we have developed μNIMS, a highly sensitive and high throughput technique that integrates droplet microfluidics with nanostructure-initiator mass spectrometry (NIMS). Enzyme reactions are carried out in droplets that can be arrayed on discrete NIMS elements at defined time intervals for subsequent mass spectrometry analysis, enabling time resolved enzyme activity assay. We apply the μNIMS platform for kinetic characterization of a glycoside hydrolase enzyme (CelE-CMB3A), a chimeric enzyme capable of deconstructing plant hemicellulose into monosaccharides for subsequent conversion to biofuel. This study reveals NIMS nanostructures can be fabricated into arrays for microfluidic droplet deposition, NIMS is compatible with droplet and digital microfluidics, and can be used on-chip to assay glycoside hydrolase enzyme in vitro.
Affiliation(s)
- Joshua Heinemann
- Joint Bioenergy Institute, Emeryville, California 94608, USA and Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
- Kai Deng
- Joint Bioenergy Institute, Emeryville, California 94608, USA and Sandia National Laboratories, Livermore, California 94551, USA
- Steve C C Shih
- Department of Electrical and Computer Engineering, Concordia University, Montreal, Quebec, Canada
- Jian Gao
- Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
- Paul D Adams
- Joint Bioenergy Institute, Emeryville, California 94608, USA and Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA, and Department of Bioengineering, University of California, Berkeley, California, 94720, USA
- Anup K Singh
- Joint Bioenergy Institute, Emeryville, California 94608, USA and Sandia National Laboratories, Livermore, California 94551, USA
- Trent R Northen
- Joint Bioenergy Institute, Emeryville, California 94608, USA and Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA, and Joint Genome Institute, Walnut Creek, California, 94598, USA
|
15
|
Bayram Akcapinar G, Venturini A, Martelli PL, Casadio R, Sezerman UO. Modulating the thermostability of Endoglucanase I from Trichoderma reesei using computational approaches. Protein Eng Des Sel 2015; 28:127-35. [DOI: 10.1093/protein/gzv012] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 11/19/2014] [Accepted: 02/04/2015] [Indexed: 11/12/2022] Open
|
16
|
van den Berg BA, Reinders MJ, van der Laan JM, Roubos JA, de Ridder D. Protein redesign by learning from data. Protein Eng Des Sel 2014; 27:281-8. [DOI: 10.1093/protein/gzu031] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/20/2023] Open
|
17
|
Sijenyi F, Saro P, Ouyang Z, Damm-Ganamet K, Wood M, Jiang J, SantaLucia J. The RNA Folding Problems: Different Levels of sRNA Structure Prediction. Nucleic Acids and Molecular Biology 2012. [DOI: 10.1007/978-3-642-25740-7_6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
|
18
|
Huggins DJ, Tidor B. Systematic placement of structural water molecules for improved scoring of protein-ligand interactions. Protein Eng Des Sel 2011; 24:777-89. [PMID: 21771870 PMCID: PMC3170077 DOI: 10.1093/protein/gzr036] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 06/03/2011] [Revised: 06/03/2011] [Accepted: 06/15/2011] [Indexed: 11/13/2022] Open
Abstract
Structural water molecules are found in many protein-ligand complexes. They are known to be vital in mediating hydrogen-bonding interactions and, in some cases, key for facilitating tight binding. It is thus very important to consider water molecules when attempting to model protein-ligand interactions for cognate ligand identification, virtual screening and drug design. While the rigid treatment of water molecules present in structures is feasible, the more relevant task of treating all possible positions and orientations of water molecules with each possible ligand pose is computationally daunting. Current methods in molecular docking provide partial treatment for such water molecules, with modest success. Here we describe a new method employing dead-end elimination to place water molecules within a binding site, bridging interactions between protein and ligand. Dead-end elimination permits a thorough, though still incomplete, treatment of water placement. The results show that this method is able to place water molecules correctly within known complexes and to create physically reasonable hydrogen bonds. The approach has also been incorporated within an inverse molecular design approach, to model a variety of compounds in the process of de novo ligand design. The inclusion of structural water molecules, combined with ranking based on the electrostatic contribution to binding affinity, improves a number of otherwise poor energetic predictions.
Affiliation(s)
- David J. Huggins
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
- Bruce Tidor
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139–4307, USA
|
19
|
Samish I, MacDermaid CM, Perez-Aguilar JM, Saven JG. Theoretical and Computational Protein Design. Annu Rev Phys Chem 2011; 62:129-49. [DOI: 10.1146/annurev-physchem-032210-103509] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.2] [Indexed: 11/09/2022]
Affiliation(s)
- Jeffery G. Saven
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104
|
20
|
Balaraman GS, Bhattacharya S, Vaidehi N. Structural insights into conformational stability of wild-type and mutant beta1-adrenergic receptor. Biophys J 2010; 99:568-77. [PMID: 20643076 DOI: 10.1016/j.bpj.2010.04.075] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 01/07/2010] [Revised: 04/09/2010] [Accepted: 04/16/2010] [Indexed: 11/26/2022] Open
Abstract
Recent experiments to derive a thermally stable mutant of turkey beta-1-adrenergic receptor (beta1AR) have shown that a combination of six single point mutations resulted in a 20 °C increase in thermal stability in mutant beta1AR. Here we have used the all-atom force-field energy function to calculate a stability score to detect stabilizing point mutations in G-protein coupled receptors. The calculated stability score shows good correlation with the measured thermal stability for 76 single point mutations and 22 multiple mutants in beta1AR. We have demonstrated that conformational sampling of the receptor for various mutants improves the prediction of thermal stability by 50%. Point mutations Y227A(5.58), V230A(5.61), and F338M(7.48) in the thermally stable mutant m23-beta1AR stabilize key microdomains of the receptor in the inactive conformation. The Y227A(5.58) and V230A(5.61) mutations stabilize the ionic lock between R139(3.50) on transmembrane helix 3 and E285(6.30) on transmembrane helix 6. The mutation F338M(7.48) on TM7 alters the interaction of the conserved motif NPxxY(x)5,6F with helix 8 and hence modulates the interaction of the TM2-TM7-helix 8 microdomain. The D186-R317 salt bridge (in extracellular loops 2 and 3) is stabilized in the cyanopindolol-bound wild-type beta1AR, whereas the salt bridge between D184-R317 is preferred in the mutant m23. We propose that this could be the surrogate to a similar salt bridge found between the extracellular loop 2 and TM7 in beta2AR reported recently. We show that the binding energy difference between the inactive and active states is smaller in m23 than in the wild type, which explains the activation of m23 at higher norepinephrine concentration compared to the wild type. Results from this work shed light on the mechanism behind stabilizing mutations. The computational scheme proposed in this work could be used to design stabilizing mutations for other G-protein coupled receptors.
Affiliation(s)
- Gouthaman S Balaraman
- Division of Immunology, Beckman Research Institute of the City of Hope, Duarte, California, USA
|
21
|
Barakat NH, Barakat NH, Love JJ. Combined use of experimental and computational screens to characterize protein stability. Protein Eng Des Sel 2010; 23:799-807. [PMID: 20805093 DOI: 10.1093/protein/gzq052] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open
Abstract
One of the primary goals of protein design is to engineer proteins with improved stability. Protein stability is a key issue for chemical, biotechnology and pharmaceutical industries. The development of robust proteins/enzymes with the ability to withstand the potentially harsh conditions of industrial operations is of high importance. A number of strategies are currently being employed to achieve this goal. Two particular approaches, (i) directed evolution and (ii) computational protein design, are quite powerful yet have only recently been combined or applied and analyzed in parallel. In directed evolution, libraries of variants are searched experimentally for clones possessing the desired properties. With computational methods, protein design algorithms are utilized to perform in silico screening for stable protein sequences. Here, we used gene libraries of an unstable variant of streptococcal protein G (Gbeta1) and an in vivo screening method to identify stabilized variants. Many variants with notably increased thermal stabilities were isolated and characterized. Concomitantly, computational techniques and protein design algorithms were used to perform in silico screening of the same destabilized variant of Gbeta1. The combined use, and critical analysis, of these methods promises to advance the field of protein design.
Affiliation(s)
- Nora H Barakat
- Department of Chemistry and Biochemistry, San Diego State University, 5500 Campanile Dr, San Diego, CA 92182-1030, USA
|
22
|
Noivirt-Brik O, Horovitz A, Unger R. Trade-off between positive and negative design of protein stability: from lattice models to real proteins. PLoS Comput Biol 2009; 5:e1000592. [PMID: 20011105 PMCID: PMC2781108 DOI: 10.1371/journal.pcbi.1000592] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 05/20/2009] [Accepted: 11/03/2009] [Indexed: 11/18/2022] Open
Abstract
Two different strategies for stabilizing proteins are (i) positive design in which the native state is stabilized and (ii) negative design in which competing non-native conformations are destabilized. Here, the circumstances under which one strategy might be favored over the other are explored in the case of lattice models of proteins and then generalized and discussed with regard to real proteins. The balance between positive and negative design of proteins is found to be determined by their average "contact-frequency", a property that corresponds to the fraction of states in the conformational ensemble of the sequence in which a pair of residues is in contact. Lattice model proteins with a high average contact-frequency are found to use negative design more than model proteins with a low average contact-frequency. A mathematical derivation of this result indicates that it is general and likely to hold also for real proteins. Comparison of the results of correlated mutation analysis for real proteins with typical contact-frequencies to those of proteins likely to have high contact-frequencies (such as disordered proteins and proteins that are dependent on chaperonins for their folding) indicates that the latter tend to have stronger interactions between residues that are not in contact in their native conformation. Hence, our work indicates that negative design is employed when insufficient stabilization is achieved via positive design owing to high contact-frequencies.
Affiliation(s)
- Orly Noivirt-Brik
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Amnon Horovitz
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Ron Unger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
|
23
|
Huggins DJ, Altman MD, Tidor B. Evaluation of an inverse molecular design algorithm in a model binding site. Proteins 2009; 75:168-86. [PMID: 18831031 DOI: 10.1002/prot.22226] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
Computational molecular design is a useful tool in modern drug discovery. Virtual screening is an approach that docks and then scores individual members of compound libraries. In contrast to this forward approach, inverse approaches construct compounds from fragments, such that the computed affinity, or a combination of relevant properties, is optimized. We have recently developed a new inverse approach to drug design based on the dead-end elimination and A* algorithms employing a physical potential function. This approach has been applied to combinatorially constructed libraries of small-molecule ligands to design high-affinity HIV-1 protease inhibitors (Altman et al., J Am Chem Soc 2008;130:6099-6013). Here we have evaluated the new method using the well-studied W191G mutant of cytochrome c peroxidase. This mutant possesses a charged binding pocket and has been used to evaluate other design approaches. The results show that overall the new inverse approach does an excellent job of separating binders from nonbinders. For a few individual cases, scoring inaccuracies led to false positives. The majority of these involve erroneous solvation energy estimation for charged amines, anilinium ions, and phenols, which has been observed previously for a variety of scoring algorithms. Interestingly, although inverse approaches are generally expected to identify some but not all binders in a library, due to limited conformational searching, these results show excellent coverage of the known binders while still showing strong discrimination of the nonbinders.
Affiliation(s)
- David J Huggins
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
24
|
Noivirt-Brik O, Unger R, Horovitz A. Analysing the origin of long-range interactions in proteins using lattice models. BMC Struct Biol 2009; 9:4. [PMID: 19178726 PMCID: PMC2670300 DOI: 10.1186/1472-6807-9-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 11/18/2008] [Accepted: 01/29/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Long-range communication is very common in proteins but the physical basis of this phenomenon remains unclear. In order to gain insight into this problem, we decided to explore whether long-range interactions exist in lattice models of proteins. Lattice models of proteins have proven to capture some of the basic properties of real proteins and, thus, can be used for elucidating general principles of protein stability and folding. RESULTS: Using a computational version of double-mutant cycle analysis, we show that long-range interactions emerge in lattice models even though they are not an input feature of them. The coupling energy of both short- and long-range pairwise interactions is found to become more positive (destabilizing) in a linear fashion with increasing 'contact-frequency', an entropic term that corresponds to the fraction of states in the conformational ensemble of the sequence in which the pair of residues is in contact. A mathematical derivation of the linear dependence of the coupling energy on 'contact-frequency' is provided. CONCLUSION: Our work shows how 'contact-frequency' should be taken into account in attempts to stabilize proteins by introducing (or stabilizing) contacts in the native state and/or through 'negative design' of non-native contacts.
Affiliation(s)
- Orly Noivirt-Brik
- Department of Structural Biology, Weizmann Institute of Science, Rehovot 76100, Israel.
|
25
|
Suárez M, Tortosa P, Carrera J, Jaramillo A. Pareto optimization in computational protein design with multiple objectives. J Comput Chem 2008; 29:2704-11. [DOI: 10.1002/jcc.20981] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
|
26
|
Using a strategy based on the concept of convergent evolution to identify residue substitutions responsible for thermal adaptation. Proteins 2008; 73:53-62. [DOI: 10.1002/prot.22049] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
|
27
|
Wilson CJ, Zhan H, Swint-Kruse L, Matthews KS. Ligand interactions with lactose repressor protein and the repressor-operator complex: the effects of ionization and oligomerization on binding. Biophys Chem 2006; 126:94-105. [PMID: 16860458 DOI: 10.1016/j.bpc.2006.06.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 03/08/2006] [Revised: 06/09/2006] [Accepted: 06/10/2006] [Indexed: 10/24/2022]
Abstract
Specific interactions between proteins and ligands that modify their functions are crucial in biology. Here, we examine sugars that bind the lactose repressor protein (LacI) and modify repressor affinity for operator DNA using isothermal titration calorimetry and equilibrium DNA binding experiments. High affinity binding of the commonly-used inducer isopropyl-beta,D-thiogalactoside is strongly driven by enthalpic forces, whereas inducer 2-phenylethyl-beta,D-galactoside has weaker affinity with low enthalpic contributions. Perturbing the dimer interface with either pH or oligomeric state shows that weak inducer binding is sensitive to changes in this distant region. Effects of the neutral compound o-nitrophenyl-beta,D-galactoside are sensitive to oligomerization, and at elevated pH this compound converts to an anti-inducer ligand with slightly enhanced enthalpic contributions to the binding energy. Anti-inducer o-nitrophenyl-beta,D-fucoside exhibits slightly enhanced affinity and increased enthalpic contributions at elevated pH. Collectively, these results both demonstrate the range of energetic consequences that occur with LacI binding to structurally-similar ligands and expand our insight into the link between effector binding and structural changes at the subunit interface.
Affiliation(s)
- Corey J Wilson
- Department of Biochemistry and Cell Biology, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
28
|
Mattanovich D, Borth N. Applications of cell sorting in biotechnology. Microb Cell Fact 2006; 5:12. [PMID: 16551353 PMCID: PMC1435767 DOI: 10.1186/1475-2859-5-12] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Received: 12/21/2005] [Accepted: 03/21/2006] [Indexed: 01/28/2023] Open
Abstract
Due to its unique capability to analyze a large number of single cells for several parameters simultaneously, flow cytometry has changed our understanding of the behavior of cells in culture and of the population dynamics even of clonal populations. The potential of this method for biotechnological research, which is based on populations of living cells, was soon appreciated. Sorting applications, however, are still less frequent than one would expect with regard to their potential. This review highlights important contributions where flow cytometric cell sorting was used for physiological research, protein engineering, and cell engineering, with specific emphasis on the selection of overproducing cell lines. Finally, conclusions are drawn concerning the impact of cell sorting on inverse metabolic engineering and systems biology.
Affiliation(s)
- Diethard Mattanovich
- University of Natural Resources and Applied Life Sciences Vienna, Department of Biotechnology, Institute of Applied Microbiology, Muthgasse 18, A-1190 Vienna, Austria
- School of Bioengineering, University of Applied Sciences FH-Campus Vienna, Muthgasse 18, A-1190 Vienna, Austria
- Nicole Borth
- University of Natural Resources and Applied Life Sciences Vienna, Department of Biotechnology, Institute of Applied Microbiology, Muthgasse 18, A-1190 Vienna, Austria
|
29
|
Abstract
Protein design allows sequence-to-structure relationships in proteins to be examined and, potentially, new protein structures and functions to be made to order. To succeed, however, the protein-design process requires reliable rules that link protein sequence to structure-function. Although our present understanding of coiled-coil folding and assembly is not complete, through numerous bioinformatics and experimental studies there are now sufficient rules to allow confident design attempts of naturally observed and even novel coiled-coil motifs. This review summarizes the current design rules for coiled coils, and describes some of the key successful coiled-coil designs that have been created to date. The designs range from those for relatively straightforward, naturally observed structures (including parallel and antiparallel dimers, trimers and tetramers, all of which have been made as homomers and heteromers) to more exotic structures that expand the repertoire of Nature's coiled-coil structures. Examples in the second bracket include a probe that binds a cancer-associated coiled-coil protein; a tetramer with a right-handed supercoil; sticky-ended coiled coils that self-assemble to form fibers; coiled coils that switch conformational state; a three-component two-stranded coiled coil; and an antiparallel dimer that directs fragment complementation of larger proteins. Some of the more recent examples show an important development in the field; namely, new designs are being created with function as well as structure in mind. This will remain one of the key challenges in coiled-coil design in the next few years. Other challenges that lie ahead include the need to discover more rules for coiled-coil prediction and design, and to implement these in prediction and design algorithms. The considerable success of coiled-coil design so far bodes well for this, however. It is likely that these challenges will be met and surpassed.
Affiliation(s)
- Derek N Woolfson
- Department of Biochemistry, School of Life Sciences, University of Sussex, Falmer BN1 9QG, United Kingdom
|
30
|
Pandya MJ, Cerasoli E, Joseph A, Stoneman RG, Waite E, Woolfson DN. Sequence and Structural Duality: Designing Peptides to Adopt Two Stable Conformations. J Am Chem Soc 2004; 126:17016-24. [PMID: 15612740 DOI: 10.1021/ja045568c] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Indexed: 11/28/2022]
Abstract
To improve our understanding of conformational transitions in proteins, we are attempting the de novo design of peptides that switch structural state. Here, we describe coiled-coil peptides with sequence and structural duality; that is, features compatible with two different coiled-coil motifs superimposed within the same sequence. Specifically, we promoted a parallel leucine-zipper dimer under reducing conditions, and a monomeric helical hairpin in an intramolecularly disulfide bridged state. Using an iterative process, we engineered peptides that formed stable structures consistent with both targets under the different conditions. Finally, for one of the designs, we demonstrated a one-way switch from the helical hairpin to the coiled-coil dimer upon addition of disulfide-reducing agents.
Affiliation(s)
- Maya J Pandya
- Department of Biochemistry, John Maynard-Smith Building, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, United Kingdom
|
31
|
Abstract
We consider highly specific protein-protein interactions in proteomes of simple model proteins. We are inspired by the work of Zarrinpar et al (2003 Nature 426 676). They took a binding domain in a signalling pathway in yeast and replaced it with domains of the same class but from different organisms. They found that the probability of a protein binding to a protein from the proteome of a different organism is rather high, around one half. We calculate the probability of a model protein from one proteome binding to the protein of a different proteome. These proteomes are obtained by sampling the space of functional proteomes uniformly. In agreement with Zarrinpar et al we find that the probability of a protein binding a protein from another proteome is rather high, of order one tenth. Our results, together with those of Zarrinpar et al, suggest that designing, say, a peptide to block or reconstitute a single signalling pathway, without affecting any other pathways, requires knowledge of all the partners of the class of binding domains the peptide is designed to mimic. This knowledge is required to use negative design to explicitly design out interactions of the peptide with proteins other than its target. We also found that patches that are required to bind with high specificity evolve more slowly than those that are required only to not bind to any other patch. This is consistent with some analysis of sequence data for proteins engaged in highly specific interactions.
Affiliation(s)
- Richard P Sear
- The Isaac Newton Institute for Mathematical Sciences, University of Cambridge, 20 Clarkson Road, Cambridge CB3 0EH, UK.
|
32
|
Abstract
We aim to design novel proteins that link specific biochemical binding events, such as DNA recognition, with electron transfer functionality. We want these proteins to form the basis of new molecules that can be used for templated assembly of conducting cofactors or for thermodynamically linking DNA binding with cofactor chemistry for nanodevice applications. The first examples of our new proteins recruit the DNA-binding basic helix region of the leucine zipper protein GCN4. This basic helix region was attached to the N and C termini of cytochrome b(562) (cyt b(562)) to produce new, monomeric, multifunctional polypeptides. We have fully characterised the DNA and haem-binding properties of these proteins, which is a prerequisite for future application of the new molecules. Attachment of a single basic helix of GCN4 to either the N or C terminus of the cytochrome does not result in specific DNA binding but the presence of DNA-binding domains at both termini converts the cytochrome into a specific DNA-binding protein. Upon binding haem, this chimeric protein attains the spectral characteristics of wild-type cyt b(562). The three forms of the protein, apo, oxidised holo and reduced holo, all bind the designed (ATGAcgATGA) target DNA sequence with a dissociation constant, K(D), of approximately 90 nM. The protein has a lower affinity (K(D) ca. 370 nM) for the wild-type GCN4 recognition sequence (ATGAcTCAT). The presence of only half the consensus DNA sequence (ATGAcgGGCC) shifts the K(D) value to more than 2500 nM and the chimera does not bind specifically to DNA sequences with no target recognition sites. Ultracentrifugation revealed that the holoprotein-DNA complex is formed with a 1:1 stoichiometry, which indicates that a higher-order protein aggregate is not responsible for DNA binding. Mutagenesis of a loop linking helices 2 and 3 of the cytochrome results in a chimera with a haem-dependent DNA binding affinity. This is the first demonstration that binding of a haem group to a designed monomeric protein can allosterically modulate the DNA binding affinity.
Affiliation(s)
- D Dafydd Jones
- University Chemical Laboratories and MRC Centre for Protein Engineering, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
|
33
|
Khatun J, Khare SD, Dokholyan NV. Can Contact Potentials Reliably Predict Stability of Proteins? J Mol Biol 2004; 336:1223-38. [PMID: 15037081 DOI: 10.1016/j.jmb.2004.01.002] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Received: 10/21/2003] [Revised: 01/08/2004] [Accepted: 01/08/2004] [Indexed: 11/17/2022]
Abstract
The simplest approximation of interaction potential between amino acid residues in proteins is the contact potential, which defines the effective free energy of a protein conformation by a set of amino acid contacts formed in this conformation. Finding a contact potential capable of predicting free energies of protein states across a variety of protein families will aid protein folding and engineering in silico on a computationally tractable time-scale. We test the ability of contact potentials to accurately and transferably (across various protein families) predict stability changes of proteins upon mutations. We develop a new methodology to determine the contact potentials in proteins from experimental measurements of changes in proteins' thermodynamic stabilities (ΔΔG) upon mutations. We apply our methodology to derive sets of contact interaction parameters for a hierarchy of interaction models including solvation and multi-body contact parameters. We test how well our models reproduce experimental measurements by statistical tests. We evaluate the maximum accuracy of predictions obtained by using contact potentials and the correlation between parameters derived from different data-sets of experimental ΔΔG values. We argue that it is impossible to reach experimental accuracy and derive fully transferable contact parameters using the contact models of potentials. However, contact parameters may yield reliable predictions of ΔΔG for datasets of mutations confined to the same amino acid positions in the sequence of a single protein.
Affiliation(s)
- Jainab Khatun
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
34
|
Doye JPK, Louis AA, Vendruscolo M. Inhibition of protein crystallization by evolutionary negative design. Phys Biol 2004; 1:P9-13. [PMID: 16204814 DOI: 10.1088/1478-3967/1/1/p02] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 11/11/2022]
Abstract
Why are proteins so hard to crystallize? We propose an 'evolutionary negative design' principle to explain this difficulty. Proteins have evolved to avoid crystallization because crystallization compromises the viability of the cell. Evolutionary negative design is supported by much evidence in the literature, including the effect of mutations on the crystallizability of a protein, the correlations found in the properties of crystal contacts in bioinformatics databases, and the positive use of protein crystallization by bacteria and viruses.
Affiliation(s)
- Jonathan P K Doye
- University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, UK.
|
35
|
Vinkers HM, de Jonge MR, Daeyaert FFD, Heeres J, Koymans LMH, van Lenthe JH, Lewi PJ, Timmerman H, Van Aken K, Janssen PAJ. SYNOPSIS: SYNthesize and OPtimize System in Silico. J Med Chem 2003; 46:2765-73. [PMID: 12801239 DOI: 10.1021/jm030809x] [Citation(s) in RCA: 130] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
We present a de novo design program called SYNOPSIS that includes a synthesis route for each generated molecule. SYNOPSIS designs novel molecules by starting from a database of available molecules and simulating organic synthesis steps. This way of generating molecules imposes synthetic accessibility on the molecules. In addition to a starting database, a fitness function is needed that calculates the value of a desired property for an arbitrary molecule. The values obtained from this function guide the design process in optimizing the molecules toward an optimal value of the calculated property. Two applications are described. The first uses an electric dipole moment calculation to generate molecules possessing a strong dipole moment. The second makes use of the three-dimensional structure of a viral enzyme in order to generate high-affinity ligands. Twenty-eight compounds designed with the program resulted in 18 synthesized and tested compounds, 10 of which showed HIV inhibitory activity in vitro.
Affiliation(s)
- H Maarten Vinkers
- Center for Molecular Design, Janssen Pharmaceutica N.V., Antwerpsesteenweg 37, B-2350 Vosselaar, Belgium.
|
36
|
Jin W, Kambara O, Sasakawa H, Tamura A, Takada S. De novo design of foldable proteins with smooth folding funnel: automated negative design and experimental verification. Structure 2003; 11:581-90. [PMID: 12737823 DOI: 10.1016/s0969-2126(03)00075-3] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
De novo sequence design of foldable proteins provides a way of investigating principles of protein architecture. We performed fully automated sequence design for a target structure having a three-helix bundle topology and synthesized the designed sequences. Our design principle is different from the conventional approach, in that instead of optimizing interactions within the target structure, we design the global shape of the protein folding funnel. This includes automated implementation of negative design by explicitly requiring higher free energy of the denatured state. The designed sequences do not have significant similarity to those of any natural proteins. The NMR and CD spectroscopic data indicated that one designed sequence has a well-defined three-dimensional structure as well as alpha-helical content consistent with the target.
Affiliation(s)
- Wenzhen Jin
- Graduate School of Science and Technology, Japan Science and Technology Corporation, Kobe University, Rokkodai, Nada, 657-8501, Kobe, Japan
|
37
|
Abstract
A series of mimetic cores composed of a synthetic scaffold and amino acids have been constructed and their properties investigated in chloroform. A relative measure of H-bond strength was obtained by comparing temperature coefficients derived from variable-temperature ¹H NMR experiments. Although most templates had a strong H-bond, only a single template composed of D- and L-phenylalanines was able to form two strong H-bonds. Templates containing D- and L-leucines formed only a single H-bond. The results of these studies suggest that aromatic edge-to-face interactions provide greater stabilization energy than aliphatic-aromatic interactions in the tightly packed hydrophobic cores of proteins. Partial structures of the templates were derived by analyzing a series of two-dimensional ¹H NMR spectra and performing molecular mechanics calculations using AMBER and MMFF94 force fields.
Affiliation(s)
- J A Turk
- Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221-0172, USA
|
38
|
Marshall SA, Mayo SL. Achieving stability and conformational specificity in designed proteins via binary patterning. J Mol Biol 2001; 305:619-31. [PMID: 11152617 DOI: 10.1006/jmbi.2000.4319] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
We have developed a method to determine the optimal binary pattern (arrangement of hydrophobic and polar amino acids) of a target protein fold prior to amino acid sequence selection in protein design studies. A solvent accessible surface is generated for a target fold using its backbone coordinates and "generic" side-chains, which are constructs whose size and shape are similar to an average amino acid. Each position is classified as hydrophobic or polar according to the solvent exposure of its generic side-chain. The method was tested by analyzing a set of proteins in the Protein Data Bank and by experimentally constructing and analyzing a set of engrailed homeodomain variants whose binary patterns were systematically varied. Selection of the optimal binary pattern results in a designed protein that is monomeric, well-folded, and hyperthermophilic. Homeodomain variants with fewer hydrophobic residues are destabilized, while additional hydrophobic residues induce aggregation. Binary patterning, in conjunction with a force field that models folded state energies, appears sufficient to satisfy two basic goals of protein design: stability and conformational specificity.
Affiliation(s)
- S A Marshall
- Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 East California Blvd., Pasadena, CA 91125, USA
|
39
|
|
40
|
Grell D, Richardson JS, Richardson DC, Mutter M. SymROP: ROP protein with identical helices redesigned by all-atom contact analysis and molecular dynamics. J Mol Graph Model 2000; 18:290-8, 309-10. [PMID: 11021545 DOI: 10.1016/s1093-3263(00)00049-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Experience has shown that protein redesigns (using the backbone from a known protein structure) are far more likely to produce well-ordered, native-like structures than are true de novo designs. Therefore, to design a four-helix bundle made of identical short helices, we here proceed by an extensive redesign of the ROP protein. A fully symmetrical SymROP sequence derived from ROP was chosen by modeling ideal-geometry side chains, including hydrogens, while maintaining the "goodness-of-fit" of side-chain packing by calculating all-atom contact surfaces with the Reduce and Probe programs. To estimate the probable extent of backbone movement and side-chain mobility, restrained molecular dynamics simulations were compared for candidate sequences and controls, including substitution of Abu for all or half the core Ala residues. The resulting 17-residue designed sequence is 41% identical to the relevant regions in ROP. SymROP is intended for construction by the Template Assembled Synthetic Proteins approach, to control the bundle topology, to use short helices, and to allow blocked termini and unnatural amino acids. ROP protein has been a valuable system for studying helical protein structure because of its simplicity and regularity within a structure large enough to have a real hydrophobic core. The SymROP design carries that simplicity and regularity even further.
Affiliation(s)
- D Grell
- Institute of Organic Chemistry, University of Lausanne, Switzerland
|
41
|
Raha K, Wollacott AM, Italia MJ, Desjarlais JR. Prediction of amino acid sequence from structure. Protein Sci 2000; 9:1106-19. [PMID: 10892804 PMCID: PMC2144664 DOI: 10.1110/ps.9.6.1106] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.
Affiliation(s)
- K Raha
- Integrative Biosciences Program, Pennsylvania State University, University Park, Pennsylvania 16803, USA
|
42
|
Street AG, Datta D, Gordon DB, Mayo SL. Designing protein beta-sheet surfaces by Z-score optimization. Phys Rev Lett 2000; 84:5010-5013. [PMID: 10990854 DOI: 10.1103/physrevlett.84.5010] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/27/1999] [Indexed: 05/23/2023]
Abstract
Studies of lattice models of proteins have suggested that the appropriate energy expression for protein design may include nonthermodynamic terms to accommodate negative design concerns. One method, developed in lattice model studies, maximizes a quantity known as the "Z-score," which compares the lowest energy sequence whose ground state structure is the target structure to an ensemble of random sequences. Here we show that, in certain circumstances, the technique can be applied to real proteins. The resulting energy expression is used to design the beta-sheet surfaces of two real proteins. We find experimentally that the designed proteins are stable and well folded, and that one is even more thermostable than the wild type.
Affiliation(s)
- A G Street
- Division of Physics, Mathematics and Astronomy, California Institute of Technology, MC 147-75, Pasadena, California 91125, USA
|
43
|
Abstract
BACKGROUND A large energy gap between the native state and the non-native folded states is required for folding into a unique three-dimensional structure. The features that define this energy gap are not well understood, but can be addressed using de novo protein design. Previously, α₂D, a dimeric four-helix bundle, was designed and shown to adopt a native-like conformation. The high-resolution solution structure revealed that this protein adopted a bisecting U motif. Glu7, a solvent-exposed residue that adopts many conformations in solution, might be involved in defining the unique three-dimensional structure of α₂D. RESULTS A variety of hydrophobic and polar residues were substituted for Glu7 and the dynamic and thermodynamic properties of the resulting proteins were characterized by analytical ultracentrifugation, circular dichroism spectroscopy, and nuclear magnetic resonance spectroscopy. The majority of substitutions at this solvent-exposed position had little effect on the ability to fold into a dimeric four-helix bundle. The ability to adopt a unique conformation, however, was profoundly modulated by the residue at this position despite the similar free energies of folding of each variant. CONCLUSIONS Although Glu7 is not involved directly in stabilizing the native state of α₂D, it is involved indirectly in specifying the observed fold by modulating the energy gap between the native state and the non-native folded states. These results provide experimental support for hypothetical models arising from lattice simulations of protein folding, and underscore the importance of polar interfacial residues in defining the native conformations of proteins.
|
44
|
Petrosian SA, Makhatadze GI. Contribution of proton linkage to the thermodynamic stability of the major cold-shock protein of Escherichia coli CspA. Protein Sci 2000; 9:387-94. [PMID: 10716191 PMCID: PMC2144560 DOI: 10.1110/ps.9.2.387] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
The stability of a protein is defined not only by hydrogen bonding, the hydrophobic effect, van der Waals interactions, and salt bridges. Additional, much more subtle contributions to protein stability can arise from surface residues that change their properties upon unfolding. The recombinant major cold shock protein of Escherichia coli, CspA, an all-beta protein, unfolds reversibly in a two-state manner and behaves in all other respects as a typical globular protein. However, the enthalpy of CspA unfolding strongly depends on the pH and buffer composition. Detailed analysis of the unfolding enthalpies as a function of pH and buffers with different heats of ionization shows that CspA unfolding in the pH range 5.5-9.0 is linked to protonation of an amino group. This amino group appears to be the N-terminal alpha-amino group of the CspA molecule. It undergoes a shift of 1.6 units in pKa between the native and unfolded states. Although this shift in pKa is expected to contribute approximately 5 kJ/mol to CspA stabilization energy, the experimentally observed stabilization is only approximately 1 kJ/mol. This discrepancy is related to a strong enthalpy-entropy compensation due, most likely, to the differences in hydration of the protonated and deprotonated forms of the alpha-amino group.
Affiliation(s)
- S A Petrosian
- Department of Chemistry and Biochemistry, Texas Tech University, Lubbock 79409, USA
|
45
|
Skalicky JJ, Gibney BR, Rabanal F, Bieber Urbauer RJ, Dutton PL, Wand AJ. Solution Structure of a Designed Four-α-Helix Bundle Maquette Scaffold. J Am Chem Soc 1999. [DOI: 10.1021/ja983309f] [Citation(s) in RCA: 65] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Jack J. Skalicky
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Brian R. Gibney
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Francesc Rabanal
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Ramona J. Bieber Urbauer
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- P. Leslie Dutton
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- A. Joshua Wand
- Contribution from the Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104
|
46
|
Hegyi H, Gerstein M. The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. J Mol Biol 1999; 288:147-64. [PMID: 10329133 DOI: 10.1006/jmbi.1999.2661] [Citation(s) in RCA: 269] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
For most proteins in the genome databases, function is predicted via sequence comparison. In spite of the popularity of this approach, the extent to which it can be reliably applied is unknown. We address this issue by systematically investigating the relationship between protein function and structure. We focus initially on enzymes functionally classified by the Enzyme Commission (EC) and relate these to domains structurally classified by the SCOP database. We find that the major SCOP fold classes have different propensities to carry out certain broad categories of functions. For instance, alpha/beta folds are disproportionately associated with enzymes, especially transferases and hydrolases, and all-alpha and small folds with non-enzymes, while alpha+beta folds have an equal tendency either way. These observations for the database overall are largely true for specific genomes. We focus, in particular, on yeast, analyzing it with many classifications in addition to SCOP and EC (i.e. COGs, CATH, MIPS), and find clear tendencies for fold-function association, across a broad spectrum of functions. Analysis with the COGs scheme also suggests that the functions of the most ancient proteins are more evenly distributed among different structural classes than those of more modern ones. For the database overall, we identify the most versatile functions, i.e. those that are associated with the most folds, and the most versatile folds, associated with the most functions. The two most versatile enzymatic functions (hydro-lyases and O-glycosyl glucosidases) are associated with seven folds each. The five most versatile folds (TIM-barrel, Rossmann, ferredoxin, alpha-beta hydrolase, and P-loop NTP hydrolase) are all mixed alpha-beta structures. They stand out as generic scaffolds, accommodating from six to as many as 16 functions (for the exceptional TIM-barrel). At the conclusion of our analysis we are able to construct a graph giving the chance that a functional annotation can be reliably transferred at different degrees of sequence and structural similarity. Supplemental information is available from http://bioinfo.mbb.yale.edu/genome/foldfunc.
Affiliation(s)
- H Hegyi
- Department of Molecular Biophysics & Biochemistry Yale University, 266 Whitney Avenue, New Haven, CT 06520, USA
|
47
|
|
48
|
Hellinga HW. Construction of a Blue Copper Analogue through Iterative Rational Protein Design Cycles Demonstrates Principles of Molecular Recognition in Metal Center Formation. J Am Chem Soc 1998. [DOI: 10.1021/ja980054x] [Citation(s) in RCA: 48] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Homme W. Hellinga
- Contribution from the Department of Biochemistry, Duke University Medical Center, Durham, North Carolina 27710
|
49
|
Abstract
A variety of methodologies are under development to alter the behavior of existing metal centers or create entirely new sites within a protein framework in order to exploit the intrinsic chemical versatility of metals using the exquisite level of control that a protein matrix can exert to modulate their reactivity. Even at this relatively early stage, engineering of metal centers has led to the development of a number of emerging technologies with a wide variety of applications, including affinity purification of proteins, engineering of metal-mediated protein stability, control of protein activity, imaging and therapy, biosensors, and new catalysts.
|
50
|
Abstract
Biosensors exploit the remarkable specificity of biomolecular recognition to provide analytical tools that can measure the presence of a single molecular species in a complex mixture. A new strategy is emerging in the development of biosensor technologies: molecular-engineering techniques are being used to adapt the properties of proteins to simple, generic detector instrumentation, rather than adapting instruments to the unique requirements of a natural molecule.
Affiliation(s)
- H W Hellinga
- Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA
|