1
Sandhu M, Chen JZ, Matthews DS, Spence MA, Pulsford SB, Gall B, Kaczmarski JA, Nichols J, Tokuriki N, Jackson CJ. Computational and Experimental Exploration of Protein Fitness Landscapes: Navigating Smooth and Rugged Terrains. Biochemistry 2025; 64:1673-1684. [PMID: 40132127 DOI: 10.1021/acs.biochem.4c00673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2025]
Abstract
Proteins evolve through complex sequence spaces, with fitness landscapes serving as a conceptual framework that links sequence to function. Fitness landscapes can be smooth, where multiple similarly accessible evolutionary paths are available, or rugged, where the presence of multiple local fitness optima complicates evolution and prediction. Indeed, many proteins, especially those with complex functions or under multiple selection pressures, exist on rugged fitness landscapes. Here we discuss the theoretical framework that underpins our understanding of fitness landscapes, alongside recent work that has advanced our understanding, particularly of the biophysical basis for smoothness versus ruggedness. Finally, we address the rapid advances that have been made in computational and experimental exploration and exploitation of fitness landscapes, and how these can identify efficient routes to protein optimization.
Affiliation(s)
- Mahakaran Sandhu
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- John Z Chen
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence in Synthetic Biology, Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- Dana S Matthews
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Matthew A Spence
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Sacha B Pulsford
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Barnabas Gall
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Joe A Kaczmarski
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence in Synthetic Biology, Research School of Biology, Australian National University, Canberra ACT 2601, Australia
- James Nichols
- Biological Data Science Institute, Australian National University, Canberra ACT 2601, Australia
- Nobuhiko Tokuriki
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Colin J Jackson
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Biological Data Science Institute, Australian National University, Canberra ACT 2601, Australia
- ARC Centre of Excellence in Synthetic Biology, Research School of Biology, Australian National University, Canberra ACT 2601, Australia
2
Luppino F, Lenz S, Chow CFW, Toth-Petroczy A. Deep learning tools predict variants in disordered regions with lower sensitivity. BMC Genomics 2025; 26:367. [PMID: 40221640 PMCID: PMC11992697 DOI: 10.1186/s12864-025-11534-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2025] [Accepted: 03/27/2025] [Indexed: 04/14/2025] Open
Abstract
BACKGROUND The recent AI breakthrough of AlphaFold2 has revolutionized 3D protein structural modeling, proving crucial for protein design and variant effect prediction. However, intrinsically disordered regions, known for their lack of well-defined structure and lower sequence conservation, often yield low-confidence models. The latest Variant Effect Predictor (VEP), AlphaMissense, leverages AlphaFold2 models, achieving over 90% sensitivity and specificity in predicting variant effects. However, the effectiveness of such tools for variants in disordered regions, which account for 30% of the human proteome, remains unclear. RESULTS In this study, we found that predicting pathogenicity for variants in disordered regions is less accurate than in ordered regions, particularly for mutations at the first N-Methionine site. Investigations into the efficacy of variant effect predictors on intrinsically disordered regions (IDRs) indicated that mutations in IDRs are predicted with lower sensitivity and that the gap between sensitivity and specificity is largest in disordered regions, especially for AlphaMissense and VARITY. CONCLUSIONS The prevalence of IDRs within the human proteome, coupled with the increasing repertoire of biological functions they are known to perform, necessitated an investigation into the efficacy of state-of-the-art VEPs on such regions. This analysis revealed their consistently reduced sensitivity and a prediction performance profile that differs from that of ordered regions, indicating that new IDR-specific features and paradigms are needed to accurately classify disease mutations within those regions.
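As a reading aid for the sensitivity and specificity figures quoted in this abstract, the two metrics can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function name and the toy labels are hypothetical, and the encoding 1 = pathogenic, 0 = benign is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (not from the source): 1 = pathogenic, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pathogenic, called pathogenic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pathogenic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called pathogenic
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels, not data from the study.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, preds)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} gap={spec - sens:.2f}")
```

A predictor with lower sensitivity than specificity, as reported here for disordered regions, misses pathogenic variants more often than it flags benign ones.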
Affiliation(s)
- Federica Luppino
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Center for Systems Biology Dresden, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Swantje Lenz
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Center for Systems Biology Dresden, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Chi Fung Willis Chow
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Center for Systems Biology Dresden, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Cluster of Excellence Physics of Life, TU Dresden, 01062, Dresden, Germany
- Agnes Toth-Petroczy
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Center for Systems Biology Dresden, Pfotenhauerstrasse 108, 01307, Dresden, Germany
- Cluster of Excellence Physics of Life, TU Dresden, 01062, Dresden, Germany
3
Chow CFW, Lenz S, Scheremetjew M, Ghosh S, Richter D, Jegers C, von Appen A, Alberti S, Toth-Petroczy A. SHARK-capture identifies functional motifs in intrinsically disordered protein regions. Protein Sci 2025; 34:e70091. [PMID: 40100159 PMCID: PMC11917139 DOI: 10.1002/pro.70091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Revised: 01/30/2025] [Accepted: 02/20/2025] [Indexed: 03/20/2025]
Abstract
Increasing insights into how sequence motifs in intrinsically disordered regions (IDRs) provide functions underscore the need for systematic motif detection. Contrary to structured regions where motifs can be readily identified from sequence alignments, the rapid evolution of IDRs limits the usage of alignment-based tools in reliably detecting motifs within. Here, we developed SHARK-capture, an alignment-free motif detection tool designed for difficult-to-align regions. SHARK-capture innovates on word-based methods by flexibly incorporating amino acid physicochemistry to assess motif similarity without requiring rigid definitions of equivalency groups. SHARK-capture offers consistently strong performance in a systematic benchmark, with superior residue-level performance. SHARK-capture identified known functional motifs across orthologs of the microtubule-associated zinc finger protein BuGZ. We also identified a short motif in the IDR of S. cerevisiae RNA helicase Ded1p, which we experimentally verified to be capable of promoting ATPase activity. Our improved performance allows us to systematically calculate 10,889 motifs for 2695 yeast IDRs and provide it as a resource. SHARK-capture offers the most precise tool yet for the systematic identification of conserved regions in IDRs and is freely available as a Python package (https://pypi.org/project/bio-shark/) and on https://git.mpi-cbg.de/tothpetroczylab/shark.
Affiliation(s)
- Chi Fung Willis Chow
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden, Germany
- Swantje Lenz
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Maxim Scheremetjew
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Soumyadeep Ghosh
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Doris Richter
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Ceciel Jegers
- Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden, Germany
- Biotechnology Center (BIOTEC), Center for Molecular and Cellular Bioengineering, Technische Universität Dresden, Dresden, Germany
- Alexander von Appen
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Simon Alberti
- Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden, Germany
- Biotechnology Center (BIOTEC), Center for Molecular and Cellular Bioengineering, Technische Universität Dresden, Dresden, Germany
- Agnes Toth-Petroczy
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden, Germany
4
Yang Y, Braga MV, Dean MD. Insertion-Deletion Events Are Depleted in Protein Regions with Predicted Secondary Structure. Genome Biol Evol 2024; 16:evae093. [PMID: 38735759 PMCID: PMC11102076 DOI: 10.1093/gbe/evae093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 04/16/2024] [Accepted: 04/21/2024] [Indexed: 05/14/2024] Open
Abstract
A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
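The phrase "54% as much as expected" is an observed-to-expected ratio. The sketch below shows one simple way such a ratio could be computed, assuming a uniform-placement null in which each indel lands in secondary structure with probability equal to the fraction of aligned positions annotated as secondary structure. The abstract does not describe the authors' actual null model, so this null, the function name, and the toy counts are all assumptions.

```python
def observed_to_expected(n_in_ss, n_total, ss_positions, total_positions):
    """Ratio of observed indels in secondary structure to the number expected
    if indels were placed uniformly at random (an assumed, simplified null)."""
    expected = n_total * ss_positions / total_positions
    return n_in_ss / expected


# Hypothetical toy counts chosen to give a ratio of 0.54; not data from the study.
ratio = observed_to_expected(n_in_ss=27, n_total=100,
                             ss_positions=500, total_positions=1000)
print(f"observed/expected = {ratio:.2f}")
```

A ratio below 1 indicates depletion relative to the null and a ratio above 1 indicates enrichment.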
Affiliation(s)
- Yi Yang
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew V Braga
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew D Dean
- Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
5
Bouvier JW, Emms DM, Kelly S. Rubisco is evolving for improved catalytic efficiency and CO2 assimilation in plants. Proc Natl Acad Sci U S A 2024; 121:e2321050121. [PMID: 38442173 PMCID: PMC10945770 DOI: 10.1073/pnas.2321050121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 01/25/2024] [Indexed: 03/07/2024] Open
Abstract
Rubisco is the primary entry point for carbon into the biosphere. However, rubisco is widely regarded as inefficient, leading many to question whether the enzyme can adapt to become a better catalyst. Through a phylogenetic investigation of the molecular and kinetic evolution of Form I rubisco, we uncover the evolutionary trajectory of rubisco kinetic evolution in angiosperms. We show that rbcL is among the 1% of slowest-evolving genes and enzymes on Earth, accumulating one nucleotide substitution every 0.9 My and one amino acid mutation every 7.2 My. Despite this, rubisco catalysis has been continually evolving toward improved CO2/O2 specificity, carboxylase turnover, and carboxylation efficiency. Consistent with this kinetic adaptation, increased rubisco evolution has led to a concomitant improvement in leaf-level CO2 assimilation. Thus, rubisco has been slowly but continually evolving toward improved catalytic efficiency and CO2 assimilation in plants.
Affiliation(s)
- Jacques W Bouvier
- Department of Biology, University of Oxford, Oxford OX1 3RB, United Kingdom
- David M Emms
- Department of Biology, University of Oxford, Oxford OX1 3RB, United Kingdom
- Steven Kelly
- Department of Biology, University of Oxford, Oxford OX1 3RB, United Kingdom
6
Luppino F, Adzhubei IA, Cassa CA, Toth-Petroczy A. DeMAG predicts the effects of variants in clinically actionable genes by integrating structural and evolutionary epistatic features. Nat Commun 2023; 14:2230. [PMID: 37076482 PMCID: PMC10115847 DOI: 10.1038/s41467-023-37661-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/25/2022] [Accepted: 03/27/2023] [Indexed: 04/21/2023] Open
Abstract
Despite the increasing use of genomic sequencing in clinical practice, the interpretation of rare genetic variants remains challenging even in well-studied disease genes, resulting in many patients with Variants of Uncertain Significance (VUSs). Computational Variant Effect Predictors (VEPs) provide valuable evidence in variant assessment, but they are prone to misclassifying benign variants, contributing to false positives. Here, we develop Deciphering Mutations in Actionable Genes (DeMAG), a supervised classifier for missense variants trained using extensive diagnostic data available in 59 actionable disease genes (American College of Medical Genetics and Genomics Secondary Findings v2.0, ACMG SF v2.0). DeMAG improves performance over existing VEPs by reaching balanced specificity (82%) and sensitivity (94%) on clinical data, and includes a novel epistatic feature, the 'partners score', which leverages evolutionary and structural partnerships of residues. The 'partners score' provides a general framework for modeling epistatic interactions, integrating both clinical and functional information. We provide our tool and predictions for all missense variants in 316 clinically actionable disease genes (demag.org) to facilitate the interpretation of variants and improve clinical decision-making.
Affiliation(s)
- Federica Luppino
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
- Center for Systems Biology Dresden, 01307, Dresden, Germany
- Ivan A Adzhubei
- Brigham and Women's Hospital Division of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Christopher A Cassa
- Brigham and Women's Hospital Division of Genetics, Harvard Medical School, Boston, MA, 02115, USA
- Agnes Toth-Petroczy
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
- Center for Systems Biology Dresden, 01307, Dresden, Germany
- Cluster of Excellence Physics of Life, TU Dresden, 01062, Dresden, Germany
7
Li ZL, Buck M. A proteome-scale analysis of vertebrate protein amino acid occurrence: Thermoadaptation shows a correlation with protein solvation but less so with dynamics. Proteins 2023; 91:3-15. [PMID: 36053994 PMCID: PMC10087973 DOI: 10.1002/prot.26404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Revised: 07/06/2022] [Accepted: 07/25/2022] [Indexed: 12/15/2022]
Abstract
Despite differences in behaviors and living conditions, vertebrate organisms share the great majority of proteins, often with subtle differences in amino acid sequence. Here, we present a simple way to analyze the difference in amino acid occurrence by comparing highly homologous proteins on a subproteome level between several vertebrate model organisms. Specifically, we use this method to identify a pattern of amino acid conservation as well as a shift in amino acid occurrence between homeotherms (warm-blooded species) and poikilotherms (cold-blooded species). Importantly, this general analysis and a specific example further establish a broad correlation, if not likely connection between the thermal adaptation of protein sequences and two of their physical features: on average a change in their protein dynamics and, even more strongly, in their solvation. For poikilotherms, such as frog and fish, the lower body temperature is expected to increase the protein-protein interaction due to a decrease in protein internal dynamics. In order to counteract the tendency for enhanced binding caused by low temperatures, poikilotherms enhance the solvation of their proteins by favoring polar amino acids. This feature appears to dominate over possible changes in dynamics for some proteins. The results suggest that a general trend for amino acid choice is part of the mechanism for thermoadaptation of vertebrate organisms at the molecular level.
Affiliation(s)
- Zhen-Lu Li
- School of Life Science, Tianjin University, Tianjin, China
- Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Matthias Buck
- Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- Departments of Pharmacology and of Neurosciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
8
Romero ML, Garcia Seisdedos H, Ibarra‐Molero B. Active site center redesign increases protein stability preserving catalysis in thioredoxin. Protein Sci 2022; 31:e4417. [PMID: 39287965 PMCID: PMC9601870 DOI: 10.1002/pro.4417] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2022] [Revised: 07/15/2022] [Accepted: 07/31/2022] [Indexed: 11/08/2022]
Abstract
The stabilization of natural proteins is a long-standing goal in protein engineering. Optimizing the hydrophobicity of the protein core often results in extensive stability enhancements. However, the presence of totally or partially buried catalytic charged residues, essential for protein function, has limited the applicability of this strategy. Here, focusing on thioredoxin, we aimed to augment protein stability by removing buried charged residues in the active site without loss of catalytic activity. To this end, we performed a charged-to-hydrophobic substitution of a buried and functional group, resulting in a significant stability increase yet abolishing catalytic activity. Then, to simulate the catalytic role of the buried ionizable group, we designed a combinatorial library of variants targeting a set of seven surface residues adjacent to the active site. Notably, more than 50% of the library variants restored, to some extent, the catalytic activity. The combination of experimental study of 2% of the library with the prediction of the whole mutational space by partial least squares regression revealed that a single point mutation at the protein surface is sufficient to fully restore the catalytic activity without thermostability cost. As a result, we engineered one of the highest thermal stabilities reported for a protein with a naturally occurring fold (137°C). Further, our hyperstable variant preserves the catalytic activity both in vitro and in vivo.
Affiliation(s)
- Maria Luisa Romero
- Departamento de Química Física, Universidad de Granada, Granada
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology Dresden, Dresden, Germany
- Hector Garcia Seisdedos
- Departamento de Química Física, Universidad de Granada, Granada
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Department of Structural Biology, Instituto de Biologia Molecular de Barcelona (IBMB‐CSIC), Barcelona, Spain
- Beatriz Ibarra‐Molero
- Departamento de Química Física, Universidad de Granada, Granada
- Department of Structural Biology, Instituto de Biologia Molecular de Barcelona (IBMB‐CSIC), Barcelona, Spain
9
Shuler G, Hagai T. Rapidly evolving viral motifs mostly target biophysically constrained binding pockets of host proteins. Cell Rep 2022; 40:111212. [PMID: 35977510 DOI: 10.1016/j.celrep.2022.111212] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/22/2022] [Revised: 06/11/2022] [Accepted: 07/22/2022] [Indexed: 11/28/2022] Open
Abstract
Evolutionary changes in host-virus interactions can alter the course of infection, but the biophysical and regulatory constraints that shape interface evolution remain largely unexplored. Here, we focus on viral mimicry of host-like motifs that allow binding to host domains and modulation of cellular pathways. We observe that motifs from unrelated viruses preferentially target conserved, widely expressed, and highly connected host proteins, enriched with regulatory and essential functions. The interface residues within these host domains are more conserved and bind a larger number of cellular proteins than similar motif-binding domains that are not known to interact with viruses. In contrast, rapidly evolving viral-binding human proteins form few interactions with other cellular proteins and display high tissue specificity, and their interfaces have few inter-residue contacts. Our results distinguish between conserved and rapidly evolving host-virus interfaces and show how various factors limit host capacity to evolve, allowing for efficient viral subversion of host machineries.
Affiliation(s)
- Gal Shuler
- Shmunis School of Biomedicine and Cancer Research, George S Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Tzachi Hagai
- Shmunis School of Biomedicine and Cancer Research, George S Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
10
Matsumura I, Patrick WM. Dan Tawfik's Lessons for Protein Engineers about Enzymes Adapting to New Substrates. Biochemistry 2022; 62:158-162. [PMID: 35820168 PMCID: PMC9851151 DOI: 10.1021/acs.biochem.2c00230] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
Abstract
Natural evolution has been creating new complex systems for billions of years. The process is spontaneous and requires neither intelligence nor moral purpose but is nevertheless difficult to understand. The late Dan Tawfik spent years studying enzymes as they adapted to recognize new substrates. Much of his work focused on gaining fundamental insights, so the practical utility of his experiments may not be obvious even to accomplished protein engineers. Here we focus on two questions fundamental to any directed evolution experiment. Which proteins are the best starting points for such experiments? Which trait(s) of the chosen parental protein should be evolved to achieve the desired outcome? We summarize Tawfik's contributions to our understanding of these problems, to honor his memory and encourage those unfamiliar with his ideas to read his publications.
Affiliation(s)
- Ichiro Matsumura
- O. Wayne Rollins Research Center, 1510 Clifton Road NE, Room 4001, Atlanta, Georgia 30322, United States
- Wayne M. Patrick
- Centre for Biodiscovery, School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
11
Hassler HB, Probert B, Moore C, Lawson E, Jackson RW, Russell BT, Richards VP. Phylogenies of the 16S rRNA gene and its hypervariable regions lack concordance with core genome phylogenies. Microbiome 2022; 10:104. [PMID: 35799218 PMCID: PMC9264627 DOI: 10.1186/s40168-022-01295-y] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 07/12/2021] [Accepted: 05/23/2022] [Indexed: 05/02/2023]
Abstract
BACKGROUND The 16S rRNA gene is used extensively in bacterial phylogenetics, in species delineation, and now widely in microbiome studies. However, the gene suffers from intragenomic heterogeneity, and reports of recombination and an unreliable phylogenetic signal are accumulating. Here, we compare core gene phylogenies to phylogenies constructed using core gene concatenations to estimate the strength of signal for the 16S rRNA gene, its hypervariable regions, and all core genes at the intra- and inter-genus levels. Specifically, we perform four intra-genus analyses (Clostridium, n = 65; Legionella, n = 47; Staphylococcus, n = 36; and Campylobacter, n = 17) and one inter-genus analysis [41 core genera of the human gut microbiome (31 families, 17 orders, and 12 classes), n = 82]. RESULTS At both taxonomic levels, the 16S rRNA gene was recombinant and subject to horizontal gene transfer. At the intra-genus level, the gene showed one of the lowest levels of concordance with the core genome phylogeny (50.7% average). Concordance for hypervariable regions was lower still, with entropy masking providing little to no benefit. A major factor influencing concordance was SNP count, which showed a positive logarithmic association. Using this relationship, we determined that 690 ± 110 SNPs were required for 80% concordance (average 16S rRNA gene SNP count was 254). We also found a wide range in 16S-23S-5S rRNA operon copy number among genomes (1-27). At the inter-genus level, concordance for the whole 16S rRNA gene was markedly higher (73.8% - 10th out of 49 loci); however, the most concordant hypervariable regions (V4, V3-V4, and V1-V2) ranked in the third quartile (62.5 to 60.0%). CONCLUSIONS Ramifications of a poor phylogenetic performance for the 16S rRNA gene are far reaching. 
For example, in addition to incorrect species/strain delineation and phylogenetic inference, it has the potential to confound community diversity metrics if phylogenetic information is incorporated - for example, with popular approaches such as Faith's phylogenetic diversity and UniFrac. Our results highlight the problematic nature of these approaches and their use (along with entropy masking) is discouraged. Lastly, the wide range in 16S rRNA gene copy number among genomes also has a strong potential to confound diversity metrics.
Affiliation(s)
- Hayley B. Hassler
- Department of Biological Sciences, College of Science, Clemson University, Clemson, SC 29634 USA
- Brett Probert
- Department of Biological Sciences, College of Science, Clemson University, Clemson, SC 29634 USA
- Carson Moore
- Department of Biological Sciences, College of Science, Clemson University, Clemson, SC 29634 USA
- Elizabeth Lawson
- Department of Biological Sciences, College of Science, Clemson University, Clemson, SC 29634 USA
- Brook T. Russell
- School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC 29634 USA
- Vincent P. Richards
- Department of Biological Sciences, College of Science, Clemson University, Clemson, SC 29634 USA
12
Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Collapse
Affiliation(s)
- Vijay Jayaraman
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Saacnicteh Toledo‐Patiño
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
| | - Lianet Noda‐García
- Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
| | - Paola Laurino
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
|
13
|
Biswas G, Ghosh S, Basu S, Bhattacharyya D, Datta AK, Banerjee R. Can the jigsaw puzzle model of protein folding re‐assemble a hydrophobic core? Proteins 2022; 90:1390-1412. [DOI: 10.1002/prot.26321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/30/2021] [Revised: 01/11/2022] [Accepted: 01/28/2022] [Indexed: 12/30/2022]
Affiliation(s)
- Gargi Biswas
- Saha Institute of Nuclear Physics, Kolkata, India
- Homi Bhabha National Institute, Mumbai, India
| | | | - Sankar Basu
- Saha Institute of Nuclear Physics, Kolkata, India
| | | | | | - Rahul Banerjee
- Saha Institute of Nuclear Physics, Kolkata, India
- Homi Bhabha National Institute, Mumbai, India
|
14
|
Jackson C, Toth-Petroczy A, Kolodny R, Hollfelder F, Fuxreiter M, Kamerlin SCL, Tokuriki N. Adventures on the routes of protein evolution — in memoriam Dan Salah Tawfik (1955 - 2021). J Mol Biol 2022; 434:167462. [DOI: 10.1016/j.jmb.2022.167462] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/06/2022] [Accepted: 01/17/2022] [Indexed: 12/21/2022]
|
15
|
Wang X, Wong LM, McElvain ME, Martire S, Lee WH, Li CZ, Fisher FA, Maheshwari RL, Wu ML, Imun MC, Murad R, Warshaviak DT, Yin J, Kamb A, Xu H. A rational approach to assess off-target reactivity of a dual-signal integrator for T cell therapy. Toxicol Appl Pharmacol 2022; 437:115894. [DOI: 10.1016/j.taap.2022.115894] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2021] [Revised: 01/15/2022] [Accepted: 01/19/2022] [Indexed: 01/16/2023]
|
16
|
Dubreuil B, Levy ED. Abundance Imparts Evolutionary Constraints of Similar Magnitude on the Buried, Surface, and Disordered Regions of Proteins. Front Mol Biosci 2021; 8:626729. [PMID: 33996892 PMCID: PMC8119896 DOI: 10.3389/fmolb.2021.626729] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/06/2020] [Accepted: 03/29/2021] [Indexed: 12/02/2022] Open
Abstract
An understanding of the forces shaping protein conservation is key, both for the fundamental knowledge it represents and to allow for optimal use of evolutionary information in practical applications. Sequence conservation is typically examined at one of two levels. The first is the residue level, where intra-protein differences are analyzed, and the second is the protein level, where inter-protein differences are studied. At the residue level, we know that solvent-accessibility is a prime determinant of conservation. By inverting this logic, we inferred that disordered regions are slightly more solvent-accessible on average than the most exposed surface residues in domains. By integrating abundance information with evolutionary data within and across proteins, we confirmed a previously reported strong surface-core association in the evolution of structured regions, but we found a comparatively weak association between disordered and structured regions. The facts that disordered and structured regions experience different structural constraints and evolve independently provide a unique setup to examine an outstanding question: why is a protein’s abundance the main determinant of its sequence conservation? Indeed, any structural or biophysical property linked to the abundance-conservation relationship should increase the relative conservation of regions concerned with that property (e.g., disordered residues with mis-interactions, domain residues with misfolding). Surprisingly, however, we found the conservation of disordered and structured regions to increase in equal proportion with abundance. This observation implies that either abundance-related constraints are structure-independent, or multiple constraints apply to different regions and perfectly balance each other.
Affiliation(s)
- Benjamin Dubreuil
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Emmanuel D Levy
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
|
17
|
Sharir-Ivry A, Xia Y. Quantifying evolutionary importance of protein sites: A Tale of two measures. PLoS Genet 2021; 17:e1009476. [PMID: 33826605 PMCID: PMC8026052 DOI: 10.1371/journal.pgen.1009476] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/18/2020] [Accepted: 03/09/2021] [Indexed: 12/05/2022] Open
Abstract
A key challenge in evolutionary biology is the accurate quantification of selective pressure on proteins and other biological macromolecules at single-site resolution. The evolutionary importance of a protein site under purifying selection is typically measured by the degree of conservation of the protein site itself. A possible alternative measure is the strength of the site-induced conservation gradient in the rest of the protein structure. However, the quantitative relationship between these two measures remains unknown. Here, we show that despite major differences, there is a strong linear relationship between the two measures such that more conserved protein sites also induce a stronger conservation gradient in the rest of the protein. This linear relationship is universal as it holds for different types of proteins and functional sites in proteins. Our results show that the strong selective pressure acting on the functional site in general percolates through the rest of the protein via residue-residue contacts. Surprisingly, however, catalytic sites in enzymes are the principal exception to this rule. Catalytic sites induce significantly stronger conservation gradients in the rest of the protein than expected from the degree of conservation of the site alone. The unique requirement for the active site to selectively stabilize the transition state of the catalyzed chemical reaction imposes additional selective constraints on the rest of the enzyme.
Sites within proteins which are important for stability or function are under stronger selective pressure and evolve more slowly than other sites. Catalytic sites in enzymes are such highly conserved sites with relatively low evolutionary rates. Recently, catalytic sites were shown to induce a strong gradient of conservation such that the closer a residue is to the catalytic site, the more conserved it is. Here we show that there is a universal linear relationship between the degree of evolutionary conservation of a protein site and the conservation gradient it induces in the protein tertiary structure, applicable to all types of sites. Our findings suggest that selective pressure acting on a protein site generally percolates through the rest of the protein via residue-residue contacts. Remarkably, however, catalytic sites induce significantly stronger conservation gradients than expected from their degree of conservation alone. Our results indicate that the strong conservation gradient induced by catalytic sites is driven by the unique function of enzyme catalysis, which requires the participation of many residues beyond the few key catalytic residues. Our results provide insights into evolutionary conservation patterns of, and surrounding, protein functional sites, with implications for functional site prediction and protein design.
Affiliation(s)
- Avital Sharir-Ivry
- Department of Bioengineering, McGill University, Montreal, Quebec, Canada
| | - Yu Xia
- Department of Bioengineering, McGill University, Montreal, Quebec, Canada
|
18
|
Wang CK, Craik DJ. Linking molecular evolution to molecular grafting. J Biol Chem 2021; 296:100425. [PMID: 33600801 PMCID: PMC8005815 DOI: 10.1016/j.jbc.2021.100425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2020] [Revised: 02/09/2021] [Accepted: 02/13/2021] [Indexed: 12/01/2022] Open
Abstract
Molecular grafting is a strategy for the engineering of molecular scaffolds into new functional agents, such as next-generation therapeutics. Despite its wide use, studies so far have focused almost exclusively on demonstrating its utility rather than understanding the factors that lead to either poor or successful grafting outcomes. Here, we examine protein evolution and identify parallels between the natural process of protein functional diversification and the artificial process of molecular grafting. We discuss features of natural proteins that are correlated with innovability (the capacity to acquire new functions) and describe their implications for molecular grafting scaffolds. Disulfide-rich peptides are used as exemplars because they are particularly promising scaffolds onto which new functions can be grafted. This article provides a perspective on why some scaffolds are more suitable for grafting than others, identifying opportunities for improving molecular grafting.
Affiliation(s)
- Conan K Wang
- Institute for Molecular Bioscience and Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, The University of Queensland, Brisbane, Queensland, Australia.
| | - David J Craik
- Institute for Molecular Bioscience and Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, The University of Queensland, Brisbane, Queensland, Australia
|
19
|
Outer membrane protein evolution. Curr Opin Struct Biol 2021; 68:122-128. [PMID: 33493965 DOI: 10.1016/j.sbi.2021.01.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/02/2020] [Revised: 12/16/2020] [Accepted: 01/02/2021] [Indexed: 01/31/2023]
Abstract
Outer membrane proteins have remarkably homogeneous structure. They are all up down β-barrels. Up down barrels themselves are composed of repeated sets of β-hairpins. The consistency of the usage of the β-hairpin throughout the outer membrane milieu allows for interrogation of the evolution of these repetitive structures. Here we describe recent investigations of outer membrane protein evolution and how evolutionary precepts have been used for novel outer membrane protein design.
|
20
|
Pontes C, Ruiz-Serra V, Lepore R, Valencia A. Unraveling the molecular basis of host cell receptor usage in SARS-CoV-2 and other human pathogenic β-CoVs. Comput Struct Biotechnol J 2021; 19:759-766. [PMID: 33456724 PMCID: PMC7802526 DOI: 10.1016/j.csbj.2021.01.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/29/2020] [Revised: 01/07/2021] [Accepted: 01/07/2021] [Indexed: 01/13/2023] Open
Abstract
The recent emergence of the novel SARS-CoV-2 in China and its rapid spread in the human population has led to a public health crisis worldwide. As with SARS-CoV, horseshoe bats currently represent the most likely candidate animal source for SARS-CoV-2. Yet, the specific mechanisms of cross-species transmission and adaptation to the human host remain unknown. Here we show that the unsupervised analysis of conservation patterns across the β-CoV spike protein family, using sequence information alone, can provide valuable insights into the molecular basis of the specificity of β-CoVs to different host cell receptors. More precisely, our results indicate that host cell receptor usage is encoded in the amino acid sequences of different CoV spike proteins in the form of a set of specificity determining positions (SDPs). Furthermore, by integrating structural data, in silico mutagenesis and coevolution analysis we could elucidate the role of SDPs in mediating ACE2 binding across the Sarbecovirus lineage, either by engaging the receptor through direct intermolecular interactions or by affecting the local environment of the receptor binding motif. Finally, by the analysis of coevolving mutations across a paired MSA we were able to identify key intermolecular contacts occurring at the spike-ACE2 interface. These results show that effective mining of the evolutionary records held in the sequence of the spike protein family can help trace the molecular mechanisms behind the evolution and host-receptor adaptation of circulating and future novel β-CoVs.
Key Words
- APC, average product correction
- CoVs, Coronaviruses
- EV, evolutionary rate
- Functional specificity
- MCA, multiple correspondence analysis
- MI, mutual information
- MSA, multiple sequence alignment
- NTD, N-terminal domain
- Phylogenetic analysis
- Protein subfamilies
- RBD, receptor binding domain
- RBM, receptor binding motif
- SARS-CoV-2
- SDPs, specificity determining positions
- Specificity Determining Positions
- Spike protein evolution
- hACE2, human angiotensin converting enzyme 2
Affiliation(s)
- Camila Pontes
- Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- University of Brasília (UnB), 70910-900, Brasília - DF, Brazil
| | | | - Rosalba Lepore
- Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
21
|
Tanaka SI, Tsutaki M, Yamamoto S, Mizutani H, Kurahashi R, Hirata A, Takano K. Exploring mutable conserved sites and fatal non-conserved sites by random mutation of esterase from Sulfolobus tokodaii and subtilisin from Thermococcus kodakarensis. Int J Biol Macromol 2020; 170:343-353. [PMID: 33383075 DOI: 10.1016/j.ijbiomac.2020.12.171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2020] [Revised: 12/21/2020] [Accepted: 12/22/2020] [Indexed: 10/22/2022]
Abstract
Homologous proteins differ in their amino acid sequences at several positions. Generally, conserved sites are recognized as not suitable for amino acid substitution, and thus in evolutionary protein engineering, non-conserved sites are often selected as mutation sites. However, there have also been reports of possible mutations in conserved sites. In this study, we explored mutable conserved sites and immutable non-conserved sites by testing random mutations of two thermostable proteins, an esterase from Sulfolobus tokodaii (Sto-Est) and a subtilisin from Thermococcus kodakarensis (Tko-Sub). The subtilisin domain of Tko-Sub needs Ca2+ ions and the propeptide domain for stability, folding and maturation. The results from the two proteins showed that about one-third of the mutable sites were detected in conserved sites and some non-conserved sites lost enzymatic activity at high temperatures due to mutation. Of the conserved sites in Sto-Est, the sites on the loop, on the surface, and far from the active site are more resistant to mutation. In Tko-Sub, the sites flanking Ca2+-binding sites and propeptide were undesirable for mutation. The results presented here serve as an index for selecting mutation sites and contribute to the expansion of available sequence range by introducing mutations at conserved sites.
Affiliation(s)
- Shun-Ichi Tanaka
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
| | - Minami Tsutaki
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
| | - Seira Yamamoto
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
| | - Hayate Mizutani
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
| | - Ryo Kurahashi
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
| | - Azumi Hirata
- Department of Anatomy and Cell Biology, Osaka Medical College, Daigaku-machi, Takatsuki, Osaka 569-8686, Japan
| | - Kazufumi Takano
- Department of Biomolecular Chemistry, Kyoto Prefectural University, Hangi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan.
|
22
|
Palopoli N, Marchetti J, Monzon AM, Zea DJ, Tosatto SCE, Fornasari MS, Parisi G. Intrinsically Disordered Protein Ensembles Shape Evolutionary Rates Revealing Conformational Patterns. J Mol Biol 2020; 433:166751. [PMID: 33310020 DOI: 10.1016/j.jmb.2020.166751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/01/2020] [Revised: 12/01/2020] [Accepted: 12/05/2020] [Indexed: 10/22/2022]
Abstract
Intrinsically disordered proteins (IDPs) lack stable tertiary structure under physiological conditions. The unique composition and complex dynamical behaviour of IDPs make them a challenge for structural biology and molecular evolution studies. Using NMR ensembles, we found that IDPs evolve under a strong site-specific evolutionary rate heterogeneity, mainly originated by different constraints derived from their inter-residue contacts. Evolutionary rate profiles correlate with the experimentally observed conformational diversity of the protein, allowing the description of different conformational patterns possibly related to their structure-function relationships. The correlation between evolutionary rates and contact information improves when structural information is taken not from any individual conformer or the whole ensemble, but from combining a limited number of conformers. Our results suggest that residue contacts in disordered regions constrain evolutionary rates to conserve the dynamic behaviour of the ensemble and that evolutionary rates can be used as a proxy for the conformational diversity of IDPs.
Affiliation(s)
- Nicolas Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
| | - Julia Marchetti
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
| | | | - Diego J Zea
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
| | | | - Maria S Fornasari
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina.
|
23
|
Mozzi A, Forni D, Cagliani R, Clerici M, Pozzoli U, Sironi M. Intrinsically disordered regions are abundant in simplexvirus proteomes and display signatures of positive selection. Virus Evol 2020; 6:veaa028. [PMID: 32411391 PMCID: PMC7211401 DOI: 10.1093/ve/veaa028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/08/2023] Open
Abstract
Whereas the majority of herpesviruses co-speciated with their mammalian hosts, human herpes simplex virus 2 (HSV-2, genus Simplexvirus) most likely originated from the cross-species transmission of chimpanzee herpesvirus 1 to an ancestor of modern humans. We exploited the peculiar evolutionary history of HSV-2 to investigate the selective events that drove herpesvirus adaptation to a new host. We show that HSV-2 intrinsically disordered regions (IDRs), that is, protein domains that do not adopt compact three-dimensional structures, are strongly enriched in positive selection signals. Analysis of viral proteomes indicated that a significantly higher portion of simplexvirus proteins is disordered compared with the proteins of other human herpesviruses. IDR abundance in simplexvirus proteomes was not a consequence of the base composition of their genomes (high G + C content). Conversely, protein function determines the IDR fraction, which is significantly higher in viral proteins that interact with human factors. We also found that the average extent of disorder in herpesvirus proteins tends to parallel that of their human interactors. These data suggest that viruses that interact with fast-evolving, disordered human proteins, in turn, evolve disordered viral interactors poised for innovation. We propose that the high IDR fraction present in simplexvirus proteomes contributes to their wider host range compared with other herpesviruses.
Affiliation(s)
- Alessandra Mozzi
- Scientific Institute, IRCCS E. MEDEA, Bioinformatics, Bosisio Parini 23842, Italy
| | - Diego Forni
- Scientific Institute, IRCCS E. MEDEA, Bioinformatics, Bosisio Parini 23842, Italy
| | - Rachele Cagliani
- Scientific Institute, IRCCS E. MEDEA, Bioinformatics, Bosisio Parini 23842, Italy
| | - Mario Clerici
- Department of Physiopathology and Transplantation, University of Milan, Milan 20090, Italy; Don C. Gnocchi Foundation ONLUS, IRCCS, Milan 20148, Italy
| | - Uberto Pozzoli
- Scientific Institute, IRCCS E. MEDEA, Bioinformatics, Bosisio Parini 23842, Italy
| | - Manuela Sironi
- Scientific Institute, IRCCS E. MEDEA, Bioinformatics, Bosisio Parini 23842, Italy
|
24
|
Abstract
The distribution of fitness effects of mutation plays a central role in constraining protein evolution. The underlying mechanisms by which mutations lead to fitness effects are typically attributed to changes in protein specific activity or abundance. Here, we reveal the importance of a mutation's collateral fitness effects, which we define as effects that do not derive from changes in the protein's ability to perform its physiological function. We comprehensively measured the collateral fitness effects of missense mutations in the Escherichia coli TEM-1 β-lactamase antibiotic resistance gene using growth competition experiments in the absence of antibiotic. At least 42% of missense mutations in TEM-1 were deleterious, indicating that for some proteins collateral fitness effects occur as frequently as effects on protein activity and abundance. Deleterious mutations caused improper posttranslational processing, incorrect disulfide-bond formation, protein aggregation, changes in gene expression, and pleiotropic effects on cell phenotype. Deleterious collateral fitness effects occurred more frequently in TEM-1 than deleterious effects on antibiotic resistance in environments with low concentrations of the antibiotic. The surprising prevalence of deleterious collateral fitness effects suggests they may play a role in constraining protein evolution, particularly for highly expressed proteins, for proteins under intermittent selection for their physiological function, and for proteins whose contribution to fitness is buffered against deleterious effects on protein activity and protein abundance.
|
25
|
Integrated structural and evolutionary analysis reveals common mechanisms underlying adaptive evolution in mammals. Proc Natl Acad Sci U S A 2020; 117:5977-5986. [PMID: 32123117 DOI: 10.1073/pnas.1916786117] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/12/2022] Open
Abstract
Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to colocalize on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens previously unexplored strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: Components of immune response and metabolic enzymes. This gives a coherent picture of pathogens and xenobiotics as important drivers of adaptive evolution of mammals.
|
26
|
Klesmith JR, Wu L, Lobb RR, Rennert PD, Hackel BJ. Fine Epitope Mapping of the CD19 Extracellular Domain Promotes Design. Biochemistry 2019; 58:4869-4881. [PMID: 31702909 DOI: 10.1021/acs.biochem.9b00808] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Abstract
The B-cell surface protein CD19 is present throughout the cell life cycle and is uniformly expressed in leukemias, making it a target for chimeric antigen receptor engineered immune cell therapy. Identifying the sequence dependence of the binding of CD19 to antibodies empowers fundamental study and more tailored development of CD19-targeted therapeutics. To identify the antibody-binding epitopes on CD19, we screened a comprehensive single-site saturation mutation library of the human CD19 extracellular domain to identify mutations detrimental to binding FMC63 (the dominant CD19 antibody used in chimeric antigen receptor development) as well as 4G7-2E3 and 3B10, which have been used in various types of CD19 research and development. All three antibodies had partially overlapping, yet distinct, epitopes near the published epitope of antibody B43. The FMC63 conformational epitope spans spatially adjacent, but genetically distant, loops in exons 3 and 4. The 3B10 epitope is a linear peptide sequence that binds CD19 with 440 pM affinity. Along with their primary goal of epitope mapping, the mutational tolerance data also empowered additional CD19 variant design and analysis. A designed CD19 variant with all N-linked glycosylation sites removed successfully bound antibody in the yeast display context, which provides a lead for aglycosylated applications. Screening for thermally stable variants identified mutations to guide further CD19 stabilization for fusion protein applications and revealed evolutionary affinity-stability trade-offs. These fundamental insights into CD19 sequence-function relationships enhance our understanding of antibody-mediated CD19-targeted therapeutics.
Affiliation(s)
- Justin R Klesmith
- Department of Chemical Engineering and Materials Science, University of Minnesota-Twin Cities, 421 Washington Avenue SE, Minneapolis, Minnesota 55455, United States
| | - Lan Wu
- Aleta Biotherapeutics, 27 Strathmore Road, Natick, Massachusetts 01760, United States
| | - Roy R Lobb
- Aleta Biotherapeutics, 27 Strathmore Road, Natick, Massachusetts 01760, United States
| | - Paul D Rennert
- Aleta Biotherapeutics, 27 Strathmore Road, Natick, Massachusetts 01760, United States
| | - Benjamin J Hackel
- Department of Chemical Engineering and Materials Science, University of Minnesota-Twin Cities, 421 Washington Avenue SE, Minneapolis, Minnesota 55455, United States
|
27
|
Sharir-Ivry A, Xia Y. Non-catalytic Binding Sites Induce Weaker Long-Range Evolutionary Rate Gradients than Catalytic Sites in Enzymes. J Mol Biol 2019; 431:3860-3870. [DOI: 10.1016/j.jmb.2019.07.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/29/2019] [Revised: 06/26/2019] [Accepted: 07/11/2019] [Indexed: 01/02/2023]
|
28
|
Abstract
Diffusional motion within the crowded environment of the cell is known to be crucial to cellular function as it drives the interactions of proteins. However, the relationships between protein diffusion, shape and interaction, and the evolutionary selection mechanisms that arise as a consequence, have not been investigated. Here, we study the dynamics of triaxial ellipsoids of equivalent steric volume to proteins at different aspect ratios and volume fractions using a combination of Brownian molecular dynamics and geometric packing. In general, proteins are found to have a shape, approximately Golden in aspect ratio, that gives rise to the highest critical volume fraction resisting gelation, corresponding to the fastest long-time self-diffusion in the cell. The ellipsoidal shape also directs random collisions between proteins away from sites that would promote aggregation and loss of function to more rapidly evolving nonsticky regions on the surface, and further provides a greater tolerance to mutation.
|
29
|
Allosteric Modulation of Binding Specificity by Alternative Packing of Protein Cores. J Mol Biol 2018; 431:336-350. [PMID: 30471255 DOI: 10.1016/j.jmb.2018.11.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/22/2018] [Revised: 11/04/2018] [Accepted: 11/14/2018] [Indexed: 11/21/2022]
Abstract
Hydrophobic cores are often viewed as tightly packed and rigid, but they do show some plasticity and could thus be attractive targets for protein design. Here we explored the role of different functional pressures on the core packing and ligand recognition of the SH3 domain from human Fyn tyrosine kinase. We randomized the hydrophobic core and used phage display to select variants that bound to each of three distinct ligands. The three evolved groups showed remarkable differences in core composition, illustrating the effect of different selective pressures on the core. Changes in the core did not significantly alter protein stability, but were linked closely to changes in binding affinity and specificity. Structural analysis and molecular dynamics simulations revealed the structural basis for altered specificity. The evolved domains had significantly reduced core volumes, which in turn induced increased backbone flexibility. These motions were propagated from the core to the binding surface and induced significant conformational changes. These results show that alternative core packing and consequent allosteric modulation of binding interfaces could be used to engineer proteins with novel functions.
|
30
|
Skibinski DOF, Ghiselli F, Diz AP, Milani L, Mullins JGL. Structure-Related Differences between Cytochrome Oxidase I Proteins in a Stable Heteroplasmic Mitochondrial System. Genome Biol Evol 2018; 9:3265-3281. [PMID: 29149282 PMCID: PMC5726481 DOI: 10.1093/gbe/evx235] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Accepted: 11/13/2017] [Indexed: 12/27/2022] Open
Abstract
Many bivalve species have two types of mitochondrial DNA passed independently through the female line (F genome) and male line (M genome). Here we study the cytochrome oxidase I protein in such bivalve species and provide evidence for differences between the F and M proteins in amino acid property values, particularly relating to hydrophobicity and helicity. The magnitude of these differences varies between different regions of the protein and the change from the ancestor is most marked in the M protein. The observed changes occur in parallel and in the same direction in the different species studied. Two possible causes are considered, first relaxation of purifying selection with drift and second positive selection. These may operate in different ways in different regions of the protein. Many different amino acid substitutions contribute in a small way to the observed variation, but substitutions involving alanine and serine have a quantitatively large effect. Some of these substitutions are potential targets for phosphorylation and some are close to residues of functional importance in the catalytic mechanism. We propose that the observed changes in the F and M proteins might contribute to functional differences between them relating to ATP production and mitochondrial membrane potential with implications for sperm function.
Affiliation(s)
- David O F Skibinski
  - Institute of Life Science, Swansea University Medical School, United Kingdom
- Fabrizio Ghiselli
  - Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
- Angel P Diz
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Spain
- Liliana Milani
  - Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
|
31
|
Beyond Thermodynamic Constraints: Evolutionary Sampling Generates Realistic Protein Sequence Variation. Genetics 2018; 208:1387-1395. [PMID: 29382650 DOI: 10.1534/genetics.118.300699] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/04/2018] [Accepted: 01/25/2018] [Indexed: 01/01/2023]
Abstract
Biological evolution generates a surprising amount of site-specific variability in protein sequences. Yet, attempts at modeling this process have been only moderately successful, and current models based on protein structural metrics explain, at best, 60% of the observed variation. Surprisingly, simple measures of protein structure, such as solvent accessibility, are often better predictors of site-specific variability than more complex models employing all-atom energy functions and detailed structural modeling. We suggest here that these more complex models perform poorly because they lack consideration of the evolutionary process, which is, in part, captured by the simpler metrics. We compare protein sequences that are computationally designed to sequences that are computationally evolved using the same protein-design energy function and to homologous natural sequences. We find that, by a wide variety of metrics, evolved sequences are much more similar to natural sequences than are designed sequences. In particular, designed sequences are too conserved on the protein surface relative to natural sequences, whereas evolved sequences are not. Our results suggest that evolutionary simulation produces a realistic sampling of sequence space. By contrast, protein design, at least as currently implemented, does not. Existing energy functions seem to be sufficiently accurate to correctly describe the key thermodynamic constraints acting on protein sequences, but they need to be paired with realistic sampling schemes to generate realistic sequence alignments.
|
32
|
Fontanillas E, Galzitskaya OV, Lecompte O, Lobanov MY, Tanguy A, Mary J, Girguis PR, Hourdez S, Jollivet D. Proteome Evolution of Deep-Sea Hydrothermal Vent Alvinellid Polychaetes Supports the Ancestry of Thermophily and Subsequent Adaptation to Cold in Some Lineages. Genome Biol Evol 2017; 9:279-296. [PMID: 28082607 PMCID: PMC5381640 DOI: 10.1093/gbe/evw298] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 12/19/2016] [Indexed: 12/22/2022]
Abstract
Temperature, perhaps more than any other environmental factor, is likely to influence the evolution of all organisms. It is also a very interesting factor to understand how genomes are shaped by selection over evolutionary timescales, as it potentially affects the whole genome. Among thermophilic prokaryotes, temperature affects both codon usage and protein composition to increase the stability of the transcriptional/translational machinery, and the resulting proteins need to be functional at high temperatures. Among eukaryotes less is known about genome evolution, and the tube-dwelling worms of the family Alvinellidae represent an excellent opportunity to test hypotheses about the emergence of thermophily in ectothermic metazoans. The Alvinellidae are a group of worms that experience varying thermal regimes, presumably having evolved into these niches over evolutionary times. Here we analyzed 423 putative orthologous loci derived from 6 alvinellid species including the thermophilic Alvinella pompejana and Paralvinella sulfincola. This comparative approach allowed us to assess amino acid composition, codon usage, divergence, direction of residue changes and the strength of selection along the alvinellid phylogeny, and to design a new eukaryotic thermophilic criterion based on significant differences in the residue composition of proteins. Contrary to expectations, the alvinellid ancestor of all present-day species seems to have been thermophilic, a trait subsequently maintained by purifying selection in lineages that still inhabit higher temperature environments. In contrast, lineages currently living in colder habitats likely evolved under selective relaxation, with some degree of positive selection for low-temperature adaptation at the protein level.
Affiliation(s)
- Eric Fontanillas
  - Sorbonne Universités, UPMC Univ. Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Equipe ABICE, Station Biologique de Roscoff, 29688 Roscoff, France
- Oxana V Galzitskaya
  - Laboratory of Protein Physics, Institute of Protein Research, RAS, Institutskaya street, 4, 142290 Pushchino, Moscow, Russia
- Odile Lecompte
  - CSTB - ICUBE, UMR7357, Faculté de Médecine, 4 rue Kirschleger, 67085 Strasbourg, France
- Mikhail Y Lobanov
  - Laboratory of Protein Physics, Institute of Protein Research, RAS, Institutskaya street, 4, 142290 Pushchino, Moscow, Russia
- Arnaud Tanguy
  - Sorbonne Universités, UPMC Univ. Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Equipe ABICE, Station Biologique de Roscoff, 29688 Roscoff, France
- Jean Mary
  - Sorbonne Universités, UPMC Univ. Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Equipe ABICE, Station Biologique de Roscoff, 29688 Roscoff, France
- Peter R Girguis
  - Department of Organismic & Evolutionary Biology, Harvard University Biological Laboratories, Cambridge, MA
- Stéphane Hourdez
  - Sorbonne Universités, UPMC Univ. Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Equipe ABICE, Station Biologique de Roscoff, 29688 Roscoff, France
- Didier Jollivet
  - Sorbonne Universités, UPMC Univ. Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Equipe ABICE, Station Biologique de Roscoff, 29688 Roscoff, France
|
33
|
Abrusán G, Marsh JA. Alpha Helices Are More Robust to Mutations than Beta Strands. PLoS Comput Biol 2016; 12:e1005242. [PMID: 27935949 PMCID: PMC5147804 DOI: 10.1371/journal.pcbi.1005242] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 05/26/2016] [Accepted: 11/08/2016] [Indexed: 12/30/2022]
Abstract
The rapidly increasing amount of data on human genetic variation has resulted in a growing demand to identify pathogenic mutations computationally, as their experimental validation is currently beyond reach. Here we show that alpha helices and beta strands differ significantly in their ability to tolerate mutations: helices can accumulate more mutations than strands without change, due to the higher numbers of inter-residue contacts in helices. This results in two patterns: a) the same number of mutations causes less structural change in helices than in strands; b) helices diverge more rapidly in sequence than strands within the same domains. Additionally, both helices and strands are significantly more robust than coils. Based on this observation we show that human missense mutations that change secondary structure are more likely to be pathogenic than those that do not. Moreover, inclusion of predicted secondary structure changes shows significant utility for improving upon state-of-the-art pathogenicity predictions.
The factors that determine the robustness and evolvability of proteins are still largely unknown. In this work the authors show that different secondary structure elements of proteins (helices and strands) differ in their ability to tolerate mutations, and demonstrate that it is caused by differences in the number of non-covalent residue interactions within these secondary structure units. The results suggest that engineering de novo all-alpha proteins should be easier than all-beta ones, as more sequences can fold to the same topology. Additionally, secondary structure can be used to improve current methods of pathogenicity predictions; mutations that change secondary structure are more likely to be pathogenic than mutations that do not, due to their strong destabilizing effect on protein structure.
Affiliation(s)
- György Abrusán
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom
  - Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Temesvári krt. 62, Hungary
- Joseph A. Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom
|
34
|
Steinberg B, Ostermeier M. Shifting Fitness and Epistatic Landscapes Reflect Trade-offs along an Evolutionary Pathway. J Mol Biol 2016; 428:2730-43. [DOI: 10.1016/j.jmb.2016.04.033] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 02/01/2016] [Revised: 04/18/2016] [Accepted: 04/29/2016] [Indexed: 01/04/2023]
|
35
|
Newton MS, Arcus VL, Patrick WM. Rapid bursts and slow declines: on the possible evolutionary trajectories of enzymes. J R Soc Interface 2016; 12:rsif.2015.0036. [PMID: 25926697 DOI: 10.1098/rsif.2015.0036] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]
Abstract
The evolution of enzymes is often viewed as following a smooth and steady trajectory, from barely functional primordial catalysts to the highly active and specific enzymes that we observe today. In this review, we summarize experimental data that suggest a different reality. Modern examples, such as the emergence of enzymes that hydrolyse human-made pesticides, demonstrate that evolution can be extraordinarily rapid. Experiments to infer and resurrect ancient sequences suggest that some of the first organisms present on the Earth are likely to have possessed highly active enzymes. Reconciling these observations, we argue that rapid bursts of strong selection for increased catalytic efficiency are interspersed with much longer periods in which the catalytic power of an enzyme erodes, through neutral drift and selection for other properties such as cellular energy efficiency or regulation. Thus, many enzymes may have already passed their catalytic peaks.
Affiliation(s)
- Matilda S Newton
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Vickery L Arcus
  - School of Biology, University of Waikato, Hamilton, New Zealand
- Wayne M Patrick
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
|
36
|
Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet 2016; 17:109-21. [PMID: 26781812 DOI: 10.1038/nrg.2015.18] [Citation(s) in RCA: 180] [Impact Index Per Article: 20.0] [Indexed: 12/13/2022]
Abstract
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
Affiliation(s)
- Julian Echave
  - Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, 1650 San Martín, Buenos Aires, Argentina
- Stephanie J Spielman
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
- Claus O Wilke
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
|
37
|
Isaac AE, Sinha S. Analysis of core-periphery organization in protein contact networks reveals groups of structurally and functionally critical residues. J Biosci 2015; 40:683-99. [PMID: 26564971 DOI: 10.1007/s12038-015-9554-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
Abstract
The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks corresponding to the native states of 66 proteins (belonging to different families) in terms of their core-periphery organization. The resulting hierarchical classification of the amino acid constituents of a protein arranges the residues into successive layers - having higher core order - with increasing connection density, ranging from a sparsely linked periphery to a densely intra-connected core (distinct from the earlier concept of protein core defined in terms of the three-dimensional geometry of the native state, which has least solvent accessibility). Our results show that residues in the inner cores are more conserved than those at the periphery. Underlining the functional importance of the network core, we see that the receptor sites for known ligand molecules of most proteins occur in the innermost core. Furthermore, the association of residues with structural pockets and cavities in binding or active sites increases with the core order. From mutation sensitivity analysis, we show that the probability of deleterious or intolerant mutations also increases with the core order. We also show that stabilization centre residues are in the innermost cores, suggesting that the network core is critically important in maintaining the structural stability of the protein. A publicly available Web resource for performing core-periphery analysis of any protein whose native state is known has been made available by us at http://www.imsc.res.in/~sitabhra/proteinKcore/index.html.
Affiliation(s)
- Arnold Emerson Isaac
  - Bioinformatics Division, School of Bio Sciences and Technology, VIT University, Vellore, India
|
38
|
Rockah-Shmuel L, Tóth-Petróczy Á, Tawfik DS. Systematic Mapping of Protein Mutational Space by Prolonged Drift Reveals the Deleterious Effects of Seemingly Neutral Mutations. PLoS Comput Biol 2015; 11:e1004421. [PMID: 26274323 PMCID: PMC4537296 DOI: 10.1371/journal.pcbi.1004421] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 02/03/2015] [Accepted: 06/30/2015] [Indexed: 11/18/2022]
Abstract
Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII’s natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
Understanding and predicting the effects of single nucleotide polymorphisms (SNPs) is of fundamental importance in many fields. Systematic experimental mappings of the effects of such mutations within a given gene/protein comprise an essential experimental tool for determining protein function and for refining models of protein evolution, as well as an important resource for improving prediction algorithms. Here, we present the results of a laboratory system that mimics the manner by which protein sequences diverge in nature: a prolonged process of gradually accumulating random mutations that retain the protein’s structure and function. The change in frequencies of mutations over generations, as obtained by deep sequencing, enabled us to assess the relative effects of all possible SNPs at the background of an accumulating number of mutations. Compared to previous reports, we found that > 80% of all possible amino acid exchanges have potential deleterious effects, with 67% being clearly deleterious. Tolerance vs. purging of mutations in our prolonged drift also showed better correlation with natural diversity. Overall, our experimental setup provides a better understanding of how protein sequences diverge in nature, plus a new basis for improving the prediction accuracy of the effects of protein mutations, and specifically of SNPs.
Affiliation(s)
- Liat Rockah-Shmuel
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Ágnes Tóth-Petróczy
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Dan S. Tawfik
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
|
39
|
Bar-Even A, Milo R, Noor E, Tawfik DS. The Moderately Efficient Enzyme: Futile Encounters and Enzyme Floppiness. Biochemistry 2015. [DOI: 10.1021/acs.biochem.5b00621] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Indexed: 11/29/2022]
Affiliation(s)
- Arren Bar-Even
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
- Elad Noor
  - Institute of Molecular Systems Biology, ETH Zurich, Auguste-Piccard-Hof 1, CH-8093 Zurich, Switzerland
|
40
|
Dellus-Gur E, Elias M, Caselli E, Prati F, Salverda MLM, de Visser JAGM, Fraser JS, Tawfik DS. Negative Epistasis and Evolvability in TEM-1 β-Lactamase--The Thin Line between an Enzyme's Conformational Freedom and Disorder. J Mol Biol 2015; 427:2396-409. [PMID: 26004540 DOI: 10.1016/j.jmb.2015.05.011] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 01/09/2015] [Revised: 05/08/2015] [Accepted: 05/12/2015] [Indexed: 12/28/2022]
Abstract
Epistasis is a key factor in evolution since it determines which combinations of mutations provide adaptive solutions and which mutational pathways toward these solutions are accessible by natural selection. There is growing evidence for the pervasiveness of sign epistasis--a complete reversion of mutational effects, particularly in protein evolution--yet its molecular basis remains poorly understood. We describe the structural basis of sign epistasis between G238S and R164S, two adaptive mutations in TEM-1 β-lactamase--an enzyme that endows antibiotics resistance. Separated by 10 Å, these mutations initiate two separate trajectories toward increased hydrolysis rates and resistance toward second and third-generation cephalosporins antibiotics. Both mutations allow the enzyme's active site to adopt alternative conformations and accommodate the new antibiotics. By solving the corresponding set of crystal structures, we found that R164S causes local disorder whereas G238S induces discrete conformations. When combined, the mutations in 238 and 164 induce local disorder whereby nonproductive conformations that perturb the enzyme's catalytic preorganization dominate. Specifically, Asn170 that coordinates the deacylating water molecule is misaligned, in both the free form and the inhibitor-bound double mutant. This local disorder is not restored by stabilizing global suppressor mutations and thus leads to an evolutionary cul-de-sac. Conformational dynamism therefore underlines the reshaping potential of protein's structures and functions but also limits protein evolvability because of the fragility of the interactions networks that maintain protein structures.
Affiliation(s)
- Eynat Dellus-Gur
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
- Mikael Elias
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
- Emilia Caselli
  - Department of Chemistry, University of Modena, Modena 41100, Italy
- Fabio Prati
  - Department of Chemistry, University of Modena, Modena 41100, Italy
- Merijn L M Salverda
  - Institute for Translational Vaccinology (Intravacc), Bilthoven 3720 AL, The Netherlands
- J Arjan G M de Visser
  - Laboratory of Genetics, Department of Plant Sciences, Wageningen University, Wageningen 6700 AH, The Netherlands
- James S Fraser
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94143, USA
- Dan S Tawfik
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
|
41
|
Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software. Sci Rep 2015; 5:8193. [PMID: 25645341 PMCID: PMC4648443 DOI: 10.1038/srep08193] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 10/03/2014] [Accepted: 01/12/2015] [Indexed: 01/05/2023]
Abstract
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
|
42
|
Durão P, Aigner H, Nagy P, Mueller-Cajar O, Hartl FU, Hayer-Hartl M. Opposing effects of folding and assembly chaperones on evolvability of Rubisco. Nat Chem Biol 2015; 11:148-55. [PMID: 25558973 DOI: 10.1038/nchembio.1715] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Received: 06/03/2014] [Accepted: 10/27/2014] [Indexed: 12/29/2022]
Abstract
Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) catalyzes the fixation of CO2 in photosynthesis. Despite its pivotal role, Rubisco is an inefficient enzyme and thus is a key target for directed evolution. Rubisco biogenesis depends on auxiliary factors, including the GroEL/ES-type chaperonin for folding and the chaperone RbcX for assembly. Here we performed directed evolution of cyanobacterial form I Rubisco using a Rubisco-dependent Escherichia coli strain. Overexpression of GroEL/ES enhanced Rubisco solubility and tended to expand the range of permissible mutations. In contrast, the specific assembly chaperone RbcX had a negative effect on evolvability by preventing a subset of mutants from forming holoenzyme. Mutation F140I in the large Rubisco subunit, isolated in the absence of RbcX, increased carboxylation efficiency approximately threefold without reducing CO2 specificity. The F140I mutant resulted in a ∼55% improved photosynthesis rate in Synechocystis PCC6803. The requirement of specific biogenesis factors downstream of chaperonin may have retarded the natural evolution of Rubisco.
Affiliation(s)
- Paulo Durão
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
- Harald Aigner
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
- Péter Nagy
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
- Oliver Mueller-Cajar
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
- F Ulrich Hartl
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
- Manajit Hayer-Hartl
  - Department of Cellular Biochemistry, Max Planck Institute of Biochemistry, Martinsried, Germany
|
43
|
Gitlin L, Hagai T, LaBarbera A, Solovey M, Andino R. Rapid evolution of virus sequences in intrinsically disordered protein regions. PLoS Pathog 2014; 10:e1004529. [PMID: 25502394 PMCID: PMC4263755 DOI: 10.1371/journal.ppat.1004529] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 05/16/2014] [Accepted: 10/20/2014] [Indexed: 11/18/2022]
Abstract
Nodamura Virus (NoV) is a nodavirus originally isolated from insects that can replicate in a wide variety of hosts, including mammals. Because of their simplicity and ability to replicate in many diverse hosts, NoV, and the Nodaviridae in general, provide a unique window into the evolution of viruses and host-virus interactions. Here we show that the C-terminus of the viral polymerase exhibits extreme structural and evolutionary flexibility. Indeed, fewer than 10 positively charged residues from the 110 amino acid-long C-terminal region of protein A are required to support RNA1 replication. Strikingly, this region can be replaced by completely unrelated protein sequences, yet still produce a functional replicase. Structure predictions, as well as evolutionary and mutational analyses, indicate that the C-terminal region is structurally disordered and evolves faster than the rest of the viral proteome. Thus, the function of an intrinsically unstructured protein region can be independent of most of its primary sequence, conferring both functional robustness and sequence plasticity on the protein. Our results provide an experimental explanation for rapid evolution of unstructured regions, which enables an effective exploration of the sequence space, and likely function space, available to the virus.
Proteins often contain regions with defined structures that enable their function. While important for maintaining the overall architecture of the protein, structural conservation adds constraints on the ability of the protein to mutate, and thus evolve. Viruses of eukaryotes, however, often encode for proteins with unstructured regions. As these regions are less constrained, they are more likely to accumulate mutations, which in turn can facilitate the appearance of novel functions during the evolution of the virus. Even though it has been known that such “disordered protein regions” have been particularly malleable in evolution, their functions and their ability to withstand extensive mutations have not been explored in detail. Here, we discovered that a disordered part of the Nodamura Virus polymerase is both required for replication of the viral genome, and extremely variable among different nodaviruses. We examined the tolerance of this protein region to mutations and found an unexpected ability to accommodate very diverse protein sequences. We propose that disordered protein regions can be a reservoir for evolutionary innovation that can play important roles in virus adaptation to new environments.
Affiliation(s)
- Leonid Gitlin
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, United States of America
- Tzachi Hagai
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, United States of America
- Anthony LaBarbera
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, United States of America
- Mark Solovey
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, United States of America
- Raul Andino
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, United States of America
|
44
|
Bank C, Hietpas RT, Jensen JD, Bolon DNA. A systematic survey of an intragenic epistatic landscape. Mol Biol Evol 2014; 32:229-38. [PMID: 25371431 DOI: 10.1093/molbev/msu301] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Indexed: 12/24/2022]
Abstract
Mutations are the source of evolutionary variation. The interactions of multiple mutations can have important effects on fitness and evolutionary trajectories. We have recently described the distribution of fitness effects of all single mutations for a nine-amino-acid region of yeast Hsp90 (Hsp82) implicated in substrate binding. Here, we report and discuss the distribution of intragenic epistatic effects within this region in seven Hsp90 point mutant backgrounds of neutral to slightly deleterious effect, resulting in an analysis of more than 1,000 double mutants. We find negative epistasis between substitutions to be common, and positive epistasis to be rare--resulting in a pattern that indicates a drastic change in the distribution of fitness effects one step away from the wild type. This can be well explained by a concave relationship between phenotype and genotype (i.e., a concave shape of the local fitness landscape), suggesting mutational robustness intrinsic to the local sequence space. Structural analyses indicate that, in this region, epistatic effects are most pronounced when a solvent-inaccessible position is involved in the interaction. In contrast, all 18 observations of positive epistasis involved at least one mutation at a solvent-exposed position. By combining the analysis of evolutionary and biophysical properties of an epistatic landscape, these results contribute to a more detailed understanding of the complexity of protein evolution.
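The link the abstract draws between a concave fitness landscape and widespread negative epistasis can be illustrated with a small numerical sketch. The code below is not the authors' analysis: the additive definition of epistasis, the saturating map `saturating_fitness`, its parameter `k`, and the mutation effect sizes are all assumptions chosen for illustration. It shows that when two mutations act additively on an underlying phenotype and fitness is a concave function of that phenotype, the double mutant falls below the additive expectation, whereas a linear map gives no epistasis.

```python
def additive_epistasis(w_wt, w_a, w_b, w_ab):
    """Deviation of the double-mutant effect from the sum of the two
    single-mutant effects (one common definition; an assumption here)."""
    return (w_ab - w_wt) - ((w_a - w_wt) + (w_b - w_wt))


def saturating_fitness(p, k=0.2):
    """Hypothetical concave phenotype-to-fitness map, scaled so that
    a phenotype of 1.0 (wild type) has fitness 1.0."""
    return p * (1.0 + k) / (p + k)


def linear_fitness(p):
    """Linear phenotype-to-fitness map, used as a no-epistasis control."""
    return p


def epistasis_under_map(fitness_map, d_a, d_b, p_wt=1.0):
    """Apply two mutations that reduce the phenotype additively by d_a
    and d_b, then measure epistasis on the fitness scale."""
    w_wt = fitness_map(p_wt)
    w_a = fitness_map(p_wt - d_a)
    w_b = fitness_map(p_wt - d_b)
    w_ab = fitness_map(p_wt - d_a - d_b)
    return additive_epistasis(w_wt, w_a, w_b, w_ab)


if __name__ == "__main__":
    # Illustrative effect sizes only; not taken from the Hsp90 data.
    print("concave map:", epistasis_under_map(saturating_fitness, 0.2, 0.3))
    print("linear map: ", epistasis_under_map(linear_fitness, 0.2, 0.3))
```

Under these assumptions the concave map returns a negative value and the linear map returns a value of zero up to floating-point error. A log-fitness (multiplicative) definition of epistasis could be substituted in `additive_epistasis` without changing the structure of the sketch.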
Affiliation(s)
- Claudia Bank
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Ryan T Hietpas
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA
- Jeffrey D Jensen
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Daniel N A Bolon
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA
|
45
|
Tóth-Petróczy A, Tawfik DS. The robustness and innovability of protein folds. Curr Opin Struct Biol 2014; 26:131-8. [PMID: 25038399 DOI: 10.1016/j.sbi.2014.06.007] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 11/26/2013] [Revised: 06/26/2014] [Accepted: 06/26/2014] [Indexed: 11/30/2022]
Abstract
Assignment of protein folds to functions indicates that >60% of folds carry out one or two enzymatic functions, while few folds, for example, the TIM-barrel and Rossmann folds, exhibit hundreds. Are there structural features that make a fold amenable to functional innovation (innovability)? Do these features relate to robustness--the ability to readily accumulate sequence changes? We discuss several hypotheses regarding the relationship between the architecture of a protein and its evolutionary potential. We describe how, in a seemingly paradoxical manner, opposite properties, such as high stability and rigidity versus conformational plasticity and structural order versus disorder, promote robustness and/or innovability. We hypothesize that polarity--differentiation and low connectivity between a protein's scaffold and its active-site--is a key prerequisite for innovability.
Affiliation(s)
- Agnes Tóth-Petróczy
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
- Dan S Tawfik
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
|
46
|
Ivankov DN, Finkelstein AV, Kondrashov FA. A structural perspective of compensatory evolution. Curr Opin Struct Biol 2014; 26:104-12. [PMID: 24981969 PMCID: PMC4141909 DOI: 10.1016/j.sbi.2014.05.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 02/03/2014] [Revised: 04/11/2014] [Accepted: 05/16/2014] [Indexed: 11/25/2022]
Abstract
The study of molecular evolution is important because it reveals how protein functions emerge and evolve. Recently, several types of studies indicated that substitutions in molecular evolution occur in a compensatory manner, whereby the occurrence of a substitution depends on the amino acid residues at other sites. However, a molecular or structural basis behind the compensation often remains obscure. Here, we review studies on the interface of structural biology and molecular evolution that revealed novel aspects of compensatory evolution. In many cases structural studies benefit from evolutionary data while structural data often add a functional dimension to the study of molecular evolution.
Affiliation(s)
- Dmitry N Ivankov
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Alexei V Finkelstein
- Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia
- Fyodor A Kondrashov
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), 88 Dr. Aiguader, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Pg. Lluís Companys, 08010 Barcelona, Spain
|
47
|
Mallik S, Kundu S. Molecular interactions within the halophilic, thermophilic, and mesophilic prokaryotic ribosomal complexes: clues to environmental adaptation. J Biomol Struct Dyn 2014; 33:639-56. [PMID: 24697502 DOI: 10.1080/07391102.2014.900457] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Abstract
Using the available crystal structures of 50S ribosomal subunits from three prokaryotic species, Escherichia coli (mesophilic), Thermus thermophilus (thermophilic), and Haloarcula marismortui (halophilic), we have analyzed different structural features of ribosomal RNAs (rRNAs), proteins, and their interfaces. We have correlated these structural features with the environmental adaptation strategies of the corresponding species. While dense intra-rRNA packing is observed in the thermophile, loose intra-rRNA packing is observed in the halophile (both compared to the mesophile). Interestingly, the protein-rRNA interfaces of both extremophiles are densely packed compared to those of the mesophile. The intersubunit bridge regions are almost devoid of cavities, probably ensuring the proper formation of each bridge (by not allowing any loosely packed region nearby). During rRNA binding, the ribosomal proteins undergo some structural transitions. Here, we have analyzed the intrinsically disordered and ordered regions of the ribosomal proteins that are subject to such transitions. The intrinsically disordered and disorder-to-order transition sites of the thermophilic and mesophilic ribosomal proteins are simultaneously (i) highly conserved and (ii) slowly evolving compared to the rest of the protein structure. Although high conservation is observed at such sites in halophilic ribosomal proteins, the slow rate of evolution is absent. Such differences between the thermophile, mesophile, and halophile can be explained by their environmental adaptation strategies. Interestingly, a universal biophysical principle, evidenced by a linear relationship between the free energy of interface formation, interface area, and structural changes of r-proteins during assembly, is always maintained, irrespective of the environmental conditions.
Affiliation(s)
- Saurav Mallik
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, APC Road, Kolkata 700009, India
|
48
|
Li J, Jia J, Li H, Yu J, Sun H, He Y, Lv D, Yang X, Glocker MO, Ma L, Yang J, Li L, Li W, Zhang G, Liu Q, Li Y, Xie L. SysPTM 2.0: an updated systematic resource for post-translational modification. Database: The Journal of Biological Databases and Curation 2014; 2014:bau025. [PMID: 24705204 PMCID: PMC3975108 DOI: 10.1093/database/bau025] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 12/21/2022]
Abstract
Post-translational modifications (PTMs) of proteins play essential roles in almost all cellular processes, and are closely related to physiological activity and disease development of living organisms. The development of tandem mass spectrometry (MS/MS) has resulted in a rapid increase of PTMs identified on proteins from different species. The collection and systematic ordering of PTM data should provide invaluable information for understanding cellular processes and signaling pathways regulated by PTMs. For this purpose, in 2009 we developed SysPTM, a systematic resource with comprehensive PTM data and a suite of web tools for annotation of PTMs. Four years later, there has been a significant advance in the generation of PTM data and, consequently, more sophisticated analysis requirements have to be met. Here we present SysPTM 2.0 (http://lifecenter.sgst.cn/SysPTM/), an updated version with almost doubled data content and enhanced web-based analysis tools (PTMBlast, PTMPathway, PTMPhylog, and PTMCluster). Moreover, a new section, SysPTM-H, is constructed to graphically represent the combinatorial histone PTMs and dynamic regulation of histone modifying enzymes, and a new tool, PTMGO, is added for functional annotation and enrichment analysis. SysPTM 2.0 not only facilitates resourceful annotation of PTM sites but also allows systematic investigation of PTM functions by the user. Database URL: http://lifecenter.sgst.cn/SysPTM/.
Affiliation(s)
- Jing Li
- Key Laboratory of Biomedical Photonics of Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, P. R. China, Shanghai Center for Bioinformation Technology, Shanghai Institutes of Biomedicine, Shanghai Academy of Science and Technology, Shanghai 201203, P. R. China, Britton Chance Center for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, P. R. China, Department of Bioinformatics and Biostatistics, Shanghai Jiaotong University, Shanghai 200240, P. R. China, Key Laboratory of Systems Biology, Chinese Academy of Sciences, Shanghai 200031, P. R. China and Proteome Center Rostock, Department for Proteome Research, Institute of Immunology, University of Rostock, Rostock 18055, Germany
|
49
|
Zhang SD, Ling LZ. Genome-wide identification and evolutionary analysis of the SBP-box gene family in castor bean. PLoS One 2014; 9:e86688. [PMID: 24466202 PMCID: PMC3899293 DOI: 10.1371/journal.pone.0086688] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 10/31/2013] [Accepted: 12/16/2013] [Indexed: 11/25/2022]
Abstract
Genes in the SQUAMOSA promoter-binding-protein (SBP-box) gene family encode transcriptional regulators and perform a variety of regulatory functions that are involved in the developmental and physiological processes of plants. In this study, a comprehensive computational analysis identified 15 candidate members of the SBP-box gene family in the castor bean (Ricinus communis). Phylogenetic and domain analyses indicated that these genes fall into two groups (groups I and II). Group II formed a large branch and was further classified into three subgroups (subgroups II-1 to II-3) based on phylogeny, gene structures and conserved motifs. The genes of subgroup II-1 had evolutionary features distinct from those of the other two subgroups but more similar to those of group I. Therefore, we inferred that group I and subgroup II-1 might retain ancient signals, whereas subgroups II-2 and II-3 diverged during evolution. Estimates of evolutionary parameters (dN and dN/dS) further supported our hypothesis. First, group I was more constrained by strong purifying selection and evolved more slowly, with a lower substitution rate, than group II. Among the three subgroups, subgroup II-1 had the lowest rate of substitution and was under strong purifying selection. By contrast, subgroups II-2 and II-3 evolved more rapidly and experienced weaker purifying selection. These results indicate that differences in evolutionary rate and selection strength caused the different evolutionary patterns among SBP-box genes in castor bean. Taken together, these results provide better insight into the evolutionary divergence of the SBP-box gene family in castor bean and provide a guide for future analyses of functional diversity in this gene family.
Affiliation(s)
- Shu-Dong Zhang
- Key Laboratory of Biodiversity and Biogeography, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
- Li-Zhen Ling
- Key Laboratory of Biodiversity and Biogeography, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
- Plant Germplasm and Genomics Center, Germplasm Bank of Wild Species, Kunming Institute of Botany, the Chinese Academy of Sciences, Kunming, China
|
50
|
Bridgham JT, Keay J, Ortlund EA, Thornton JW. Vestigialization of an allosteric switch: genetic and structural mechanisms for the evolution of constitutive activity in a steroid hormone receptor. PLoS Genet 2014; 10:e1004058. [PMID: 24415950 PMCID: PMC3886901 DOI: 10.1371/journal.pgen.1004058] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 09/05/2013] [Accepted: 11/08/2013] [Indexed: 11/30/2022]
Abstract
An important goal in molecular evolution is to understand the genetic and physical mechanisms by which protein functions evolve and, in turn, to characterize how a protein's physical architecture influences its evolution. Here we dissect the mechanisms for an evolutionary shift in function in the mollusk ortholog of the steroid hormone receptors (SRs), a family of biologically essential transcription factors. In vertebrates, the activity of SRs allosterically depends on binding a hormonal ligand; in mollusks, however, the SR ortholog (called ER, because of high sequence similarity to vertebrate estrogen receptors) activates transcription in the absence of ligand and does not respond to steroid hormones. To understand how this shift in regulation evolved, we combined evolutionary, structural, and functional analyses. We first determined the X-ray crystal structure of the ER of the Pacific oyster Crassostrea gigas (CgER), and found that its ligand pocket is filled with bulky residues that prevent ligand occupancy. To understand the genetic basis for the evolution of mollusk ERs' unique functions, we resurrected an ancient SR progenitor and characterized the effect of historical amino acid replacements on its functions. We found that reintroducing just two ancient replacements from the lineage leading to mollusk ERs recapitulates the evolution of full constitutive activity and the loss of ligand activation. These substitutions stabilize interactions among key helices, causing the allosteric switch to become “stuck” in the active conformation and making activation independent of ligand binding. Subsequent changes filled the ligand pocket without further affecting activity; by degrading the allosteric switch, these substitutions vestigialized elements of the protein's architecture required for ligand regulation and made reversal to the ancestral function more complex. These findings show how the physical architecture of allostery enabled a few large-effect mutations to trigger a profound evolutionary change in the protein's function and shaped the genetics of evolutionary reversibility.
An important goal in evolutionary genetics is to understand how genetic mutations cause the evolution of new protein functions and how a protein's structure shapes its evolution. Here we address these questions by studying a dramatic lineage-specific shift in function in steroid hormone receptors (SRs), a physiologically important family of transcription factors. In vertebrates, SRs bind hormones and then undergo a structural change that allows them to activate gene expression. In mollusks, SRs do not bind hormone and are always active. We identified the genetic and structural mechanisms for the evolution of constitutive activity in the mollusk SRs by using X-ray crystallography, ancestral sequence reconstruction, and experimental studies of the effects of ancient mutations on protein structure and function. We found that constitutive activity evolved due to just two historical substitutions that subtly stabilized elements of the active conformation, and subsequent mutations filled the hormone-binding cavity. The structural characteristics required for a hormone-sensitive activator were thus vestigialized, much the same way that a whale's hindlimbs became vestiges of their ancestral form after they became dispensable. Our findings show how the architecture of a protein can shape its evolution, allowing radically different functions to evolve by a few large-effect mutations.
Affiliation(s)
- Jamie T. Bridgham
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
- June Keay
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
- Eric A. Ortlund
- Biochemistry Department, Emory University School of Medicine, Atlanta, Georgia, United States of America
- Joseph W. Thornton
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon, United States of America
- Departments of Human Genetics and Ecology & Evolution, The University of Chicago, Chicago, Illinois, United States of America
|