1
Romero-Pérez PS, Moran HM, Horani A, Truong A, Manriquez-Sandoval E, Ramirez JF, Martinez A, Gollub E, Hunter K, Lotthammer JM, Emenecker RJ, Liu H, Iwasa JH, Boothby TC, Holehouse AS, Fried SD, Sukenik S. Protein surface chemistry encodes an adaptive tolerance to desiccation. bioRxiv 2024:2024.07.28.604841. [PMID: 39131385 PMCID: PMC11312438 DOI: 10.1101/2024.07.28.604841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Cellular desiccation - the loss of nearly all water from the cell - is a recurring stress in an increasing number of ecosystems that can drive protein unfolding and aggregation. For cells to survive, at least some of the proteome must resume function upon rehydration. Which proteins tolerate desiccation, and the molecular determinants that underlie this tolerance, are largely unknown. Here, we apply quantitative and structural proteomic mass spectrometry to show that certain proteins possess an innate capacity to tolerate rehydration following extreme water loss. Structural analysis points to protein surface chemistry as a key determinant for desiccation tolerance, which we test by showing that rational surface mutants can convert a desiccation sensitive protein into a tolerant one. Desiccation tolerance also has strong overlap with cellular function, with highly tolerant proteins responsible for production of small molecule building blocks, and intolerant proteins involved in energy-consuming processes such as ribosome biogenesis. As a result, the rehydrated proteome is preferentially enriched with metabolite and small molecule producers and depleted of some of the cell's heaviest consumers. We propose this functional bias enables cells to kickstart their metabolism and promote cell survival following desiccation and rehydration.
Teaser: Proteins can resist extreme dryness by tuning the amino acids on their surfaces.
2
Bradley D, Hogrebe A, Dandage R, Dubé AK, Leutert M, Dionne U, Chang A, Villén J, Landry CR. The fitness cost of spurious phosphorylation. EMBO J 2024; 43:4720-4751. [PMID: 39256561 PMCID: PMC11480408 DOI: 10.1038/s44318-024-00200-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 07/23/2024] [Accepted: 07/24/2024] [Indexed: 09/12/2024] Open
Abstract
The fidelity of signal transduction requires the binding of regulatory molecules to their cognate targets. However, the crowded cell interior risks off-target interactions between proteins that are functionally unrelated. How such off-target interactions impact fitness is not generally known. Here, we use Saccharomyces cerevisiae to inducibly express tyrosine kinases. Because yeast lacks bona fide tyrosine kinases, the resulting tyrosine phosphorylation is biologically spurious. We engineered 44 yeast strains each expressing a tyrosine kinase, and quantitatively analysed their phosphoproteomes. This analysis resulted in ~30,000 phosphosites mapping to ~3500 proteins. The number of spurious pY sites generated correlates strongly with decreased growth, and we predict over 1000 pY events to be deleterious. However, we also find that many of the spurious pY sites have a negligible effect on fitness, possibly because of their low stoichiometry. This result is consistent with our evolutionary analyses demonstrating a lack of phosphotyrosine counter-selection in species with tyrosine kinases. Our results suggest that, alongside the risk for toxicity, the cell can tolerate a large degree of non-functional crosstalk as interaction networks evolve.
Affiliation(s)
- David Bradley
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexander Hogrebe
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Rohan Dandage
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexandre K Dubé
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Mario Leutert
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
- Ugo Dionne
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexis Chang
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Judit Villén
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Christian R Landry
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
3
Liu Y, Zhu Q, Wang Z, Zheng H, Zheng X, Ling P, Tang M. Integrative Analysis of Transcriptome and Metabolome Reveals the Pivotal Role of the NAM Family Genes in Oncidium hybridum Lodd. Pseudobulb Growth. Int J Mol Sci 2024; 25:10355. [PMID: 39408686 PMCID: PMC11476975 DOI: 10.3390/ijms251910355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 09/14/2024] [Accepted: 09/24/2024] [Indexed: 10/20/2024] Open
Abstract
Oncidium hybridum Lodd. is an important ornamental flower that is used as both a cut flower and a potted plant around the world; additionally, its pseudobulbs serve as essential carriers for floral organs and flower development. The NAM gene family is crucial for managing responses to various stresses as well as regulating growth in plants. However, the mechanisms by which NAM genes regulate the development of pseudobulbs remain unclear. In this study, a total of 144 NAM genes harboring complete structural domains were identified in O. hybridum. The 144 NAM genes were systematically classified into 14 distinct subfamilies via phylogenetic analysis. Delving deeper into the conserved motifs revealed that motifs 1-6 exhibited remarkable conservation, while motifs 7-10 were present in only a few NAM genes. Notably, NAM genes sharing identical specific motifs were classified into the same subfamily, indicating functional relatedness. Furthermore, the examination of gene duplication events indicated that the NAM genes display 16 pairs of tandem duplications along with five pairs of segmental duplications, suggesting their role in genetic diversity and potential adaptive evolution. By conducting a correlation analysis integrating transcriptomics and metabolomics at four stages of pseudobulb development, we found that OhNAM023, OhNAM030, OhNAM007, OhNAM019, OhNAM083, OhNAM047, OhNAM089, and OhNAM025 exhibited significant relationships with the endogenous plant hormones jasmonates (JAs), hinting at their potential involvement in hormonal signaling. Additionally, OhNAM089, OhNAM025, OhNAM119, OhNAM055, and OhNAM136 showed strong links with abscisic acid (ABA) and abscisic acid glucose ester (ABA-GE), suggesting the possible regulatory function of these NAM genes in plant growth and stress responses. The 144 NAM genes identified in this study provide a basis for subsequent research and contribute to elucidating the intricate molecular mechanisms of NAM genes in Oncidium and potentially in other species.
Affiliation(s)
- Peng Ling
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Collaborative Innovation Center, School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (Y.L.); (Q.Z.); (Z.W.); (H.Z.); (X.Z.)
- Minqiang Tang
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Collaborative Innovation Center, School of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China; (Y.L.); (Q.Z.); (Z.W.); (H.Z.); (X.Z.)
4
McBride JM, Tlusty T. AI-Predicted Protein Deformation Encodes Energy Landscape Perturbation. Phys Rev Lett 2024; 133:098401. [PMID: 39270162 DOI: 10.1103/physrevlett.133.098401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 02/27/2024] [Accepted: 07/24/2024] [Indexed: 09/15/2024]
Abstract
AI algorithms have proven to be excellent predictors of protein structure, but whether and how much these algorithms can capture the underlying physics remains an open question. Here, we aim to test this question using the AlphaFold2 (AF) algorithm: We use AF to predict the subtle structural deformation induced by single mutations, quantified by strain, and compare with experimental datasets of corresponding perturbations in folding free energy ΔΔG. Unexpectedly, we find that physical strain alone, without any additional data or computation, correlates almost as well with ΔΔG as state-of-the-art energy-based and machine-learning predictors. This indicates that the AF-predicted structures alone encode fine details about the energy landscape. In particular, the structures encode significant information on stability, enough to estimate (de-)stabilizing effects of mutations, thus paving the way for the development of novel, structure-based stability predictors for protein design and evolution.
Affiliation(s)
- John M McBride
  - Center for Algorithmic and Robotized Synthesis, Institute for Basic Science, Ulsan 44919, South Korea
5
Pollet L, Xia Y. Structure-guided Evolutionary Analysis of Interactome Network Rewiring at Single Residue Resolution in Yeasts. J Mol Biol 2024; 436:168641. [PMID: 38844045 DOI: 10.1016/j.jmb.2024.168641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 04/30/2024] [Accepted: 06/01/2024] [Indexed: 06/16/2024]
Abstract
Protein-protein interactions (PPIs) are known to rewire extensively during evolution leading to lineage-specific and species-specific changes in molecular processes. However, the detailed molecular evolutionary mechanisms underlying interactome network rewiring are not well-understood. Here, we combine high-confidence PPI data, high-resolution three-dimensional structures of protein complexes, and homology-based structural annotation transfer to construct structurally-resolved interactome networks for the two yeasts S. cerevisiae and S. pombe. We then classify PPIs according to whether they are preserved or different between the two yeast species and compare site-specific evolutionary rates of interfacial versus non-interfacial residues for these different categories of PPIs. We find that residues in PPI interfaces evolve significantly more slowly than non-interfacial residues when using lineage-specific measures of evolutionary rate, but not when using non-lineage-specific measures. Furthermore, both lineage-specific and non-lineage-specific evolutionary rate measures can distinguish interfacial residues from non-interfacial residues for preserved PPIs between the two yeasts, but only the lineage-specific measure is appropriate for rewired PPIs. Finally, both lineage-specific and non-lineage-specific evolutionary rate measures are appropriate for elucidating structural determinants of protein evolution for residues outside of PPI interfaces. Overall, our results demonstrate that unlike tertiary structures of single proteins, PPIs and PPI interfaces can be highly volatile in their evolution, thus requiring the use of lineage-specific measures when studying their evolution. These results yield insight into the evolutionary design principles of PPIs and the mechanisms by which interactions are preserved or rewired between species, improving our understanding of the molecular evolution of PPIs and PPI interfaces at the residue level.
Affiliation(s)
- Léah Pollet
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
- Yu Xia
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
6
Yang Y, Braga MV, Dean MD. Insertion-Deletion Events Are Depleted in Protein Regions with Predicted Secondary Structure. Genome Biol Evol 2024; 16:evae093. [PMID: 38735759 PMCID: PMC11102076 DOI: 10.1093/gbe/evae093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 04/16/2024] [Accepted: 04/21/2024] [Indexed: 05/14/2024] Open
Abstract
A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
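The "54% as much as expected" figure in this abstract is an observed-to-expected ratio. The sketch below shows one simple way such a ratio could be computed for a single protein, assuming indels would otherwise fall uniformly along the sequence. The function name, the interval encoding, and the toy coordinates are illustrative assumptions and are not taken from the paper, whose actual null model may differ.

```python
def overlap_ratio(length, structured, indel_positions):
    """Observed / expected number of indels that fall in structured residues.

    length          -- protein length in residues
    structured      -- list of (start, end) intervals, 1-based and inclusive
    indel_positions -- residue positions at which indels were observed
    The expectation assumes indels land uniformly along the protein
    (an assumption of this sketch).
    """
    structured_sites = set()
    for start, end in structured:
        structured_sites.update(range(start, end + 1))
    observed = sum(1 for p in indel_positions if p in structured_sites)
    expected = len(indel_positions) * len(structured_sites) / length
    return observed / expected


# Toy data: half of a 100-residue protein is predicted to be structured.
ratio = overlap_ratio(
    length=100,
    structured=[(11, 30), (51, 80)],
    indel_positions=[5, 15, 35, 40, 45, 60, 90, 95],
)
print(ratio)
```

A ratio below 1 indicates depletion of indels in structured regions, and a ratio above 1 indicates enrichment.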
Affiliation(s)
- Yi Yang
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew V Braga
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Matthew D Dean
  - Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
7
Ali F. Patterns of Change in Nucleotide Diversity Over Gene Length. Genome Biol Evol 2024; 16:evae078. [PMID: 38608148 PMCID: PMC11040516 DOI: 10.1093/gbe/evae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 03/26/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to estimate effective population size based on the diversity of synonymous sites demand a better understanding of their selective constraints. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site of a gene. The degree of reduction in diversity at the translation initiation site and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of translation-associated traits such as the avoidance of mRNA secondary structure around the translation initiation site, the number of rRNAs, and relative codon usage of ribosomal genes. Evolutionary simulations under purifying selection reproduce the observed patterns and diversity-length correlation and highlight that selective constraints on the 5'-region of a gene may be more extensive than previously believed. These results have implications for the estimation of effective population size, and relative mutation rates, and for genome scans of genes under positive selection based on "silent-site" diversity.
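To make the idea of an asymptotic regression concrete, the sketch below fits a curve that rises toward a plateau with distance from the translation initiation site. The parameterisation `plateau * (1 - effect_size * exp(-x / effect_length))`, the function names, and the synthetic data are assumptions made for illustration only. The abstract does not give the paper's model form or its exact definitions of "Effect Size" and "Effect Length", and the log-linearised fit used here is a simple stand-in for a proper nonlinear least-squares fit.

```python
import math


def diversity(x, plateau, effect_size, effect_length):
    """Hypothetical asymptotic curve: diversity approaches `plateau` as x grows."""
    return plateau * (1.0 - effect_size * math.exp(-x / effect_length))


def fit_asymptotic(xs, ys, tail_start, min_deficit=0.01):
    """Estimate (plateau, effect_size, effect_length) by log-linearisation."""
    # Step 1: estimate the plateau from positions far from the start site.
    tail = [y for x, y in zip(xs, ys) if x >= tail_start]
    plateau = sum(tail) / len(tail)
    # Step 2: log(1 - y / plateau) = log(effect_size) - x / effect_length,
    # so an ordinary least-squares line gives both remaining parameters.
    # Points with a negligible deficit are dropped to keep the log stable.
    pts = [(x, math.log(1.0 - y / plateau))
           for x, y in zip(xs, ys) if 1.0 - y / plateau > min_deficit]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    slope = (sum((x - mean_x) * (v - mean_v) for x, v in pts)
             / sum((x - mean_x) ** 2 for x, _ in pts))
    intercept = mean_v - slope * mean_x
    return plateau, math.exp(intercept), -1.0 / slope


# Synthetic, noise-free data generated from known parameters.
xs = list(range(0, 2001, 50))
ys = [diversity(x, 0.02, 0.4, 150.0) for x in xs]
plateau, effect_size, effect_length = fit_asymptotic(xs, ys, tail_start=1500)
print(plateau, effect_size, effect_length)
```

On noisy real data, a nonlinear least-squares routine would be a more robust choice than this two-step estimate.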
Affiliation(s)
- Farhan Ali
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85281, USA
8
Ferreiro D, Branco C, Arenas M. Selection among site-dependent structurally constrained substitution models of protein evolution by approximate Bayesian computation. Bioinformatics 2024; 40:btae096. [PMID: 38374231 PMCID: PMC10914458 DOI: 10.1093/bioinformatics/btae096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 01/15/2024] [Accepted: 02/16/2024] [Indexed: 02/21/2024] Open
Abstract
MOTIVATION: The selection among substitution models of molecular evolution is fundamental for obtaining accurate phylogenetic inferences. At the protein level, evolutionary analyses are traditionally based on empirical substitution models but these models make unrealistic assumptions and are being surpassed by structurally constrained substitution (SCS) models. The SCS models often consider site-dependent evolution, a process that provides realism but complicates their implementation into likelihood functions that are commonly used for substitution model selection.
RESULTS: We present a method to perform selection among site-dependent SCS models, also among empirical and site-dependent SCS models, based on the approximate Bayesian computation (ABC) approach and its implementation into the computational framework ProteinModelerABC. The framework implements ABC with and without regression adjustments and includes diverse empirical and site-dependent SCS models of protein evolution. Using extensive simulated data, we found that it provides selection among SCS and empirical models with acceptable accuracy. As illustrative examples, we applied the framework to analyze a variety of protein families observing that SCS models fit them better than the corresponding best-fitting empirical substitution models.
AVAILABILITY AND IMPLEMENTATION: ProteinModelerABC is freely available from https://github.com/DavidFerreiro/ProteinModelerABC, can run in parallel and includes a graphical user interface. The framework is distributed with detailed documentation and ready-to-use examples.
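For readers unfamiliar with ABC-based model selection, the following is a minimal sketch of the plain rejection variant, without the regression adjustment the abstract also mentions. It is not ProteinModelerABC's code or interface. The two Gaussian "simulators" are hypothetical stand-ins for simulating a summary statistic under an empirical and an SCS substitution model, and all names and numbers are assumptions chosen for illustration.

```python
import random


def abc_model_selection(observed, simulators, n_sims, epsilon, seed=0):
    """Approximate posterior model probabilities by ABC rejection.

    observed   -- summary statistic computed from the real data
    simulators -- dict mapping model name -> function(rng) returning a
                  simulated summary statistic
    epsilon    -- acceptance tolerance on |simulated - observed|
    """
    rng = random.Random(seed)
    names = list(simulators)
    accepted = {name: 0 for name in names}
    for _ in range(n_sims):
        name = rng.choice(names)            # uniform prior over models
        simulated = simulators[name](rng)   # simulate under that model
        if abs(simulated - observed) < epsilon:
            accepted[name] += 1             # keep simulations close to the data
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase epsilon or n_sims")
    return {name: count / total for name, count in accepted.items()}


# Hypothetical stand-in simulators (not real substitution models).
simulators = {
    "empirical": lambda rng: rng.gauss(0.60, 0.05),
    "scs": lambda rng: rng.gauss(0.30, 0.05),
}
posterior = abc_model_selection(
    observed=0.31, simulators=simulators, n_sims=20000, epsilon=0.05
)
print(posterior)
```

The fraction of accepted simulations contributed by each model approximates its posterior probability. This avoids writing down a likelihood function, which is the difficulty the abstract attributes to site-dependent SCS models.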
Affiliation(s)
- David Ferreiro
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Catarina Branco
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
9
Ferreiro D, Khalil R, Sousa SF, Arenas M. Substitution Models of Protein Evolution with Selection on Enzymatic Activity. Mol Biol Evol 2024; 41:msae026. [PMID: 38314876 PMCID: PMC10873502 DOI: 10.1093/molbev/msae026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 01/25/2024] [Accepted: 01/31/2024] [Indexed: 02/07/2024] Open
Abstract
Substitution models of evolution are necessary for diverse evolutionary analyses including phylogenetic tree and ancestral sequence reconstructions. At the protein level, empirical substitution models are traditionally used due to their simplicity, but they ignore the variability of substitution patterns among protein sites. Later, to improve the realism of models of protein evolution, a series of structurally constrained substitution models was introduced, but these usually still ignore constraints on protein activity. Here, we present a substitution model of protein evolution with selection on both protein structure and enzymatic activity, which can be applied to phylogenetics. In particular, the model considers the binding affinity of the enzyme-substrate complex as well as structural constraints that include the flexibility of structural flaps, hydrogen bonds, amino acids backbone radius of gyration, and solvent-accessible surface area that are quantified through molecular dynamics simulations. We applied the model to the HIV-1 protease and evaluated it by phylogenetic likelihood in comparison with the best-fitting empirical substitution model and a structurally constrained substitution model that ignores the enzymatic activity. We found that accounting for selection on the protein activity improves the fit of the modeled functional regions to the real observations, especially in data with high molecular identity, which recommends considering constraints on the protein activity in the development of substitution models of evolution.
Affiliation(s)
- David Ferreiro
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Ruqaiya Khalil
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Sergio F Sousa
  - UCIBIO/REQUIMTE, BioSIM, Departamento de Biomedicina, Faculdade de Medicina da Universidade do Porto, 4200-319 Porto, Portugal
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
10
Costa ISD, Junot T, Silva FL, Felix W, Cardozo Fh JL, Pereira de Araujo AF, Pais do Amaral C, Gonçalves S, Santos NC, Leite JRSA, Bloch C, Brand GD. Occurrence and evolutionary conservation analysis of α-helical cationic amphiphilic segments in the human proteome. FEBS J 2024; 291:547-565. [PMID: 37945538 DOI: 10.1111/febs.16997] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/07/2023] [Revised: 09/14/2023] [Accepted: 10/20/2023] [Indexed: 11/12/2023]
Abstract
The existence of encrypted fragments with antimicrobial activity in human proteins has been thoroughly demonstrated in the literature. Recently, algorithms for the large-scale identification of these segments in whole proteomes were developed, and the pervasiveness of this phenomenon was stated. These algorithms typically mine encrypted cationic and amphiphilic segments of proteins, which, when synthesized as individual polypeptide sequences, exert antimicrobial activity by membrane disruption. In the present report, the human reference proteome was submitted to the software kamal for the uncovering of protein segments that correspond to putative intragenic antimicrobial peptides (IAPs). The assessment of the identity of these segments, frequency, functional classes of parent proteins, structural relevance, and evolutionary conservation of amino acid residues within their corresponding proteins was conducted in silico. Additionally, the antimicrobial and anticancer activity of six selected synthetic peptides was evaluated. Our results indicate that cationic and amphiphilic segments can be found in 2% of all human proteins, but are more common in transmembrane and peripheral membrane proteins. These segments are surface-exposed basic patches whose amino acid residues present similar conservation scores to other residues with similar solvent accessibility. Moreover, the antimicrobial and anticancer activity of the synthetic putative IAP sequences was irrespective of whether these are associated with membranes in the cellular setting. Our study discusses these findings in light of the current understanding of encrypted peptide sequences, offering some insights into the relevance of these segments to the organism in the context of their harboring proteins or as separate polypeptide sequences.
Affiliation(s)
- Igor S D Costa
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Tiago Junot
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Fernanda L Silva
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
- Wanessa Felix
  - Núcleo de Pesquisa em Morfologia e Imunologia Aplicada - NuPMIA, Faculdade de Medicina, Universidade de Brasília, Brazil
- José L Cardozo Fh
  - Laboratório de Espectrometria de Massa - LEM, Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil
- Antonio F Pereira de Araujo
  - Laboratório de Biofísica Teórica e Computacional, Departamento de Biologia Celular, Universidade de Brasília, Brazil
- Sónia Gonçalves
  - Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Portugal
- Nuno C Santos
  - Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Portugal
- José R S A Leite
  - Núcleo de Pesquisa em Morfologia e Imunologia Aplicada - NuPMIA, Faculdade de Medicina, Universidade de Brasília, Brazil
- Carlos Bloch
  - Laboratório de Espectrometria de Massa - LEM, Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil
- Guilherme D Brand
  - Laboratório de Síntese e Análise de Biomoléculas - LSAB, Instituto de Química, Universidade de Brasília, Brazil
11
Shen L, Liu Y, Chen L, Lei T, Ren P, Ji M, Song W, Lin H, Su W, Wang S, Rooman M, Pucci F. Genomic basis of environmental adaptation in the widespread poly-extremophilic Exiguobacterium group. ISME J 2024; 18:wrad020. [PMID: 38365240 PMCID: PMC10837837 DOI: 10.1093/ismejo/wrad020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 12/04/2023] [Accepted: 12/05/2023] [Indexed: 02/18/2024]
Abstract
Delineating cohesive ecological units and determining the genetic basis for their environmental adaptation are among the most important objectives in microbiology. In the last decade, many studies have been devoted to characterizing the genetic diversity in microbial populations to address these issues. However, the impact of extreme environmental conditions, such as temperature and salinity, on microbial ecology and evolution remains unclear. In order to better understand the mechanisms of adaptation, we studied the (pan)genome of Exiguobacterium, a poly-extremophile bacterium able to grow in a wide range of environments, from permafrost to hot springs. To obtain genomes for all known Exiguobacterium type strains, we first sequenced those that were not yet available. Using a reverse-ecology approach, we showed how the integration of phylogenomic information, genomic features, gene and pathway enrichment data, regulatory element analyses, protein amino acid composition, and protein structure analyses of the entire Exiguobacterium pangenome allows the sharp delineation of ecological units consisting of mesophilic, psychrophilic, halophilic-mesophilic, and halophilic-thermophilic ecotypes. This in-depth study clarified the genetic basis of the defined ecotypes and identified some key mechanisms driving adaptation to extreme environments. Our study points the way to organizing the vast microbial diversity into ecologically meaningful units, which, in turn, provides insight into how microbial communities adapt and respond to different environmental conditions in a changing world.
Collapse
Affiliation(s)
- Liang Shen
  - College of Life Sciences, Anhui Normal University, Wuhu 241000, China
  - Anhui Provincial Key Laboratory of Molecular Enzymology and Mechanism of Major Diseases, and Anhui Provincial Engineering Research Centre for Molecular Detection and Diagnostics, Anhui Normal University, Wuhu 241000, China
- Yongqin Liu
  - Center for the Pan-Third Pole Environment, Lanzhou University, Lanzhou 730000, China
- Liangzhong Chen
  - College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Tingting Lei
  - College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Ping Ren
  - College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Mukan Ji
  - Center for the Pan-Third Pole Environment, Lanzhou University, Lanzhou 730000, China
- Weizhi Song
  - Centre for Marine Bio-Innovation, University of New South Wales, Sydney, NSW 2052, Australia
- Hao Lin
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wei Su
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Sheng Wang
  - Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
12
Xie WJ, Liu D, Wang X, Zhang A, Wei Q, Nandi A, Dong S, Warshel A. Enhancing luciferase activity and stability through generative modeling of natural enzyme sequences. Proc Natl Acad Sci U S A 2023; 120:e2312848120. [PMID: 37983512 PMCID: PMC10691223 DOI: 10.1073/pnas.2312848120] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/26/2023] [Accepted: 10/09/2023] [Indexed: 11/22/2023]
Abstract
The availability of natural protein sequences synergized with generative AI provides new paradigms to engineer enzymes. Although active enzyme variants with numerous mutations have been designed using generative models, their performance often falls short of their wild type counterparts. Additionally, in practical applications, choosing fewer mutations that can rival the efficacy of extensive sequence alterations is usually more advantageous. Pinpointing beneficial single mutations continues to be a formidable task. In this study, using the generative maximum entropy model to analyze Renilla luciferase (RLuc) homologs, and in conjunction with biochemistry experiments, we demonstrated that natural evolutionary information could be used to predictively improve enzyme activity and stability by engineering the active center and protein scaffold, respectively. The success rate to improve either luciferase activity or stability of designed single mutants is ~50%. This finding highlights nature's ingenious approach to evolving proficient enzymes, wherein diverse evolutionary pressures are preferentially applied to distinct regions of the enzyme, ultimately culminating in an overall high performance. We also reveal an evolutionary preference in RLuc toward emitting blue light that holds advantages in terms of water penetration compared to other light spectra. Taken together, our approach facilitates navigation through enzyme sequence space and offers effective strategies for computer-aided rational enzyme engineering.
Affiliation(s)
- Wen Jun Xie
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, Genetics Institute, University of Florida, Gainesville, FL 32610
- Dangliang Liu
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Xiaoya Wang
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Aoxuan Zhang
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089
- Qijia Wei
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Ashim Nandi
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089
- Suwei Dong
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Arieh Warshel
  - Department of Chemistry, University of Southern California, Los Angeles, CA 90089
13
Cao W, Wu LY, Xia XY, Chen X, Wang ZX, Pan XM. A sequence-based evolutionary distance method for Phylogenetic analysis of highly divergent proteins. Sci Rep 2023; 13:20304. [PMID: 37985846 PMCID: PMC10662474 DOI: 10.1038/s41598-023-47496-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 11/14/2023] [Indexed: 11/22/2023]
Abstract
Because of the limited effectiveness of prevailing phylogenetic methods when applied to highly divergent protein sequences, the phylogenetic analysis problem remains challenging. Here, we propose a sequence-based evolutionary distance algorithm termed sequence distance (SD), which innovatively incorporates site-to-site correlation within protein sequences into the distance estimation. In protein superfamilies, SD can effectively distinguish evolutionary relationships both within and between protein families, producing phylogenetic trees that closely align with those based on structural information, even with sequence identity less than 20%. SD is highly correlated with the similarity of the protein structure, and can calculate evolutionary distances for thousands of protein pairs within seconds using a single CPU, which is significantly faster than most protein structure prediction methods that demand high computational resources and long run times. The development of SD will significantly advance phylogenetics, providing researchers with a more accurate and reliable tool for exploring evolutionary relationships.
Affiliation(s)
- Wei Cao
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Lu-Yun Wu
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Xia-Yu Xia
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Xiang Chen
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Zhi-Xin Wang
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
- Xian-Ming Pan
  - Key Laboratory of Ministry of Education for Protein Science, School of Life Sciences, Tsinghua University, Beijing, 100084, China
14
Bradley D, Hogrebe A, Dandage R, Dubé AK, Leutert M, Dionne U, Chang A, Villén J, Landry CR. The fitness cost of spurious phosphorylation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.08.561337. [PMID: 37873463 PMCID: PMC10592693 DOI: 10.1101/2023.10.08.561337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
The fidelity of signal transduction requires the binding of regulatory molecules to their cognate targets. However, the crowded cell interior risks off-target interactions between proteins that are functionally unrelated. How such off-target interactions impact fitness is not generally known, but quantifying this is required to understand the constraints faced by cell systems as they evolve. Here, we use the model organism S. cerevisiae to inducibly express tyrosine kinases. Because yeast lacks bona fide tyrosine kinases, most of the resulting tyrosine phosphorylation is spurious. This provides a suitable system to measure the impact of artificial protein interactions on fitness. We engineered 44 yeast strains each expressing a tyrosine kinase, and quantitatively analysed their phosphoproteomes. This analysis resulted in ~30,000 phosphosites mapping to ~3,500 proteins. Examination of the fitness costs in each strain revealed a strong correlation between the number of spurious pY sites and decreased growth. Moreover, the analysis of pY effects on protein structure and on protein function revealed over 1000 pY events that we predict to be deleterious. However, we also find that a large number of the spurious pY sites have a negligible effect on fitness, possibly because of their low stoichiometry. This result is consistent with our evolutionary analyses demonstrating a lack of phosphotyrosine counter-selection in species with bona fide tyrosine kinases. Taken together, our results suggest that, alongside the risk for toxicity, the cell can tolerate a large degree of non-functional crosstalk as interaction networks evolve.
Affiliation(s)
- David Bradley
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexander Hogrebe
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Rohan Dandage
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexandre K Dubé
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Mario Leutert
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
  - Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
- Ugo Dionne
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
- Alexis Chang
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Judit Villén
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Christian R Landry
  - Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
  - Department of Biochemistry, Microbiology and Bioinformatics, Université Laval, Québec, QC, Canada
  - Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université du Québec à Montréal, Montréal, QC, Canada
  - Université Laval Big Data Research Center (BDRC_UL), Québec, QC, Canada
  - Department of Biology, Université Laval, Québec, QC, Canada
15
Xie WJ, Liu D, Wang X, Zhang A, Wei Q, Nandi A, Dong S, Warshel A. Enhancing Luciferase Activity and Stability through Generative Modeling of Natural Enzyme Sequences. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.18.558367. [PMID: 37786693 PMCID: PMC10541610 DOI: 10.1101/2023.09.18.558367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
The availability of natural protein sequences synergized with generative artificial intelligence (AI) provides new paradigms to create enzymes. Although active enzyme variants with numerous mutations have been produced using generative models, their performance often falls short compared to their wild-type counterparts. Additionally, in practical applications, choosing fewer mutations that can rival the efficacy of extensive sequence alterations is usually more advantageous. Pinpointing beneficial single mutations continues to be a formidable task. In this study, using the generative maximum entropy model to analyze Renilla luciferase homologs, and in conjunction with biochemistry experiments, we demonstrated that natural evolutionary information could be used to predictively improve enzyme activity and stability by engineering the active center and protein scaffold, respectively. The success rate of designed single mutants in improving either luciferase activity or stability is ~50%. This finding highlights nature's ingenious approach to evolving proficient enzymes, wherein diverse evolutionary pressures are preferentially applied to distinct regions of the enzyme, ultimately culminating in an overall high performance. We also reveal an evolutionary preference in Renilla luciferase towards emitting blue light that holds advantages in terms of water penetration compared to other light spectra. Taken together, our approach facilitates navigation through enzyme sequence space and offers effective strategies for computer-aided rational enzyme engineering.
Affiliation(s)
- Wen Jun Xie
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
  - Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development (CNPD3), Genetics Institute, University of Florida, Gainesville, FL, USA
- Dangliang Liu
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, and School of Pharmaceutical Sciences, Peking University, Beijing, China
- Xiaoya Wang
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, and School of Pharmaceutical Sciences, Peking University, Beijing, China
- Aoxuan Zhang
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
- Qijia Wei
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, and School of Pharmaceutical Sciences, Peking University, Beijing, China
- Ashim Nandi
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
- Suwei Dong
  - State Key Laboratory of Natural and Biomimetic Drugs, Chemical Biology Center, and School of Pharmaceutical Sciences, Peking University, Beijing, China
- Arieh Warshel
  - Department of Chemistry, University of Southern California, Los Angeles, CA, USA
16
Man J, Harrington TA, Lally K, Bartlett ME. Asymmetric Evolution of Protein Domains in the Leucine-Rich Repeat Receptor-Like Kinase Family of Plant Signaling Proteins. Mol Biol Evol 2023; 40:msad220. [PMID: 37787619 PMCID: PMC10588794 DOI: 10.1093/molbev/msad220] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/29/2023] [Revised: 08/29/2023] [Accepted: 09/26/2023] [Indexed: 10/04/2023]
Abstract
The coding sequences of developmental genes are expected to be deeply conserved, with cis-regulatory change driving the modulation of gene function. In contrast, proteins with roles in defense are expected to evolve rapidly, in molecular arms races with pathogens. However, some gene families include both developmental and defense genes. In these families, do the tempo and mode of evolution differ between genes with divergent functions, despite shared ancestry and structure? The leucine-rich repeat receptor-like kinase (LRR-RLK) protein family includes members with roles in plant development and defense, thus providing an ideal system for answering this question. LRR-RLKs are receptors that traverse plasma membranes. LRR domains bind extracellular ligands; RLK domains initiate intracellular signaling cascades in response to ligand binding. In LRR-RLKs with roles in defense, LRR domains evolve faster than RLK domains. To determine whether this asymmetry extends to LRR-RLKs that function primarily in development, we assessed evolutionary rates and tested for selection acting on 11 subfamilies of LRR-RLKs, using deeply sampled protein trees. To assess functional evolution, we performed heterologous complementation assays in Arabidopsis thaliana (Arabidopsis). We found that the LRR domains of all tested LRR-RLK proteins evolved faster than their cognate RLK domains. All tested subfamilies of LRR-RLKs had strikingly similar patterns of molecular evolution, despite divergent functions. Heterologous transformation experiments revealed that multiple mechanisms likely contribute to the evolution of LRR-RLK function, including escape from adaptive conflict. Our results indicate specific and distinct evolutionary pressures acting on LRR versus RLK domains, despite diverse organismal roles for LRR-RLK proteins.
Affiliation(s)
- Jarrett Man
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- T A Harrington
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- Kyra Lally
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- Madelaine E Bartlett
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
17
Basalla JL, Mak CA, Byrne JA, Ghalmi M, Hoang Y, Vecchiarelli AG. Dissecting the phase separation and oligomerization activities of the carboxysome positioning protein McdB. eLife 2023; 12:e81362. [PMID: 37668016 PMCID: PMC10554743 DOI: 10.7554/elife.81362] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/24/2022] [Accepted: 09/01/2023] [Indexed: 09/06/2023]
Abstract
Across bacteria, protein-based organelles called bacterial microcompartments (BMCs) encapsulate key enzymes to regulate their activities. The model BMC is the carboxysome that encapsulates enzymes for CO2 fixation to increase efficiency and is found in many autotrophic bacteria, such as cyanobacteria. Despite their importance in the global carbon cycle, little is known about how carboxysomes are spatially regulated. We recently identified the two-factor system required for the maintenance of carboxysome distribution (McdAB). McdA drives the equal spacing of carboxysomes via interactions with McdB, which associates with carboxysomes. McdA is a ParA/MinD ATPase, a protein family well studied in positioning diverse cellular structures in bacteria. However, the adaptor proteins like McdB that connect these ATPases to their cargos are extremely diverse. In fact, McdB represents a completely unstudied class of proteins. Despite the diversity, many adaptor proteins undergo phase separation, but functional roles remain unclear. Here, we define the domain architecture of McdB from the model cyanobacterium Synechococcus elongatus PCC 7942, and dissect its mode of biomolecular condensate formation. We identify an N-terminal intrinsically disordered region (IDR) that modulates condensate solubility, a central coiled-coil dimerizing domain that drives condensate formation, and a C-terminal domain that trimerizes McdB dimers and provides increased valency for condensate formation. We then identify critical basic residues in the IDR, which we mutate to glutamines to solubilize condensates. Finally, we find that a condensate-defective mutant of McdB has altered association with carboxysomes and influences carboxysome enzyme content. The results have broad implications for understanding spatial organization of BMCs and the molecular grammar of protein condensates.
Affiliation(s)
- Joseph L Basalla
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
- Claudia A Mak
  - Department of Biological Chemistry, University of Michigan-Ann Arbor, Ann Arbor, United States
- Jordan A Byrne
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
- Maria Ghalmi
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
- Y Hoang
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
- Anthony G Vecchiarelli
  - Department of Molecular, Cellular, and Developmental Biology, University of Michigan-Ann Arbor, Ann Arbor, United States
18
Ho W, Huang H, Huang J. IFF: Identifying key residues in intrinsically disordered regions of proteins using machine learning. Protein Sci 2023; 32:e4739. [PMID: 37498545 PMCID: PMC10443345 DOI: 10.1002/pro.4739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 06/21/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023]
Abstract
Conserved residues in protein homolog sequence alignments are structurally or functionally important. For intrinsically disordered proteins or proteins with intrinsically disordered regions (IDRs), however, alignment often fails because they lack a steric structure to constrain evolution. Although sequences vary, the physicochemical features of IDRs may be preserved in maintaining function. Therefore, a method to retrieve common IDR features may help identify functionally important residues. We applied unsupervised contrastive learning to train a model with self-attention neuronal networks on human IDR orthologs. Parameters in the model were trained to match sequences in ortholog pairs but not in other IDRs. The trained model successfully identifies previously reported critical residues from experimental studies, especially those with an overall pattern (e.g., multiple aromatic residues or charged blocks) rather than short motifs. This predictive model can be used to identify potentially important residues in other proteins, improving our understanding of their functions. The trained model can be run directly from the Jupyter Notebook in the GitHub repository using Binder (mybinder.org). The only required input is the primary sequence. The training scripts are available on GitHub (https://github.com/allmwh/IFF). The training datasets have been deposited in an Open Science Framework repository (https://osf.io/jk29b).
Affiliation(s)
- Wen‐Lin Ho
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Hsuan‐Cheng Huang
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Jie‐rong Huang
  - Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
19
Cui L, Cheng H, Yang Z, Xia C, Zhang L, Kong X. Comparative Analysis Reveals Different Evolutionary Fates and Biological Functions in Wheat Duplicated Genes (Triticum aestivum L.). PLANTS (BASEL, SWITZERLAND) 2023; 12:3021. [PMID: 37687268 PMCID: PMC10489728 DOI: 10.3390/plants12173021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 08/20/2023] [Accepted: 08/21/2023] [Indexed: 09/10/2023]
Abstract
Wheat (Triticum aestivum L.) is a staple food crop that provides 20% of total human calorie consumption. Gene duplication has been considered to play an important role in evolution by providing new genetic resources. However, the evolutionary fates and biological functions of the duplicated genes in wheat remain to be elucidated. In this study, the resulting data showed that the duplicated genes evolved faster with shorter gene lengths, higher codon usage bias, lower expression levels, and higher tissue specificity when compared to non-duplicated genes. Our analysis further revealed functions of duplicated genes in various biological processes with significant enrichment to environmental stresses. In addition, duplicated genes derived from dispersed, proximal, tandem, transposed, and whole-genome duplication differed in abundance, evolutionary rate, gene compactness, expression pattern, and genetic diversity. Tandem and proximal duplicates experienced stronger selective pressure and showed a more compact gene structure with diverse expression profiles than other duplication modes. Moreover, genes derived from different duplication modes showed an asymmetrical evolutionary pattern for wheat A, B, and D subgenomes. Several candidate duplication hotspots associated with wheat domestication or polyploidization were characterized as potential targets for wheat molecular breeding. Our comprehensive analysis revealed the evolutionary trajectory of duplicated genes and laid the foundation for future functional studies on wheat.
Affiliation(s)
- Licao Cui
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang 330045, China
- Hao Cheng
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
  - State Key Laboratory of Crop Stress Biology for Arid Areas, College of Life Sciences, Northwest A&F University, Yangling 712100, China
- Zhe Yang
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
- Chuan Xia
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
- Lichao Zhang
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
- Xiuying Kong
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; (L.C.); (H.C.); (Z.Y.); (C.X.); (L.Z.)
20
Prillo S, Deng Y, Boyeau P, Li X, Chen PY, Song YS. CherryML: scalable maximum likelihood estimation of phylogenetic models. Nat Methods 2023; 20:1232-1236. [PMID: 37386188 PMCID: PMC10644697 DOI: 10.1038/s41592-023-01917-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/21/2022] [Accepted: 05/18/2023] [Indexed: 07/01/2023]
Abstract
Phylogenetic models of molecular evolution are central to numerous biological applications spanning diverse timescales, from hundreds of millions of years involving orthologous proteins to just tens of days relating to single cells within an organism. A fundamental problem in these applications is estimating model parameters, for which maximum likelihood estimation is typically employed. Unfortunately, maximum likelihood estimation is a computationally expensive task, in some cases prohibitively so. To address this challenge, we here introduce CherryML, a broadly applicable method that achieves several orders of magnitude speedup by using a quantized composite likelihood over cherries in the trees. The massive speedup offered by our method should enable researchers to consider more complex and biologically realistic models than previously possible. Here we demonstrate CherryML's utility by applying it to estimate a general 400 × 400 rate matrix for residue-residue coevolution at contact sites in three-dimensional protein structures; we estimate that using current state-of-the-art methods such as the expectation-maximization algorithm for the same task would take >100,000 times longer.
Affiliation(s)
- Sebastian Prillo
  - Computer Science Division, University of California, Berkeley, CA, USA
- Yun Deng
  - Graduate Group in Computational Biology, University of California, Berkeley, CA, USA
- Pierre Boyeau
  - Computer Science Division, University of California, Berkeley, CA, USA
- Xingyu Li
  - Computer Science Division, University of California, Berkeley, CA, USA
- Po-Yen Chen
  - Computer Science Division, University of California, Berkeley, CA, USA
- Yun S Song
  - Computer Science Division, University of California, Berkeley, CA, USA
  - Department of Statistics, University of California, Berkeley, CA, USA
21
Nagar N, Tubiana J, Loewenthal G, Wolfson HJ, Ben Tal N, Pupko T. EvoRator2: Predicting Site-specific Amino Acid Substitutions Based on Protein Structural Information Using Deep Learning. J Mol Biol 2023; 435:168155. [PMID: 37356902 DOI: 10.1016/j.jmb.2023.168155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Revised: 05/13/2023] [Accepted: 05/17/2023] [Indexed: 06/27/2023]
Abstract
Multiple sequence alignments (MSAs) are the workhorse of molecular evolution and structural biology research. From MSAs, the amino acids that are tolerated at each site during protein evolution can be inferred. However, little is known regarding the repertoire of tolerated amino acids in proteins when only a few or no sequence homologs are available, such as orphan and de novo designed proteins. Here we present EvoRator2, a deep-learning algorithm trained on over 15,000 protein structures that can predict which amino acids are tolerated at any given site, based exclusively on protein structural information mined from atomic coordinate files. We show that EvoRator2 obtained satisfying results for the prediction of position-weighted scoring matrices (PSSM). We further show that EvoRator2 obtained near state-of-the-art performance on proteins with high quality structures in predicting the effect of mutations in deep mutation scanning (DMS) experiments and that for certain DMS targets, EvoRator2 outperformed state-of-the-art methods. We also show that by combining EvoRator2's predictions with those obtained by a state-of-the-art deep-learning method that accounts for the information in the MSA, the prediction of the effect of mutation in DMS experiments was improved in terms of both accuracy and stability. EvoRator2 is designed to predict which amino-acid substitutions are tolerated in such proteins without many homologous sequences, including orphan or de novo designed proteins. We implemented our approach in the EvoRator web server (https://evorator.tau.ac.il).
Affiliation(s)
- Natan Nagar
  - The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Jérôme Tubiana
  - Blavatnik School of Computer Science, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Gil Loewenthal
  - The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Haim J Wolfson
  - Blavatnik School of Computer Science, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Nir Ben Tal
  - School of Neurobiology, Biochemistry & Biophysics, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Tal Pupko
  - The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
Collapse
|
22
|
Ali F. Patterns of change in nucleotide diversity over gene length. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.13.548940. [PMID: 37503020 PMCID: PMC10369989 DOI: 10.1101/2023.07.13.548940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Nucleotide diversity at a site is influenced by the relative strengths of neutral and selective population genetic processes. Therefore, attempts to identify sites under positive selection require an understanding of the expected diversity in its absence. The nucleotide diversity of a gene was previously found to correlate with its length. In this work, I measure nucleotide diversity at synonymous sites and uncover a pattern of low diversity towards the translation initiation site (TIS) of a gene. The degree of reduction in diversity at the TIS and the length of this region of reduced diversity can be quantified as "Effect Size" and "Effect Length" respectively, using parameters of an asymptotic regression model. Estimates of Effect Length across bacteria covaried with recombination rates as well as with a multitude of fast-growth adaptations such as the avoidance of mRNA secondary structure around TIS, the number of rRNAs, and relative codon usage of ribosomal genes. Thus, the dependence of nucleotide diversity on gene length is governed by a combination of selective and non-selective processes. These results have implications for the estimation of effective population size and relative mutation rates based on "silent-site" diversity, and for pN/pS-based prediction of genes under selection.
Affiliation(s)
- Farhan Ali
  - Biodesign Institute, Arizona State University, Tempe, Arizona
23
Del Amparo R, Arenas M. Influence of substitution model selection on protein phylogenetic tree reconstruction. Gene 2023; 865:147336. [PMID: 36871672 DOI: 10.1016/j.gene.2023.147336] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/04/2023] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Probabilistic phylogenetic tree reconstruction is traditionally performed under a best-fitting substitution model of molecular evolution previously selected according to diverse statistical criteria. Interestingly, some recent studies proposed that this procedure is unnecessary for phylogenetic tree reconstruction leading to a debate in the field. In contrast to DNA sequences, phylogenetic tree reconstruction from protein sequences is traditionally based on empirical exchangeability matrices that can differ among taxonomic groups and protein families. Considering this aspect, here we investigated the influence of selecting a substitution model of protein evolution on phylogenetic tree reconstruction by the analyses of real and simulated data. We found that phylogenetic tree reconstructions based on a selected best-fitting substitution model of protein evolution are the most accurate, in terms of topology and branch lengths, compared with those derived from substitution models with amino acid replacement matrices far from the selected best-fitting model, especially when the data has large genetic diversity. Indeed, we found that substitution models with similar amino acid replacement matrices produce similar reconstructed phylogenetic trees, suggesting the use of substitution models as similar as possible to a selected best-fitting model when the latter cannot be used. Therefore, we recommend the use of the traditional protocol of selection among substitution models of evolution for protein phylogenetic tree reconstruction.
Affiliation(s)
- Roberto Del Amparo
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain; Galicia Sur Health Research Institute (IIS Galicia Sur), 36310 Vigo, Spain
24
Bricout R, Weil D, Stroebel D, Genovesio A, Roest Crollius H. Evolution is not Uniform Along Coding Sequences. Mol Biol Evol 2023; 40:7060063. [PMID: 36857092 PMCID: PMC10025431 DOI: 10.1093/molbev/msad042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 03/02/2023]
Abstract
Amino acids evolve at different speeds within protein sequences, because their functional and structural roles are different. Notably, amino acids located at the surface of proteins are known to evolve more rapidly than those in the core. In particular, amino acids at the N- and C-termini of protein sequences are likely to be more exposed than those at the core of the folded protein due to their location in the peptidic chain, and they are known to be less structured. For these reasons, we would expect amino acids located at protein termini to evolve faster than residues located inside the chain. Here we tested this hypothesis and found that amino acids evolve almost twice as fast at protein termini compared with those in the center, hinting at a strong topological bias along the sequence length. We further show that the distribution of solvent-accessible residues and functional domains in proteins readily explains how structural and functional constraints are weaker at their termini, leading to the observed excess of amino acid substitutions. Finally, we show that the specific evolutionary rates at protein termini may have direct consequences, notably misleading in silico methods used to infer sites under positive selection within genes. These results suggest that accounting for positional information should improve evolutionary models.
Affiliation(s)
- Raphaël Bricout
  - Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Dominique Weil
  - Laboratoire de Biologie du Développement, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine (IBPS), Paris, France
- David Stroebel
  - Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Auguste Genovesio
  - Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
- Hugues Roest Crollius
  - Département de biologie, École normale supérieure, Institut de Biologie de l'ENS (IBENS), CNRS, INSERM, Paris, France
25
Tang M, Liu L, Hu X, Zheng H, Wang Z, Liu Y, Zhu Q, Cui L, Xie S. Genome-wide characterization of R2R3-MYB gene family in Santalum album and their expression analysis under cold stress. FRONTIERS IN PLANT SCIENCE 2023; 14:1142562. [PMID: 36938022 PMCID: PMC10017448 DOI: 10.3389/fpls.2023.1142562] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/11/2023] [Accepted: 02/13/2023] [Indexed: 06/18/2023]
Abstract
Sandalwood (Santalum album) is a high-value multifunctional tree species that is rich in aromatic substances and is used in medicine and global cosmetics. Due to the scarcity of land resources in tropical and subtropical regions, land in temperate regions is a potential resource for the development of S. album plantations in order to meet the needs of S. album production and medicine. The R2R3-MYB transcription factor family is one of the largest in plants and plays an important role in the response to various abiotic stresses. However, the R2R3-MYB gene family of S. album has not been studied. In this study, 144 R2R3-MYB genes were successfully identified in the assembly genome sequence, and their characteristics and expression patterns were investigated under various durations of low temperature stress. According to the findings, 31 of the 114 R2R3-MYB genes showed significant differences in expression after cold treatment. Combining transcriptome and weighted gene co-expression network analysis (WGCNA) revealed three key candidate genes (SaMYB098, SaMYB015, and SaMYB068) to be significantly involved in the regulation of cold resistance in S. album. The structural characteristics, evolution, and expression pattern of the R2R3-MYB gene in S. album were systematically examined at the whole genome level for the first time in this study. It will provide important information for future research into the function of the R2R3-MYB genes and the mechanism of cold stress response in S. album.
Affiliation(s)
- Minqiang Tang
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Le Liu
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Xu Hu
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Haoyue Zheng
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Zukai Wang
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Yi Liu
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Qing Zhu
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
- Licao Cui
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, China
- Shangqian Xie
  - Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), School of Forestry, Hainan University, Haikou, China
26
Queiroz JPF, Lourenzoni MR, Rocha BAM. Structural evolution of an amphibian-specific globin: A computational evolutionary biochemistry approach. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY. PART D, GENOMICS & PROTEOMICS 2023; 45:101055. [PMID: 36566682 DOI: 10.1016/j.cbd.2022.101055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 12/14/2022] [Accepted: 12/15/2022] [Indexed: 12/24/2022]
Abstract
Studies on the globin family are continuously revealing insights into the mechanisms of gene and protein evolution. The rise of a new globin gene type in Pelobatoidea and Neobatrachia (Amphibia:Anura) from an α-globin precursor provides the opportunity to investigate the genetic and physical mechanisms underlying the origin of new protein structural and functional properties. This amphibian-specific globin (globin A/GbA) discovered in the heart of Rana catesbeiana is a monomer. As the ancestral oligomeric state of α-globins is a homodimer, we inferred that the ancestral state was lost somewhere in the GbA lineage. Here, we combined computational molecular evolution with structural bioinformatics to determine the extent to which the loss of the homodimeric state is pervasive in the GbA clade. We also characterized the loci of GbA genes in Bufo bufo. We found two GbA clades in Neobatrachia. One was deleted in Ranidae, but retained and expanded to yield a new globin cluster in Bufonidae species. Loss of the ancestral oligomeric state seems to be pervasive in the GbA clade. However, a taxonomic sampling that includes more Pelobatoidea, as well as early Neobatrachia, lineages would be necessary to determine the oligomeric state of the last common ancestor of all GbA. The evidence presented here points out a possible loss of oligomerization in Pelobatoidea GbA as a result of amino acid substitutions that weaken the homodimeric state. In contrast, the loss of oligomerization in both Neobatrachia GbA clades was linked to independent deletions that disrupted many packing contacts at the homodimer interface.
Affiliation(s)
- João Pedro Fernandes Queiroz
  - Laboratorio de Biocristalografia - LABIC, Departamento de Bioquimica e Biologia Molecular, Universidade Federal do Ceara, Campus do Pici s.n., bloco 907, Av. Mister Hull, Fortaleza, Ceara, 60440-970, Brazil
- Marcos Roberto Lourenzoni
  - Protein Engineering and Health Solutions Group - GEPeSS Fundacao Oswaldo Cruz - Ceara, Eusébio, Ceara, 60175-047, Brazil
- Bruno Anderson Matias Rocha
  - Laboratorio de Biocristalografia - LABIC, Departamento de Bioquimica e Biologia Molecular, Universidade Federal do Ceara, Campus do Pici s.n., bloco 907, Av. Mister Hull, Fortaleza, Ceara, 60440-970, Brazil
27
Kiefl E, Esen OC, Miller SE, Kroll KL, Willis AD, Rappé MS, Pan T, Eren AM. Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution. SCIENCE ADVANCES 2023; 9:eabq4632. [PMID: 36812328 DOI: 10.1126/sciadv.abq4632] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/09/2022] [Accepted: 01/18/2023] [Indexed: 06/18/2023]
Abstract
Comprehensive sampling of natural genetic diversity with metagenomics enables highly resolved insights into the interplay between ecology and evolution. However, resolving adaptive, neutral, or purifying processes of evolution from intrapopulation genomic variation remains a challenge, partly due to the sole reliance on gene sequences to interpret variants. Here, we describe an approach to analyze genetic variation in the context of predicted protein structures and apply it to a marine microbial population within the SAR11 subclade 1a.3.V, which dominates low-latitude surface oceans. Our analyses reveal a tight association between genetic variation and protein structure. In a central gene in nitrogen metabolism, we observe decreased occurrence of nonsynonymous variants from ligand-binding sites as a function of nitrate concentrations, revealing genetic targets of distinct evolutionary pressures maintained by nutrient availability. Our work yields insights into the governing principles of evolution and enables structure-aware investigations of microbial population genetics.
Affiliation(s)
- Evan Kiefl
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
- Ozcan C Esen
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Samuel E Miller
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Kourtney L Kroll
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
- Amy D Willis
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Michael S Rappé
  - Hawai'i Institute of Marine Biology, University of Hawai'i at Mānoa, Kāne'ohe, HI 96822, USA
- Tao Pan
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637, USA
- A Murat Eren
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543, USA
  - Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Oldenburg, Germany
  - Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany
  - Helmholtz Institute for Functional Marine Biodiversity, Oldenburg, Germany
28
Tao W, Li R, Li T, Li Z, Li Y, Cui L. The evolutionary patterns, expression profiles, and genetic diversity of expanded genes in barley. FRONTIERS IN PLANT SCIENCE 2023; 14:1168124. [PMID: 37180392 PMCID: PMC10171312 DOI: 10.3389/fpls.2023.1168124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 03/28/2023] [Indexed: 05/16/2023]
Abstract
Gene duplication resulting from whole-genome duplication (WGD), small-scale duplication (SSD), or unequal hybridization plays an important role in the expansion of gene families. Gene family expansion can also mediate species formation and adaptive evolution. Barley (Hordeum vulgare) is the world's fourth largest cereal crop, and it contains valuable genetic resources due to its ability to tolerate various types of environmental stress. In this study, 27,438 orthogroups in the genomes of seven Poaceae were identified, and 214 of them were significantly expanded in barley. The evolutionary rates, gene properties, expression profiles, and nucleotide diversity between expanded and non-expanded genes were compared. Expanded genes evolved more rapidly and experienced lower negative selection. Expanded genes, including their exons and introns, were shorter, they had fewer exons, their GC content was lower, and their first exons were longer compared with non-expanded genes. Codon usage bias was also lower for expanded genes than for non-expanded genes; the expression levels of expanded genes were lower than those of non-expanded genes, and the expression of expanded genes showed higher tissue specificity than that of non-expanded genes. Several stress-response-related genes/gene families were identified, and these genes could be used to breed barley plants with greater resistance to environmental stress. Overall, our analysis revealed evolutionary, structural, and functional differences between expanded and non-expanded genes in barley. Additional studies are needed to clarify the functions of the candidate genes identified in our study and evaluate their utility for breeding barley plants with greater stress resistance.
Affiliation(s)
- Wenjing Tao
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, Fujian, China
- Ruiying Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Tingting Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Zhimin Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Yihan Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
  - *Correspondence: Yihan Li; Licao Cui
- Licao Cui
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
  - *Correspondence: Yihan Li; Licao Cui
29
Adam PS, Kolyfetis GE, Bornemann TLV, Vorgias CE, Probst AJ. Genomic remnants of ancestral methanogenesis and hydrogenotrophy in Archaea drive anaerobic carbon cycling. SCIENCE ADVANCES 2022; 8:eabm9651. [PMID: 36332026 PMCID: PMC9635834 DOI: 10.1126/sciadv.abm9651] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/25/2021] [Accepted: 09/19/2022] [Indexed: 05/19/2023]
Abstract
Anaerobic methane metabolism is among the hallmarks of Archaea, originating very early in their evolution. Here, we show that the ancestor of methane metabolizers was an autotrophic CO2-reducing hydrogenotrophic methanogen that possessed the two main complexes, methyl-CoM reductase (Mcr) and tetrahydromethanopterin-CoM methyltransferase (Mtr), the anaplerotic hydrogenases Eha and Ehb, and a set of other genes collectively called "methanogenesis markers" but could not oxidize alkanes. Overturning recent inferences, we demonstrate that methyl-dependent hydrogenotrophic methanogenesis has emerged multiple times independently, either due to a loss of Mtr while Mcr is inherited vertically or from an ancient lateral acquisition of Mcr. Even if Mcr is lost, Mtr, Eha, Ehb, and the markers can persist, resulting in mixotrophic metabolisms centered around the Wood-Ljungdahl pathway. Through their methanogenesis remnants, Thorarchaeia and two newly reconstructed order-level lineages in Archaeoglobi and Bathyarchaeia act as metabolically versatile players in carbon cycling of anoxic environments across the globe.
Affiliation(s)
- Panagiotis S. Adam
  - Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
  - Corresponding author.
- George E. Kolyfetis
  - Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15784 Athens, Greece
- Till L. V. Bornemann
  - Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
- Constantinos E. Vorgias
  - Department of Biochemistry and Molecular Biology, Faculty of Biology, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 15784 Athens, Greece
- Alexander J. Probst
  - Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
  - Centre for Water and Environmental Research (ZWU), University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
  - Research Center One Health Ruhr, Research Alliance Ruhr, Environmental Metagenomics, University of Duisburg-Essen, Universitätsstraße 5, 45141 Essen, Germany
30
Behrendt A, Golchin P, König F, Mulnaes D, Stalke A, Dröge C, Keitel V, Gohlke H. Vasor: Accurate prediction of variant effects for amino acid substitutions in multidrug resistance protein 3. Hepatol Commun 2022; 6:3098-3111. [PMID: 36111625 PMCID: PMC9592774 DOI: 10.1002/hep4.2088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Revised: 07/26/2022] [Accepted: 08/16/2022] [Indexed: 12/14/2022]
Abstract
The phosphatidylcholine floppase multidrug resistance protein 3 (MDR3) is an essential hepatobiliary transport protein. MDR3 dysfunction is associated with various liver diseases, ranging from severe progressive familial intrahepatic cholestasis to transient forms of intrahepatic cholestasis of pregnancy and familial gallstone disease. Single amino acid substitutions are often found to cause dysfunction, but identifying the effect of a substitution in in vitro studies is time- and cost-intensive. We developed variant assessor of MDR3 (Vasor), a machine learning-based model to classify novel MDR3 missense variants as benign or pathogenic. Vasor was trained on the largest data set to date that is specific for benign and pathogenic variants of MDR3 and uses general predictors, namely Evolutionary Models of Variant Effects (EVE), EVmutation, PolyPhen-2, I-Mutant2.0, MUpro, MAESTRO, and PON-P2 along with other variant properties, such as half-sphere exposure and posttranslational modification site, as input. Vasor consistently outperformed the integrated general predictors and the external prediction tool MutPred2, leading to the current best prediction performance for MDR3 single-site missense variants (on an external test set: F1-score, 0.90; Matthews correlation coefficient, 0.80). Furthermore, Vasor predictions cover the entire sequence space of MDR3. Vasor is accessible as a webserver at https://cpclab.uni-duesseldorf.de/mdr3_predictor/ for users to rapidly obtain prediction results and a visualization of the substitution site within the MDR3 structure. The MDR3-specific prediction tool Vasor can provide reliable predictions of single-site amino acid substitutions, giving users a fast way to initially assess whether a variant is benign or pathogenic.
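For readers unfamiliar with the two metrics quoted in this abstract, the following minimal sketch shows how an F1-score and a Matthews correlation coefficient are computed from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not the Vasor test set and nothing here reproduces the authors' code.

```python
import math


def f1_and_mcc(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (F1-score, Matthews correlation coefficient) for a binary classifier.

    tp/fp/fn/tn are confusion-matrix counts with "pathogenic" as the positive class.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four cells, so it stays informative when classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Hypothetical, balanced confusion matrix (invented counts, not data from the paper).
f1, mcc = f1_and_mcc(tp=45, fp=5, fn=5, tn=45)
print(f"F1 = {f1:.2f}, MCC = {mcc:.2f}")
```

F1 ignores true negatives, whereas MCC does not, which is one reason the two are commonly reported together for benign-versus-pathogenic classification.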
Affiliation(s)
- Annika Behrendt
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Pegah Golchin
  - Department of Electrical Engineering and Information Technology, Technische Universität Darmstadt, Darmstadt, Germany
- Filip König
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Daniel Mulnaes
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Amelie Stalke
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
  - Division of Kidney, Department of Pediatric Gastroenterology and Hepatology, Liver, and Metabolic Diseases, Hannover Medical School, Hannover, Germany
- Carola Dröge
  - Department for Gastroenterology, Hepatology, and Infectious Diseases, Medical Faculty, Otto von Guericke University, Magdeburg, Germany
  - Department for Gastroenterology, Hepatology, and Infectious Diseases, University Hospital, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Verena Keitel
  - Department for Gastroenterology, Hepatology, and Infectious Diseases, Medical Faculty, Otto von Guericke University, Magdeburg, Germany
  - Department for Gastroenterology, Hepatology, and Infectious Diseases, University Hospital, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  - John-von-Neumann-Institute for Computing, Jülich Supercomputing Center, Institute of Biological Information Processing (IBI-7: Structural Biochemistry), and Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, Jülich, Germany
31
Pollet L, Lambourne L, Xia Y. Structural Determinants of Yeast Protein-Protein Interaction Interface Evolution at the Residue Level. J Mol Biol 2022; 434:167750. [PMID: 35850298 DOI: 10.1016/j.jmb.2022.167750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Revised: 06/09/2022] [Accepted: 07/12/2022] [Indexed: 12/01/2022]
Abstract
Interfaces of contact between proteins play important roles in determining the proper structure and function of protein-protein interactions (PPIs). Therefore, to fully understand PPIs, we need to better understand the evolutionary design principles of PPI interfaces. Previous studies have uncovered that interfacial sites are more evolutionarily conserved than other surface protein sites. Yet, little is known about the nature and relative importance of evolutionary constraints in PPI interfaces. Here, we explore constraints imposed by the structure of the microenvironment surrounding interfacial residues on residue evolutionary rate using a large dataset of over 700 structural models of baker's yeast PPIs. We find that interfacial residues are, on average, systematically more conserved than all other residues with a similar degree of total burial as measured by relative solvent accessibility (RSA). Besides, we find that RSA of the residue when the PPI is formed is a better predictor of interfacial residue evolutionary rate than RSA in the monomer state. Furthermore, we investigate four structure-based measures of residue interfacial involvement, including change in RSA upon binding (ΔRSA), number of residue-residue contacts across the interface, and distance from the center or the periphery of the interface. Integrated modeling for evolutionary rate prediction in interfaces shows that ΔRSA plays a dominant role among the four measures of interfacial involvement, with minor, but independent contributions from other measures. These results yield insight into the evolutionary design of interfaces, improving our understanding of the role that structure plays in the molecular evolution of PPIs at the residue level.
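The relationship between RSA and ΔRSA mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the sign convention (monomer minus complex, so that burial upon binding gives a positive value), and all numeric values are hypothetical. In practice the solvent-accessible surface areas would come from a structure-analysis tool and the maximum area from a published per-residue reference scale.

```python
def relative_solvent_accessibility(sasa: float, max_sasa: float) -> float:
    """RSA: a residue's solvent-accessible surface area divided by its maximum possible area."""
    if max_sasa <= 0:
        raise ValueError("max_sasa must be positive")
    return sasa / max_sasa


def delta_rsa(sasa_monomer: float, sasa_complex: float, max_sasa: float) -> float:
    """Change in RSA upon binding (assumed convention: monomer RSA minus complex RSA)."""
    rsa_monomer = relative_solvent_accessibility(sasa_monomer, max_sasa)
    rsa_complex = relative_solvent_accessibility(sasa_complex, max_sasa)
    return rsa_monomer - rsa_complex


# Hypothetical interfacial residue: areas in square angstroms, invented for illustration.
example = delta_rsa(sasa_monomer=120.0, sasa_complex=30.0, max_sasa=200.0)
print(f"delta RSA = {example:.2f}")
```

Under this convention a residue that is not at the interface has a ΔRSA of zero, and larger values indicate residues that lose more exposed surface when the complex forms.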
Affiliation(s)
- Léah Pollet
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
- Luke Lambourne
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Yu Xia
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, QC, Canada
32
Li L, Li M, Wu J, Yin H, Dunwell JM, Zhang S. Genome-wide identification and comparative evolutionary analysis of sorbitol metabolism pathway genes in four Rosaceae species and three model plants. BMC PLANT BIOLOGY 2022; 22:341. [PMID: 35836134 PMCID: PMC9284748 DOI: 10.1186/s12870-022-03729-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Accepted: 06/29/2022] [Indexed: 06/15/2023]
Abstract
In contrast to most land plant species, sorbitol, instead of sucrose, is the major photosynthetic product in many Rosaceae species. It has been well illustrated that three key functional genes encoding sorbitol-6-phosphate dehydrogenase (S6PDH), sorbitol dehydrogenase (SDH), and sorbitol transporter (SOT), are mainly responsible for the synthesis, degradation and transportation of sorbitol. In this study, the genome-wide identification of S6PDH, SDH and SOT genes was conducted in four Rosaceae species, peach, mei, apple and pear, and showed the sorbitol bio-pathway to be dominant (named sorbitol present group, SPG); another three related species, including tomato, poplar and Arabidopsis, showed a non-sorbitol bio-pathway (named sorbitol absent group, SAG). To understand the evolutionary differences of the three important gene families between SAG and SPG, their corresponding gene duplication, evolutionary rate, codon bias and positive selection patterns have been analyzed and compared. The sorbitol pathway genes in SPG were found to be expanded through dispersed and tandem gene duplications. Branch-specific model analyses revealed SDH and S6PDH clade A were under stronger purifying selection in SPG. A higher frequency of optimal codons was found in S6PDH and SDH than that of SOT in SPG, confirming the purifying selection effect on them. In addition, branch-site model analyses revealed SOT genes were under positive selection in SPG. Expression analyses showed diverse expression patterns of sorbitol-related genes. Overall, these findings provide new insights in the evolutionary characteristics for the three key sorbitol metabolism-related gene families in Rosaceae and other non-sorbitol dominant pathway species.
Affiliation(s)
- Leiting Li
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Shanghai Center for Plant Stress Biology and CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Meng Li
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Juyou Wu
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Hao Yin
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Jim M. Dunwell
- School of Agriculture, Policy and Development, University of Reading, Earley Gate, Reading, UK
- Shaoling Zhang
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, China
|
33
|
Del Amparo R, Arenas M. Consequences of Substitution Model Selection on Protein Ancestral Sequence Reconstruction. Mol Biol Evol 2022; 39:6628884. [PMID: 35789388 PMCID: PMC9254009 DOI: 10.1093/molbev/msac144] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Abstract
The selection of the best-fitting substitution model of molecular evolution is a traditional step for phylogenetic inferences, including ancestral sequence reconstruction (ASR). However, a few recent studies suggested that applying this procedure does not affect the accuracy of phylogenetic tree reconstruction. Here, we revisited this debate topic by analyzing the influence of selection among substitution models of protein evolution, with focus on exchangeability matrices, on the accuracy of ASR using simulated and real data. We found that the selected best-fitting substitution model produces the most accurate ancestral sequences, especially if the data present large genetic diversity. Indeed, ancestral sequences reconstructed under substitution models with similar exchangeability matrices were similar, suggesting that if the selected best-fitting model cannot be used for the reconstruction, applying a model similar to the selected one is preferred. We conclude that selecting among substitution models of protein evolution is recommended for reconstructing accurate ancestral sequences.
Affiliation(s)
- Roberto Del Amparo
- CINBIO, Universidade de Vigo, Vigo, Spain
- Departamento de Bioquímica, Xenética e Immunoloxía, Universidade de Vigo, Vigo, Spain
- Miguel Arenas
- CINBIO, Universidade de Vigo, Vigo, Spain
- Departamento de Bioquímica, Xenética e Immunoloxía, Universidade de Vigo, Vigo, Spain
- Galicia Sur Health Research Institute (IIS Galicia Sur), Vigo, Spain
|
34
|
Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo‐Patiño
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda‐García
- Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
|
35
|
Tao W, Bian J, Tang M, Zeng Y, Luo R, Ke Q, Li T, Li Y, Cui L. Genomic insights into positive selection during barley domestication. BMC PLANT BIOLOGY 2022; 22:267. [PMID: 35641942 PMCID: PMC9158214 DOI: 10.1186/s12870-022-03655-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/19/2022] [Accepted: 05/23/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND: Cultivated barley (Hordeum vulgare) is widely used in animal feed, beverages, and foods and has become a model crop for molecular evolutionary studies. Few studies have examined the evolutionary fates of different types of genes in barley during the domestication process. RESULTS: The rates of nonsynonymous substitution (Ka) to synonymous substitution (Ks) were calculated by comparing orthologous genes in different barley groups (wild vs. landrace and landrace vs. improved cultivar). The rates of evolution, properties, expression patterns, and diversity of positively selected genes (PSGs) and negatively selected genes (NSGs) were compared. PSGs evolved more rapidly, possessed fewer exons, and had lower GC content than NSGs; they were also shorter and had shorter intron, exon, and first exon lengths. Expression levels were lower, the tissue specificity of expression was higher, and codon usage bias was weaker for PSGs than for NSGs. Nucleotide diversity analysis revealed that PSGs have undergone a more severe genetic bottleneck than NSGs. Several candidate PSGs were involved in plant growth and development, which might make them excellent targets for the molecular breeding of barley. CONCLUSIONS: Our comprehensive analysis of the evolutionary, structural, and functional divergence between PSGs and NSGs in barley provides new insight into the evolutionary trajectory of barley during domestication. Our findings also aid future functional studies of PSGs in barley.
Affiliation(s)
- Wenjing Tao
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Jianxin Bian
- Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong, 261325 China
- Minqiang Tang
- College of Forestry, Hainan University, Haikou, Hainan, 570228 China
- Yan Zeng
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Ruihan Luo
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Qinglin Ke
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Tingting Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Yihan Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Licao Cui
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
|
36
|
Patel R, Carnevale V, Kumar S. Epistasis Creates Invariant Sites and Modulates the Rate of Molecular Evolution. Mol Biol Evol 2022; 39:msac106. [PMID: 35575390 PMCID: PMC9156017 DOI: 10.1093/molbev/msac106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
Invariant sites are a common feature of amino acid sequence evolution. The presence of invariant sites is frequently attributed to the need to preserve function through site-specific conservation of amino acid residues. Amino acid substitution models without a provision for invariant sites often fit the data significantly worse than those that allow for an excess of invariant sites beyond those predicted by models that only incorporate rate variation among sites (e.g., a Gamma distribution). An alternative is epistasis between sites to preserve residue interactions that can create invariant sites. Through computer-simulated sequence evolution, we evaluated the relative effects of site-specific preferences and site-site couplings in the generation of invariant sites and the modulation of the rate of molecular evolution. In an analysis of ten major families of protein domains with diverse sequence and functional properties, we find that the negative selection imposed by epistasis creates many more invariant sites than site-specific residue preferences alone. Further, epistasis plays an increasingly larger role in creating invariant sites over longer evolutionary periods. Epistasis also dictates rates of domain evolution over time by exerting significant additional purifying selection to preserve site couplings. These patterns illuminate the mechanistic role of epistasis in the processes underlying observed site invariance and evolutionary rates.
Affiliation(s)
- Ravi Patel
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Vincenzo Carnevale
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
37
|
Gauto DF, Macek P, Malinverni D, Fraga H, Paloni M, Sučec I, Hessel A, Bustamante JP, Barducci A, Schanda P. Functional control of a 0.5 MDa TET aminopeptidase by a flexible loop revealed by MAS NMR. Nat Commun 2022; 13:1927. [PMID: 35395851 PMCID: PMC8993905 DOI: 10.1038/s41467-022-29423-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/12/2021] [Accepted: 03/14/2022] [Indexed: 02/07/2023]
Abstract
Large oligomeric enzymes control a myriad of cellular processes, from protein synthesis and degradation to metabolism. The 0.5 MDa large TET2 aminopeptidase, a prototypical protease important for cellular homeostasis, degrades peptides within a ca. 60 Å wide tetrahedral chamber with four lateral openings. The mechanisms of substrate trafficking and processing remain debated. Here, we integrate magic-angle spinning (MAS) NMR, mutagenesis, co-evolution analysis and molecular dynamics simulations and reveal that a loop in the catalytic chamber is a key element for enzymatic function. The loop is able to stabilize ligands in the active site and may additionally have a direct role in activating the catalytic water molecule whereby a conserved histidine plays a key role. Our data provide a strong case for the functional importance of highly dynamic - and often overlooked - parts of an enzyme, and the potential of MAS NMR to investigate their dynamics at atomic resolution.
Affiliation(s)
- Diego F Gauto
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- ICSN, CNRS UPR2301, Univ. Paris-Saclay, Gif-sur-Yvette, France
- Pavel Macek
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- Celonic AG, Eulerstrasse 55, 4051, Basel, Switzerland
- Duccio Malinverni
- Department of Structural Biology and Center for Data Driven Discovery, St Jude Children's Research Hospital, Memphis, TN, USA
- Hugo Fraga
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- Departamento de Biomedicina, Faculdade de Medicina da Universidade do Porto, Porto, Portugal
- i3S, Instituto de Investigacao e Inovacao em Saude, Universidade do Porto, Porto, Portugal
- Matteo Paloni
- CBS (Centre de Biologie Structurale), Univ Montpellier, CNRS, INSERM, Montpellier, France
- Iva Sučec
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- Audrey Hessel
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- Juan Pablo Bustamante
- Instituto de Bioingenieria y Bioinformatica, IBB (CONICET-UNER), Oro Verde, Entre Rios, Argentina
- Alessandro Barducci
- CBS (Centre de Biologie Structurale), Univ Montpellier, CNRS, INSERM, Montpellier, France
- Paul Schanda
- Univ. Grenoble Alpes, CEA, CNRS, Institut de Biologie Structurale (IBS), 71, Avenue des Martyrs, F-38044, Grenoble, France
- Institute of Science and Technology Austria, Am Campus 1, A-3400, Klosterneuburg, Austria
|
38
|
Secretory quality control constrains functional selection-associated protein structure innovation. Commun Biol 2022; 5:268. [PMID: 35338247 PMCID: PMC8956723 DOI: 10.1038/s42003-022-03220-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/14/2021] [Accepted: 03/03/2022] [Indexed: 12/26/2022]
Abstract
Biophysical models suggest a dominant role of structural over functional constraints in shaping protein evolution. Selection on structural constraints is linked closely to expression levels of proteins, which together with structure-associated activities determine in vivo functions of proteins. Here we show that despite the up to two orders of magnitude differences in levels of C-reactive protein (CRP) in distinct species, the in vivo functions of CRP are paradoxically conserved. Such a pronounced level-function mismatch cannot be explained by activities associated with the conserved native structure, but is coupled to hidden activities associated with the unfolded, activated conformation. This is not the result of selection on structural constraints like foldability and stability, but is achieved by folding determinants-mediated functional selection that keeps a confined carrier structure to pass the stringent eukaryotic quality control on secretion. Further analysis suggests a folding threshold model which may partly explain the mismatch between the vast sequence space and the limited structure space of proteins. The mismatch in the conserved structure but different expression levels of C-reactive protein (CRP) in distinct species is reconciled by functional selection on hidden activities of unfolded CRPs.
|
39
|
Nagar N, Ben Tal N, Pupko T. EvoRator: Prediction of residue-level evolutionary rates from protein structures using machine learning. J Mol Biol 2022; 434:167538. [DOI: 10.1016/j.jmb.2022.167538] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Revised: 03/07/2022] [Accepted: 03/07/2022] [Indexed: 10/18/2022]
|
40
|
Abstract
The spike protein (S-protein) of SARS-CoV-2, the protein that enables the virus to infect human cells, is the basis for many vaccines and a hotspot of concerning virus evolution. Here, we discuss the outstanding progress in structural characterization of the S-protein and how these structures facilitate analysis of virus function and evolution. We emphasize the differences in reported structures and that analysis of structure-function relationships is sensitive to the structure used. We show that the average residue solvent exposure in nearly complete structures is a good descriptor of open vs closed conformation states. Because of structural heterogeneity of functionally important surface-exposed residues, we recommend using averages of a group of high-quality protein structures rather than a single structure before reaching conclusions on specific structure-function relationships. To illustrate these points, we analyze some significant chemical tendencies of prominent S-protein mutations in the context of the available structures. In the discussion of new variants, we emphasize the selectivity of binding to ACE2 vs prominent antibodies rather than simply the antibody escape or ACE2 affinity separately. We note that larger chemical changes, in particular increased electrostatic charge or side-chain volume of exposed surface residues, are recurring in mutations of concern, plausibly related to adaptation to the negative surface potential of human ACE2. We also find indications that the fixated mutations of the S-protein in the main variants are less destabilizing than would be expected on average, possibly pointing toward a selection pressure on the S-protein. The richness of available structures for all of these situations provides an enormously valuable basis for future research into these structure-function relationships.
Affiliation(s)
- Rukmankesh Mehra
- Department of Chemistry, Indian Institute of Technology Bhilai, Sejbahar, Raipur 492015, Chhattisgarh, India
- Kasper P. Kepp
- DTU Chemistry, Technical University of Denmark, Building 206, 2800 Kongens Lyngby, Denmark
|
41
|
Mitusińska K, Wojsa P, Bzówka M, Raczyńska A, Bagrowska W, Samol A, Kapica P, Góra A. Structure-function relationship between soluble epoxide hydrolases structure and their tunnel network. Comput Struct Biotechnol J 2021; 20:193-205. [PMID: 35024092 PMCID: PMC8715294 DOI: 10.1016/j.csbj.2021.10.042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/21/2021] [Revised: 10/21/2021] [Accepted: 10/23/2021] [Indexed: 12/04/2022]
Abstract
Enzymes with buried active sites maintain their catalytic function via a single tunnel or tunnel network. In this study we analyzed the functionality of the soluble epoxide hydrolase (sEH) tunnel network by comparing the overall enzyme structure with the tunnel's shape and size. sEHs were divided into three groups based on their structure and tunnel usage. The obtained results were compared with the known substrate preferences of the studied enzymes, as well as with evolutionary analysis data reported in our other work. The tunnel network architecture corresponded well with the evolutionary lineage of the source organism, and large differences between enzymes resulted from insertions of long fragments. This strategy can be used during protein re-engineering to introduce large changes, whereas tunnel modification can be applied for fine-tuning of an enzyme.
Key Words
- CH65-EH, soluble epoxide hydrolase from an unknown source, sampled in hot springs in China
- Protein engineering
- Sibe-EH, soluble epoxide hydrolase from an unknown source, sampled in hot springs in Russia
- Soluble epoxide hydrolases
- StEH1, Solanum tuberosum soluble epoxide hydrolase
- Structure–function relationship
- TrEH, Trichoderma reesei soluble epoxide hydrolase
- Tunnel network
- VrEH2, Vigna radiata soluble epoxide hydrolase
- bmEH, Bacillus megaterium soluble epoxide hydrolase
- hsEH, Homo sapiens soluble epoxide hydrolase
- msEH, Mus musculus soluble epoxide hydrolase
- sEHs, soluble epoxide hydrolases
Affiliation(s)
- Karolina Mitusińska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Piotr Wojsa
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Maria Bzówka
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Agata Raczyńska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Weronika Bagrowska
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Aleksandra Samol
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Patryk Kapica
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
- Artur Góra
- Tunneling Group, Biotechnology Centre, Silesian University of Technology, Gliwice, Poland
|
42
|
Rahbar MR, Jahangiri A, Khalili S, Zarei M, Mehrabani-Zeinabad K, Khalesi B, Pourzardosht N, Hessami A, Nezafat N, Sadraei S, Negahdaripour M. Hotspots for mutations in the SARS-CoV-2 spike glycoprotein: a correspondence analysis. Sci Rep 2021; 11:23622. [PMID: 34880279 PMCID: PMC8654821 DOI: 10.1038/s41598-021-01655-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/24/2021] [Accepted: 11/01/2021] [Indexed: 12/19/2022]
Abstract
Spike glycoprotein (Sgp) is responsible for binding of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to the host receptors. Since Sgp is the main target for vaccine and drug design, elucidating its mutation pattern could help in this regard. This study is aimed at investigating the correspondence of specific residues to the functionality of the SARS-CoV-2 Sgp by explorative interpretation of sequence alignments. Centrality analysis of the Sgp dissects the importance of these residues in the interaction network of the receptor-binding domain (RBD)-ACE2 complex and furin cleavage site. Correspondence of RBD to threonine500 and asparagine501 and furin cleavage site to glutamine675, glutamine677, threonine678, and alanine684 was observed; all residues are exactly located at the interaction interfaces. The harmonious location of residues dictates the RBD binding property and the flexibility, hydrophobicity, and accessibility of the furin cleavage site. These species-specific residues can be assumed as real targets of evolution, while other substitutions tend to support them. Moreover, all these residues are parts of experimentally identified epitopes. Therefore, their substitution may affect vaccine efficacy. A higher rate of maintenance was predicted for the RBD than for the furin cleavage site. The accumulation of substitutions reinforces the probability of the multi-host circulation of the virus and emphasizes the enduring evolutionary events.
Affiliation(s)
- Mohammad Reza Rahbar
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Abolfazl Jahangiri
- Applied Microbiology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Saeed Khalili
- Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
- Mahboubeh Zarei
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Kamran Mehrabani-Zeinabad
- Department of Biostatistics, Faculty of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
- Bahman Khalesi
- Department of Research and Production of Poultry Viral Vaccine, Razi Vaccine, and Serum Research Institute, Agricultural Research Education and Extension Organization (AREEO), Karaj, Iran
- Navid Pourzardosht
- Cellular and Molecular Research Center, Faculty of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Biochemistry Department, Guilan University of Medical Sciences, Rasht, Iran
- Anahita Hessami
- School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Navid Nezafat
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Saman Sadraei
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Manica Negahdaripour
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, P.O. Box 71345-1583, Shiraz, Iran
|
43
|
Peng J, Svetec N, Zhao L. Intermolecular interactions drive protein adaptive and co-adaptive evolution at both species and population levels. Mol Biol Evol 2021; 39:6456312. [PMID: 34878126 PMCID: PMC8789070 DOI: 10.1093/molbev/msab350] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
Proteins are the building blocks for almost all the functions in cells. Understanding the molecular evolution of proteins and the forces that shape protein evolution is essential in understanding the basis of function and evolution. Previous studies have shown that adaptation frequently occurs at the protein surface, such as in genes involved in host–pathogen interactions. However, it remains unclear whether adaptive sites are distributed randomly or at regions associated with particular structural or functional characteristics across the genome, since many proteins lack structural or functional annotations. Here, we seek to tackle this question by combining large-scale bioinformatic prediction, structural analysis, phylogenetic inference, and population genomic analysis of Drosophila protein-coding genes. We found that protein sequence adaptation is more relevant to function-related rather than structure-related properties. Interestingly, intermolecular interactions contribute significantly to protein adaptation. We further showed that intermolecular interactions, such as physical interactions, may play a role in the coadaptation of fast-adaptive proteins. We found that strongly differentiated amino acids across geographic regions in protein-coding genes are mostly adaptive, which may contribute to the long-term adaptive evolution. This strongly indicates that a number of adaptive sites tend to be repeatedly mutated and selected throughout evolution in the past, present, and maybe future. Our results highlight the important roles of intermolecular interactions and coadaptation in the adaptive evolution of proteins both at the species and population levels.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
|
44
|
Castiglione GM, Zhou L, Xu Z, Neiman Z, Hung CF, Duh EJ. Evolutionary pathways to SARS-CoV-2 resistance are opened and closed by epistasis acting on ACE2. PLoS Biol 2021; 19:e3001510. [PMID: 34932561 PMCID: PMC8730403 DOI: 10.1371/journal.pbio.3001510] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/25/2021] [Revised: 01/05/2022] [Accepted: 12/08/2021] [Indexed: 02/06/2023]
Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infects a broader range of mammalian species than previously predicted, binding a diversity of angiotensin converting enzyme 2 (ACE2) orthologs despite extensive sequence divergence. Within this sequence degeneracy, we identify a rare sequence combination capable of conferring SARS-CoV-2 resistance. We demonstrate that this sequence was likely unattainable during human evolution due to deleterious effects on ACE2 carboxypeptidase activity, which has vasodilatory and cardioprotective functions in vivo. Across the 25 ACE2 sites implicated in viral binding, we identify 6 amino acid substitutions unique to mouse, one of the only known mammalian species resistant to SARS-CoV-2. Substituting human variants at these positions is sufficient to confer binding of the SARS-CoV-2 S protein to mouse ACE2, facilitating cellular infection. Conversely, substituting mouse variants into either human or dog ACE2 abolishes viral binding, diminishing cellular infection. However, these same substitutions decrease human ACE2 activity by 50% and are predicted as pathogenic, consistent with the extreme rarity of human polymorphisms at these sites. This trade-off can be avoided, however, depending on genetic background; if substituted simultaneously, these same mutations have no deleterious effect on dog ACE2 nor that of the rodent ancestor estimated to exist 70 million years ago. This genetic contingency (epistasis) may have therefore opened the road to resistance for some species, while making humans susceptible to viruses that use these ACE2 surfaces for binding, as does SARS-CoV-2.
Affiliation(s)
- Gianni M. Castiglione
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Lingli Zhou
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Zhenhua Xu
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Zachary Neiman
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Chien-Fu Hung
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Elia J. Duh
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
|
45
|
Anderson BW, Fung DK, Wang JD. Regulatory Themes and Variations by the Stress-Signaling Nucleotide Alarmones (p)ppGpp in Bacteria. Annu Rev Genet 2021; 55:115-133. [PMID: 34416118 DOI: 10.1146/annurev-genet-021821-025827] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 11/09/2022]
Abstract
Bacterial stress-signaling alarmones are important components of a protective network against diverse stresses such as nutrient starvation and antibiotic assault. pppGpp and ppGpp, collectively (p)ppGpp, have well-documented regulatory roles in gene expression and protein translation. Recent work has highlighted another key function of (p)ppGpp: inducing rapid and coordinated changes in cellular metabolism by regulating enzymatic activities, especially those involved in purine nucleotide synthesis. Failure of metabolic regulation by (p)ppGpp results in the loss of coordination between metabolic and macromolecular processes, leading to cellular toxicity. In this review, we document how (p)ppGpp and newly characterized nucleotides pGpp and (p)ppApp directly regulate these enzymatic targets for metabolic remodeling. We examine targets' common determinants for alarmone interaction as well as their evolutionary diversification. We highlight classical and emerging themes in nucleotide signaling, including oligomerization and allostery along with metabolic interconversion and crosstalk, illustrating how they allow optimized bacterial adaptation to their environmental niches.
Affiliation(s)
- Brent W Anderson
- Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Danny K Fung
- Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Jue D Wang
- Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
46
|
Learning the local landscape of protein structures with convolutional neural networks. J Biol Phys 2021; 47:435-454. [PMID: 34751854 DOI: 10.1007/s10867-021-09593-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/17/2021] [Accepted: 10/18/2021] [Indexed: 10/19/2022] Open
Abstract
One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding site of interest. We find that the network can predict wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely going to be correct or not. Predictions of consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and likely targets for protein engineering.
|
47
|
Ali NF, Paracha RZ, Tahir M. In silico evaluation of molecular virus-virus interactions taking place between Cotton leaf curl Kokhran virus- Burewala strain and Tomato leaf curl New Delhi virus. PeerJ 2021; 9:e12018. [PMID: 34721952 PMCID: PMC8532979 DOI: 10.7717/peerj.12018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Accepted: 07/29/2021] [Indexed: 11/20/2022] Open
Abstract
Background: Cotton leaf curl disease (CLCuD) is a disease of cotton caused by begomoviruses, leading to a drastic loss in the annual yield of the crop. Pakistan has suffered two epidemics of this disease, leading to the loss of billions in annual exports. The speculation that a third epidemic of CLCuD may result as a consequence of the frequent occurrence of Tomato leaf curl New Delhi virus (ToLCNDV) and Cotton leaf curl Kokhran Virus-Burewala Strain (CLCuKoV-Bu) in CLCuD-infected samples demands that the interactions taking place between the two viruses be properly evaluated. This study is designed to assess virus-virus interactions at the molecular level and determine the type of co-infection taking place. Methods: Based on the amino acid sequences of the gene products of both CLCuKoV-Bu and ToLCNDV, protein structures were generated using different software, i.e., MODELLER, I-TASSER, QUARKS, LOMETS and RAPTORX. A consensus model for each protein was selected after model quality assessment using ERRAT, QMEANDisCo, PROCHECK Z-Score and Ramachandran plot analysis. The active and passive residues in the protein structures were identified using the CPORT server. Protein–Protein Docking was done using the HADDOCK webserver, and 169 Protein–Protein Interactions (PPIs) were performed between the proteins of the two viruses. The docked complexes were submitted to the PRODIGY server to identify the interacting residues between the complexes. The strongest interactions were determined based on the HADDOCK Score, Desolvation Energy, Van der Waals Energy, Restraint Violation Energy, Electrostatic Energy, Buried Surface Area, Binding Affinity and Dissociation constant (Kd). A total of 50 ns of Molecular Dynamics simulation was performed on complexes that exhibited the strongest affinity in order to validate the stability of the complexes and to remove any steric hindrances that may exist within the structures.
Results: Our results indicate significant interactions taking place between the proteins of the two viruses. Out of all the interactions, the strongest were observed between the Replication Initiation protein (Rep) of CLCuKoV-Bu and the Movement protein (MP) and Nuclear Shuttle Protein (NSP) of ToLCNDV (DNA-B), while the weakest were seen between the Replication Enhancer protein (REn) of CLCuKoV-Bu and the REn protein of ToLCNDV. The residues identified as taking part in the interactions belonged to domains having a pivotal role in the viral life cycle and pathogenicity. It may be deduced that the two viruses exhibit antagonistic behavior towards each other, and the type of infection may be categorised as a type of Super Infection Exclusion (SIE) or homologous interference. However, further experimentation, in the form of transient expression analysis, is needed to confirm the nature of these interactions and increase our understanding of the direct interactions taking place between the two viruses.
Affiliation(s)
- Nida Fatima Ali
- Department of Plant Biotechnology, Atta-ur-Rahman School of Applied Biosciences (ASAB), National University of Sciences and Technology, Islamabad, Federal, Pakistan
- Rehan Zafar Paracha
- Research Center for Modeling and Simulation (RCMS), National University of Sciences and Technology, Islamabad, Federal, Pakistan
- Muhammad Tahir
- Department of Plant Biotechnology, Atta-ur-Rahman School of Applied Biosciences (ASAB), National University of Sciences and Technology, Islamabad, Federal, Pakistan
|
48
|
Echave J. Evolutionary coupling range varies widely among enzymes depending on selection pressure. Biophys J 2021; 120:4320-4324. [PMID: 34480927 DOI: 10.1016/j.bpj.2021.08.042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Revised: 07/19/2021] [Accepted: 08/30/2021] [Indexed: 10/20/2022] Open
Abstract
Recent studies proposed that enzyme-active sites induce evolutionary constraints at long distances. The physical origin of such long-range evolutionary coupling is unknown. Here, I use a recent biophysical model of evolution to study the relationship between physical and evolutionary couplings on a diverse data set of monomeric enzymes. I show that evolutionary coupling is not universally long-range. Rather, range varies widely among enzymes, from 2 to 20 Å. Furthermore, the evolutionary coupling range of an enzyme does not inform on the underlying physical coupling, which is short range for all enzymes. Rather, evolutionary coupling range is determined by functional selection pressure.
Affiliation(s)
- Julian Echave
- Instituto de Ciencias Físicas, Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
|
49
|
Wangchuk J, Chatterjee A, Patil S, Madugula SK, Kondabagil K. The coevolution of large and small terminases of bacteriophages is a result of purifying selection leading to phenotypic stabilization. Virology 2021; 564:13-25. [PMID: 34598064 DOI: 10.1016/j.virol.2021.09.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Revised: 09/14/2021] [Accepted: 09/14/2021] [Indexed: 10/20/2022]
Abstract
Genome packaging in many dsDNA phages requires a series of precisely coordinated actions of two phage-coded proteins, namely, large terminase (TerL) and small terminase (TerS), with DNA and ATP, and with each other. Despite the strict functional conservation, TerL and TerS homologs exhibit large sequence variations. We investigated the sequence variability across eight phage types and observed a coevolutionary framework wherein the genealogy of TerL homologs mirrored that of the corresponding TerS homologs. Furthermore, the strong purifying selection observed (dN/dS ≪ 1) indicated strong structural constraints on both TerL and TerS, and we identified coevolving residues in TerL and TerS of phages T4 and lambda. Using the highly coevolving (correlation coefficient of 0.99) TerL and TerS of phage N4, we show that their biochemical features are similar to those of the phylogenetically divergent phage λ terminases. We also demonstrate, using the Surface Plasmon Resonance (SPR) technique, that phage N4 TerL transiently interacts with TerS.
Affiliation(s)
- Jigme Wangchuk
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Anirvan Chatterjee
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Supriya Patil
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Santhosh Kumar Madugula
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Kiran Kondabagil
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
|
50
|
Lu H, Li F, Yuan L, Domenzain I, Yu R, Wang H, Li G, Chen Y, Ji B, Kerkhoven EJ, Nielsen J. Yeast metabolic innovations emerged via expanded metabolic network and gene positive selection. Mol Syst Biol 2021; 17:e10427. [PMID: 34676984 PMCID: PMC8532513 DOI: 10.15252/msb.202110427] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/01/2021] [Revised: 10/02/2021] [Accepted: 10/04/2021] [Indexed: 12/24/2022] Open
Abstract
Yeasts are known to have versatile metabolic traits, but how these traits evolved has not been systematically elucidated. We performed integrative evolution analysis to investigate how genomic evolution determines trait generation by reconstructing genome-scale metabolic models (GEMs) for 332 yeasts. These GEMs could comprehensively characterize trait diversity and predict enzyme functionality, thereby signifying that sequence-level evolution has shaped reaction networks towards new metabolic functions. Strikingly, using GEMs, we can mechanistically map different evolutionary events, e.g. horizontal gene transfer and gene duplication, onto relevant subpathways to explain metabolic plasticity. This demonstrates that gene family expansion and enzyme promiscuity are prominent mechanisms for metabolic trait gains, while GEM simulations reveal that additional factors, such as gene loss from distant pathways, contribute to trait losses. Furthermore, our analysis could pinpoint specific genes and pathways that have been under positive selection and are relevant for the formation of complex metabolic traits, i.e. thermotolerance and the Crabtree effect. Our findings illustrate how multidimensional evolution in both metabolic network structure and individual enzymes drives phenotypic variations.
Affiliation(s)
- Hongzhong Lu
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Feiran Li
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Le Yuan
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Iván Domenzain
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Rosemary Yu
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Hao Wang
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Chalmers University of Technology, Gothenburg, Sweden
- Gang Li
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Yu Chen
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Boyang Ji
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- Eduard J Kerkhoven
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Jens Nielsen
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
- BioInnovation Institute, Copenhagen N, Denmark
|