1
Xu T, Wang S, Ma T, Dong Y, Ashby CR, Hao GF. The identification of essential cellular genes is critical for validating drug targets. Drug Discov Today 2024; 29:104215. [PMID: 39428084 DOI: 10.1016/j.drudis.2024.104215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 10/06/2024] [Accepted: 10/15/2024] [Indexed: 10/22/2024]
Abstract
Accurately identifying biological targets is crucial for advancing treatment options. Essential genes, vital for cell or organism survival, hold promise as potential drug targets in disease treatment. Although many studies have sought to identify essential genes as therapeutic targets in medicine and bioinformatics, systematic reviews on their relationship with drug targets are relatively rare. This work presents a comprehensive analysis to aid in identifying essential genes as potential targets for drug discovery, encompassing their relevance, identification methods, successful case studies, and challenges. This work will facilitate the identification of essential genes as therapeutic targets, thereby boosting new drug development.
Affiliation(s)
- Ting Xu
- School of Pharmaceutical Sciences, Guizhou Engineering Laboratory for Synthetic Drugs, Guizhou University, Guiyang 550025, China
- Shuang Wang
- State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, China
- Tingting Ma
- School of Pharmaceutical Sciences, Guizhou Engineering Laboratory for Synthetic Drugs, Guizhou University, Guiyang 550025, China
- Yawen Dong
- School of Pharmaceutical Sciences, Guizhou Engineering Laboratory for Synthetic Drugs, Guizhou University, Guiyang 550025, China.
- Charles R Ashby
- Department of Pharmaceutical Sciences, St. John's University, New York, NY, USA.
- Ge-Fei Hao
- State Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for R&D of Fine Chemicals, Guizhou University, Guiyang 550025, China.
2
de Jong MJ, van Oosterhout C, Hoelzel AR, Janke A. Moderating the neutralist-selectionist debate: exactly which propositions are we debating, and which arguments are valid? Biol Rev Camb Philos Soc 2024; 99:23-55. [PMID: 37621151 DOI: 10.1111/brv.13010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 08/04/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Half a century after its foundation, the neutral theory of molecular evolution continues to attract controversy. The debate has been hampered by the coexistence of different interpretations of the core proposition of the neutral theory, the 'neutral mutation-random drift' hypothesis. In this review, we trace the origins of these ambiguities and suggest potential solutions. We highlight the difference between the original, the revised and the nearly neutral hypothesis, and re-emphasise that none of them equates to the null hypothesis of strict neutrality. We distinguish the neutral hypothesis of protein evolution, the main focus of the ongoing debate, from the neutral hypotheses of genomic and functional DNA evolution, which for many species are generally accepted. We advocate a further distinction between a narrow and an extended neutral hypothesis (of which the latter posits that random non-conservative amino acid substitutions can cause non-ecological phenotypic divergence), and we discuss the implications for evolutionary biology beyond the domain of molecular evolution. We furthermore point out that the debate has widened from its initial focus on point mutations, and also concerns the fitness effects of large-scale mutations, which can alter the dosage of genes and regulatory sequences. We evaluate the validity of neutralist and selectionist arguments and find that the tested predictions, apart from being sensitive to violation of underlying assumptions, are often derived from the null hypothesis of strict neutrality, or equally consistent with the opposing selectionist hypothesis, except when assuming molecular panselectionism. Our review aims to facilitate a constructive neutralist-selectionist debate, and thereby to contribute to answering a key question of evolutionary biology: what proportions of amino acid and nucleotide substitutions and polymorphisms are adaptive?
Affiliation(s)
- Menno J de Jong
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Cock van Oosterhout
- Centre for Ecology, Evolution and Conservation, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
- A Rus Hoelzel
- Department of Biosciences, Durham University, South Road, Durham, DH1 3LE, UK
- Axel Janke
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Institute for Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Strasse 9, Frankfurt am Main, 60438, Germany
- LOEWE-Centre for Translational Biodiversity Genomics (TBG), Senckenberg Nature Research Society, Georg-Voigt-Straße 14-16, Frankfurt am Main, 60325, Germany
3
Zhang J. What Has Genomics Taught An Evolutionary Biologist? Genomics Proteomics Bioinformatics 2023; 21:1-12. [PMID: 36720382 PMCID: PMC10373158 DOI: 10.1016/j.gpb.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 01/06/2023] [Accepted: 01/19/2023] [Indexed: 01/30/2023]
Abstract
Genomics, an interdisciplinary field of biology on the structure, function, and evolution of genomes, has revolutionized many subdisciplines of life sciences, including my field of evolutionary biology, by supplying huge data, bringing high-throughput technologies, and offering a new approach to biology. In this review, I describe what I have learned from genomics and highlight the fundamental knowledge and mechanistic insights gained. I focus on three broad topics that are central to evolutionary biology and beyond (variation, interaction, and selection) and use primarily my own research and study subjects as examples. In the next decade or two, I expect that the most important contributions of genomics to evolutionary biology will be to provide genome sequences of nearly all known species on Earth, facilitate high-throughput phenotyping of natural variants and systematically constructed mutants for mapping genotype-phenotype-fitness landscapes, and assist the determination of causality in evolutionary processes using experimental evolution.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA.
4
Bittner NKJ, Mack KL, Nachman MW. Shared Patterns of Gene Expression and Protein Evolution Associated with Adaptation to Desert Environments in Rodents. Genome Biol Evol 2022; 14:evac155. [PMID: 36268582 PMCID: PMC9648513 DOI: 10.1093/gbe/evac155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/17/2022] [Indexed: 01/18/2023]
Abstract
Desert specialization has arisen multiple times across rodents and is often associated with a suite of convergent phenotypes, including modification of the kidneys to mitigate water loss. However, the extent to which phenotypic convergence in desert rodents is mirrored at the molecular level is unknown. Here, we sequenced kidney mRNA and assembled transcriptomes for three pairs of rodent species to search for shared differences in gene expression and amino acid sequence associated with adaptation to deserts. We conducted phylogenetically independent comparisons between a desert specialist and a non-desert relative in three families representing ∼70 million years of evolution. Overall, patterns of gene expression faithfully recapitulated the phylogeny of these six taxa providing a strong evolutionary signal in levels of mRNA abundance. We also found that 8.6% of all genes showed shared patterns of expression divergence between desert and non-desert taxa, much of which likely reflects convergent evolution, and representing more than expected by chance under a model of independent gene evolution. In addition to these shared changes, we observed many species-pair-specific changes in gene expression indicating that instances of adaptation to deserts include a combination of unique and shared changes. Patterns of protein evolution revealed a small number of genes showing evidence of positive selection, the majority of which did not show shared changes in gene expression. Overall, our results suggest that convergent changes in gene regulation play an important role in the complex trait of desert adaptation in rodents.
Affiliation(s)
- Noëlle K J Bittner
- Department of Integrative Biology and Museum of Vertebrate Zoology, 3101 Valley Life Sciences Building, University of California Berkeley, California 94720
- Katya L Mack
- Department of Integrative Biology and Museum of Vertebrate Zoology, 3101 Valley Life Sciences Building, University of California Berkeley, California 94720
- Michael W Nachman
- Department of Integrative Biology and Museum of Vertebrate Zoology, 3101 Valley Life Sciences Building, University of California Berkeley, California 94720
5
Li Y, Zeng M, Wu Y, Li Y, Li M. Accurate Prediction of Human Essential Proteins Using Ensemble Deep Learning. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3263-3271. [PMID: 34699365 DOI: 10.1109/tcbb.2021.3122294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Essential proteins are considered the foundation of life as they are indispensable for the survival of living organisms. Computational methods for essential protein discovery provide a fast way to identify essential proteins. But most of them heavily rely on various biological information, especially protein-protein interaction networks, which limits their practical applications. With the rapid development of high-throughput sequencing technology, sequencing data has become the most accessible biological data. However, using only protein sequence information to predict essential proteins has limited accuracy. In this paper, we propose EP-EDL, an ensemble deep learning model using only protein sequence information to predict human essential proteins. EP-EDL integrates multiple classifiers to alleviate the class imbalance problem and to improve prediction accuracy and robustness. In each base classifier, we employ multi-scale text convolutional neural networks to extract useful features from protein sequence feature matrices with evolutionary information. Our computational results show that EP-EDL outperforms the state-of-the-art sequence-based methods. Furthermore, EP-EDL provides a more practical and flexible way for biologists to accurately predict essential proteins. The source code and datasets can be downloaded from https://github.com/CSUBioGroup/EP-EDL.
6
Sarkar I, Dey P, Rathore SS, Singh GD, Singh RP. Global genomic and proteomic analysis indicates co-evolution of Neisseria species and with their human host. World J Microbiol Biotechnol 2022; 38:149. [PMID: 35773545 DOI: 10.1007/s11274-022-03338-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 06/11/2022] [Indexed: 11/30/2022]
Abstract
Neisseria, a genus from the beta-proteobacteria class, is of potential clinical importance. This genus contains both pathogenic and commensal strains. Gonorrhea and meningitis are two major diseases caused by pathogens belonging to this genus. With the increased use of antimicrobial agents against these pathogens they have evolved the antimicrobial resistance capacity making these diseases nearly untreatable. The set of anti-bacterial resistance genes (resistome) and genes associated with signal processing (secretomes) are crucial for the host-microbial interaction. With the virtue of whole-genome sequences and computational biology, it is now possible to study the genomic and proteomic riddles of Neisseria along with their comprehensive evolutionary and metabolic profiling. We have studied relative synonymous codon usage, amino acid usage, reverse ecology, comparative genomics, evolutionary analysis and pathogen-host (Neisseria-human) interaction through bioinformatics analysis. Our analysis revealed the co-evolution of Neisseria genomes with the human host. Moreover, the co-occurrence of Neisseria and humans has been supported through reverse ecology analysis. A differential pattern of the evolutionary rate of resistomes and secretomes was evident among the pathogenic and commensal strains. Comparative genomics supported the presence of virulent genes in both pathogenic and commensal strains of the select genus. Our analysis also indicated a transition from commensal to pathogenic Neisseria strains through the long run of evolution.
Affiliation(s)
- Indrani Sarkar
- Salim Ali Centre for Ornithology and Natural History, Anaikatty, Coimbatore, Tamil Nadu, 641 108, India
- Prateek Dey
- Salim Ali Centre for Ornithology and Natural History, Anaikatty, Coimbatore, Tamil Nadu, 641 108, India
- Ram Pratap Singh
- Department of Life Science, Central University of South Bihar, Gaya, Bihar, 824236, India.
7
Ghimire N, Kim B, Lee CM, Oh TJ. Comparative genome analysis among Variovorax species and genome guided aromatic compound degradation analysis emphasizing 4-hydroxybenzoate degradation in Variovorax sp. PAMC26660. BMC Genomics 2022; 23:375. [PMID: 35585492 PMCID: PMC9115942 DOI: 10.1186/s12864-022-08589-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/13/2022] [Accepted: 04/25/2022] [Indexed: 11/29/2022]
Abstract
Background: While the genus Variovorax is known for its aromatic compound metabolism, no detailed study of the peripheral and central pathways of aromatic compound degradation has yet been reported. Variovorax sp. PAMC26660 is a lichen-associated bacterium isolated from Antarctica. The work presents the genome-based elucidation of peripheral and central catabolic pathways of aromatic compound degradation genes in Variovorax sp. PAMC26660. Additionally, the accessory, core and unique genes were identified among Variovorax species using the pan genome analysis tool. A detailed analysis of the genes related to xenobiotic metabolism revealed the potential roles of Variovorax sp. PAMC26660 and other species in bioremediation. Results: TYGS analysis, dDDH, phylogenetic placement and average nucleotide identity (ANI) analysis identified the strain as Variovorax sp. Cell morphology was assessed using scanning electron microscopy (SEM). On analysis of the core, accessory, and unique genes, xenobiotic metabolism accounted only for the accessory and unique genes. On detailed analysis of the aromatic compound catabolic genes, peripheral pathway related to 4-hydroxybenzoate (4-HB) degradation was found among all species while phenylacetate and tyrosine degradation pathways were present in most of the species including PAMC26660. Likewise, central catabolic pathways, like protocatechuate, gentisate, homogentisate, and phenylacetyl-CoA, were also present. The peripheral pathway for 4-HB degradation was functionally tested using PAMC26660, which resulted in the growth using it as a sole source of carbon. Conclusions: Computational tools for genome and pan genome analysis are important to understand the behavior of an organism. Xenobiotic metabolism-related genes, that only account for the accessory and unique genes infer evolution through events like lateral gene transfer, mutation and gene rearrangement. 4-HB, an aromatic compound present among lichen species is utilized by lichen-associated Variovorax sp. PAMC26660 as the sole source of carbon. The strain holds genes and pathways for its utilization. Overall, this study outlines the importance of Variovorax in bioremediation and presents the genomic information of the species. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08589-3.
Affiliation(s)
- Nisha Ghimire
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, 31460, Korea
- Byeollee Kim
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, 31460, Korea
- Chang-Muk Lee
- Agricultural Microbiology Division, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju, 55365, Korea
- Tae-Jin Oh
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, 31460, Korea
- Genome-based BioIT Convergence Institute, Asan, 31460, Korea
- Department of Pharmaceutical Engineering and Biotechnology, SunMoon University, Asan, 31460, South Korea
8
Fields PD, McTaggart S, Reisser CMO, Haag C, Palmer WH, Little TJ, Ebert D, Obbard DJ. Population-genomic analysis identifies a low rate of global adaptive fixation in the proteins of the cyclical parthenogen Daphnia magna. Mol Biol Evol 2022; 39:6542319. [PMID: 35244177 PMCID: PMC8963301 DOI: 10.1093/molbev/msac048] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/26/2022]
Abstract
Daphnia are well-established ecological and evolutionary models, and the interaction between D. magna and its microparasites is widely considered a paragon of the host-parasite coevolutionary process. Like other well-studied arthropods such as Drosophila melanogaster and Anopheles gambiae, D. magna is a small, widespread, and abundant species that is therefore expected to display a large long-term population size and high rates of adaptive protein evolution. However, unlike these other species, D. magna is cyclically asexual and lives in a highly structured environment (ponds and lakes) with moderate levels of dispersal, both of which are predicted to impact upon long-term effective population size and adaptive protein evolution. To investigate patterns of adaptive protein fixation, we produced the complete coding genomes of 36 D. magna clones sampled from across the European range (Western Palaearctic), along with draft sequences for the close relatives D. similis and D. lumholtzi, used as outgroups. We analyzed genome-wide patterns of adaptive fixation, with a particular focus on genes that have an a priori expectation of high rates, such as those likely to mediate immune responses, RNA interference against viruses and transposable elements, and those with a strongly male-biased expression pattern. We find that, as expected, D. magna displays high levels of diversity and that this is highly structured among populations. However, compared with Drosophila, we find that D. magna proteins appear to have a high proportion of weakly deleterious variants and do not show evidence of pervasive adaptive fixation across its entire range. This is true of the genome as a whole, and also of putative ‘arms race’ genes that often show elevated levels of adaptive substitution in other species. 
In addition to the likely impact of extensive, and previously documented, local adaptation, we speculate that these findings may reflect reduced efficacy of selection associated with cyclical asexual reproduction.
Affiliation(s)
- Peter D Fields
- University of Basel, Department of Environmental Sciences, Zoology, Vesalgasse 1, Basel, CH-4051, Switzerland
- Seanna McTaggart
- Institute of Evolutionary Biology; School of Biological Sciences University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom
- Céline M O Reisser
- Centre d'Ecologie Fonctionnelle et Evolutive CEFE UMR 5175, Univ Montpellier, CNRS, EPHE, IRD, Univ Paul Valéry Montpellier 3, campus CNRS, 1919, route de Mende, 34293 Montpellier Cedex 5, France
- MARBEC, Univ Montpellier, CNRS, IFREMER, IRD, Montpellier, France
- Christoph Haag
- Centre d'Ecologie Fonctionnelle et Evolutive CEFE UMR 5175, Univ Montpellier, CNRS, EPHE, IRD, Univ Paul Valéry Montpellier 3, campus CNRS, 1919, route de Mende, 34293 Montpellier Cedex 5, France
- William H Palmer
- Institute of Evolutionary Biology; School of Biological Sciences University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom
- Tom J Little
- Institute of Evolutionary Biology; School of Biological Sciences University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom
- Dieter Ebert
- University of Basel, Department of Environmental Sciences, Zoology, Vesalgasse 1, Basel, CH-4051, Switzerland
- Darren J Obbard
- Institute of Evolutionary Biology; School of Biological Sciences University of Edinburgh, Edinburgh, EH9 3JT, United Kingdom
9
Determination of the Amino Acid Recruitment Order in Early Life by Genome-Wide Analysis of Amino Acid Usage Bias. Biomolecules 2022; 12:biom12020171. [PMID: 35204672 PMCID: PMC8961565 DOI: 10.3390/biom12020171] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/21/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 12/11/2022]
Abstract
The mechanisms shaping the amino acids recruitment pattern into the proteins in the early life history presently remains a huge mystery. In this study, we conducted genome-wide analyses of amino acids usage and genetic codons structure in 7270 species across three domains of life. The carried-out analyses evidenced ubiquitous usage bias of amino acids that were likely independent from codon usage bias. Taking advantage of codon usage bias, we performed pseudotime analysis to re-determine the chronological order of the species emergence, which inspired a new species relationship by tracing the imprint of codon usage evolution. Furthermore, the multidimensional data integration showed that the amino acids A, D, E, G, L, P, R, S, T and V might be the first recruited into the last universal common ancestry (LUCA) proteins. The data analysis also indicated that the remaining amino acids most probably were gradually incorporated into proteogenesis process in the course of two long-timescale parallel evolutionary routes: I→F→Y→C→M→W and K→N→Q→H. This study provides new insight into the origin of life, particularly in terms of the basic protein composition of early life. Our work provides crucial information that will help in a further understanding of protein structure and function in relation to their evolutionary history.
10
Soni V, Eyre-Walker A. OUP accepted manuscript. Genome Biol Evol 2022; 14:6528851. [PMID: 35166775 PMCID: PMC8882387 DOI: 10.1093/gbe/evac028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/09/2022] [Indexed: 12/05/2022]
Abstract
The rate of amino acid substitution has been shown to be correlated to a number of factors including the rate of recombination, the age of the gene, the length of the protein, mean expression level, and gene function. However, the extent to which these correlations are due to adaptive and nonadaptive evolution has not been studied in detail, at least not in hominids. We find that the rate of adaptive evolution is significantly positively correlated to the rate of recombination, protein length and gene expression level, and negatively correlated to gene age. These correlations remain significant when each factor is controlled for in turn, except when controlling for expression in an analysis of protein length; and they also generally remain significant when biased gene conversion is taken into account. However, the positive correlations could be an artifact of population size contraction. We also find that the rate of nonadaptive evolution is negatively correlated to each factor, and all these correlations survive controlling for each other and biased gene conversion. Finally, we examine the effect of gene function on rates of adaptive and nonadaptive evolution; we confirm that virus-interacting proteins (VIPs) have higher rates of adaptive and lower rates of nonadaptive evolution, but we also demonstrate that there is significant variation in the rate of adaptive and nonadaptive evolution between GO categories when removing VIPs. We estimate that the VIP/non-VIP axis explains about 5–8 fold more of the variance in evolutionary rate than GO categories.
Affiliation(s)
- Vivak Soni
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
11
Campos TL, Korhonen PK, Hofmann A, Gasser RB, Young ND. Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes - Biotechnological implications. Biotechnol Adv 2021; 54:107822. [PMID: 34461202 DOI: 10.1016/j.biotechadv.2021.107822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/09/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the 'elegant worm' (Caenorhabditis elegans; Nematoda) and the 'vinegar fly' (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available 'omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could 'open the door' to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms. 
This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations.
Affiliation(s)
- Tulio L Campos
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia; Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife, Pernambuco, Brazil
- Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Andreas Hofmann
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia.
- Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia.
12
Biesiadecka MK, Sliwa P, Tomala K, Korona R. An Overexpression Experiment Does Not Support the Hypothesis That Avoidance of Toxicity Determines the Rate of Protein Evolution. Genome Biol Evol 2021; 12:589-596. [PMID: 32259256 PMCID: PMC7250497 DOI: 10.1093/gbe/evaa067] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 04/01/2020] [Indexed: 12/22/2022]
Abstract
The misfolding avoidance hypothesis postulates that sequence mutations render proteins cytotoxic and therefore the higher the gene expression, the stronger the operation of selection against substitutions. This translates into prediction that relative toxicity of extant proteins is higher for those evolving faster. In the present experiment, we selected pairs of yeast genes which were paralogous but evolving at different rates. We expressed them artificially to high levels. We expected that toxicity would be higher for ones bearing more mutations, especially that overcrowding should rather exacerbate than reverse the already existing differences in misfolding rates. We did find that the applied mode of overexpression caused a considerable decrease in fitness and that the decrease was proportional to the amount of excessive protein. However, it was not higher for proteins which are normally expressed at lower levels (and have less conserved sequence). This result was obtained consistently, regardless whether the rate of growth or ability to compete in common cultures was used as a proxy for fitness. In additional experiments, we applied factors that reduce accuracy of translation or enhance structural instability of proteins. It did not change a consistent pattern of independence between the fitness cost caused by overexpression of a protein and the rate of its sequence evolution.
Affiliation(s)
- Piotr Sliwa
- Department of Genetics, Faculty of Biotechnology, University of Rzeszów, Poland
- Katarzyna Tomala
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Ryszard Korona
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
13
|
Dubreuil B, Levy ED. Abundance Imparts Evolutionary Constraints of Similar Magnitude on the Buried, Surface, and Disordered Regions of Proteins. Front Mol Biosci 2021; 8:626729. [PMID: 33996892 PMCID: PMC8119896 DOI: 10.3389/fmolb.2021.626729] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/06/2020] [Accepted: 03/29/2021] [Indexed: 12/02/2022]
Abstract
An understanding of the forces shaping protein conservation is key, both for the fundamental knowledge it represents and to allow for optimal use of evolutionary information in practical applications. Sequence conservation is typically examined at one of two levels. The first is a residue-level, where intra-protein differences are analyzed and the second is a protein-level, where inter-protein differences are studied. At a residue level, we know that solvent-accessibility is a prime determinant of conservation. By inverting this logic, we inferred that disordered regions are slightly more solvent-accessible on average than the most exposed surface residues in domains. By integrating abundance information with evolutionary data within and across proteins, we confirmed a previously reported strong surface-core association in the evolution of structured regions, but we found a comparatively weak association between disordered and structured regions. The facts that disordered and structured regions experience different structural constraints and evolve independently provide a unique setup to examine an outstanding question: why is a protein’s abundance the main determinant of its sequence conservation? Indeed, any structural or biophysical property linked to the abundance-conservation relationship should increase the relative conservation of regions concerned with that property (e.g., disordered residues with mis-interactions, domain residues with misfolding). Surprisingly, however, we found the conservation of disordered and structured regions to increase in equal proportion with abundance. This observation implies that either abundance-related constraints are structure-independent, or multiple constraints apply to different regions and perfectly balance each other.
Affiliation(s)
- Benjamin Dubreuil
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Emmanuel D Levy
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
14
Dilucca M, Cimini G, Giansanti A. Bacterial Protein Interaction Networks: Connectivity is Ruled by Gene Conservation, Essentiality and Function. Curr Genomics 2021; 22:111-121. [PMID: 34220298 PMCID: PMC8188579 DOI: 10.2174/1389202922666210219110831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2020] [Revised: 08/13/2020] [Accepted: 08/27/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Protein-protein interaction (PPI) networks are the backbone of all processes in living cells. In this work, we relate conservation, essentiality and functional repertoire of a gene to the connectivity k (i.e. the number of interactions, links) of the corresponding protein in the PPI network. METHODS: On a set of 42 bacterial genomes of different sizes, and with reasonably separated evolutionary trajectories, we investigate three issues: i) whether the distribution of connectivities changes between PPI subnetworks of essential and nonessential genes; ii) how gene conservation, measured both by the evolutionary retention index (ERI) and by evolutionary pressures, is related to the connectivity of the corresponding protein; iii) how PPI connectivities are modulated by evolutionary and functional relationships, as represented by the Clusters of Orthologous Genes (COGs). RESULTS: We show that conservation, essentiality and functional specialisation of genes constrain the connectivity of the corresponding proteins in bacterial PPI networks. In particular, we isolated a core of highly connected proteins (connectivities k≥40), which is ubiquitous among the species considered here, though mostly visible in the degree distributions of bacteria with small genomes (less than 1000 genes). CONCLUSION: The genes that support this highly connected core are conserved, essential and, in most cases, belong to the COG cluster J, related to ribosomal functions and the processing of genetic information.
Affiliation(s)
- Maddalena Dilucca
  - Dipartimento di Fisica, Sapienza University of Rome, 00185, Rome, Italy
- Giulio Cimini
  - Dipartimento di Fisica, Tor Vergata University of Rome, 00133, Rome, Italy; Istituto dei Sistemi Complessi CNR UoS, Rome, Italy
- Andrea Giansanti
  - Dipartimento di Fisica, Sapienza University of Rome, 00185, Rome, Italy; INFN Roma1 Unit, Rome, Italy
15
Usmanova DR, Plata G, Vitkup D. The Relationship between the Misfolding Avoidance Hypothesis and Protein Evolutionary Rates in the Light of Empirical Evidence. Genome Biol Evol 2021; 13:6081017. [PMID: 33432359 PMCID: PMC7874998 DOI: 10.1093/gbe/evab006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 01/07/2021] [Indexed: 12/14/2022]
Abstract
For more than a decade, the misfolding avoidance hypothesis (MAH) and related theories have dominated evolutionary discussions aimed at explaining the variance of the molecular clock across cellular proteins. In this study, we use various experimental data to further investigate the consistency of the MAH predictions with empirical evidence. We also critically discuss experimental results that motivated the MAH development and that are often viewed as evidence of its major contribution to the variability of protein evolutionary rates. We demonstrate, in Escherichia coli and Homo sapiens, the lack of a substantial negative correlation between protein evolutionary rates and Gibbs free energies of unfolding, a direct measure of protein stability. We then analyze multiple new genome-scale data sets characterizing protein aggregation and interaction propensities, the properties that are likely optimized in evolution to alleviate deleterious effects associated with toxic protein misfolding and misinteractions. Our results demonstrate that the propensity of proteins to aggregate, the fraction of charged amino acids, and protein stickiness do correlate with protein abundances. Nevertheless, across multiple organisms and various data sets we do not observe substantial correlations between proteins’ aggregation- and stability-related properties and evolutionary rates. Therefore, diverse empirical data support the conclusion that the MAH and similar hypotheses do not play a major role in mediating a strong negative correlation between protein expression and the molecular clock, and thus in explaining the variability of evolutionary rates across cellular proteins.
Affiliation(s)
- Dinara R Usmanova
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Germán Plata
  - Department of Systems Biology, Columbia University, New York, NY, USA; Elanco Animal Health, Greenfield, IN, USA
- Dennis Vitkup
  - Department of Systems Biology, Columbia University, New York, NY, USA; Department of Biomedical Informatics, Columbia University, New York, NY, USA
16
Alvarez-Ponce D. Richard Dickerson, Molecular Clocks, and Rates of Protein Evolution. J Mol Evol 2020; 89:122-126. [PMID: 33205299 DOI: 10.1007/s00239-020-09973-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/04/2020] [Accepted: 11/07/2020] [Indexed: 12/29/2022]
Abstract
Proteins approximately behave as molecular clocks, accumulating amino acid replacements at a more or less constant rate. Nonetheless, each protein displays a characteristic rate of evolution: whereas some proteins remain largely unaltered over large periods of time, others can rapidly accumulate amino acid replacements. An article by Richard Dickerson, published in the first issue of the Journal of Molecular Evolution (J Mol Evol 1:26-45, 1971), described the first analysis in which the rates of evolution of many proteins were compared, and the differences were interpreted in the light of their function. When comparing the sequences of fibrinopeptides, hemoglobin, and cytochrome c of different species, he observed a linear relationship between the number of amino acid replacements and divergence time. Remarkably, fibrinopeptides had evolved fast, cytochrome c had evolved slowly, and hemoglobin exhibited an intermediate rate of evolution. As the Journal of Molecular Evolution celebrates its 50th anniversary, I highlight this landmark article and reflect on its impact on the field of Molecular Evolution.
Affiliation(s)
- David Alvarez-Ponce
  - Department of Biology, University of Nevada, Reno, 1664 N. Virginia Street, Reno, NV, 89557, USA
17
Zhang X, Pavlicev M, Jones HN, Muglia LJ. Eutherian-Specific Gene TRIML2 Attenuates Inflammation in the Evolution of Placentation. Mol Biol Evol 2020; 37:507-523. [PMID: 31633784 PMCID: PMC6993854 DOI: 10.1093/molbev/msz238] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 02/07/2023]
Abstract
Evolution of highly invasive placentation in the stem lineage of eutherians and subsequent extension of pregnancy set eutherians apart from other mammals, that is, marsupials with short-lived placentas, and oviparous monotremes. Recent studies suggest that eutherian implantation evolved from marsupial attachment reaction, an inflammatory process induced by the direct contact of fetal placenta with maternal endometrium after the breakdown of the shell coat, and shortly before the onset of parturition. Unique to eutherians, a dramatic downregulation of inflammation after implantation prevents the onset of premature parturition, and is critical for the maintenance of gestation. This downregulation likely involved evolutionary changes on maternal as well as fetal/placental side. Tripartite-motif family-like2 (TRIML2) only exists in eutherian genomes and shows preferential expression in preimplantation embryos, and trophoblast-derived structures, such as chorion and placental disc. Comparative genomic evidence supports that TRIML2 originated from a gene duplication event in the stem lineage of Eutheria that also gave rise to eutherian TRIML1. Compared with TRIML1, TRIML2 lost the catalytic RING domain of E3 ligase. However, only TRIML2 is induced in human choriocarcinoma cell line JEG3 with poly(I:C) treatment to simulate inflammation during viral infection. Its knockdown increases the production of proinflammatory cytokines and reduces trophoblast survival during poly(I:C) stimulation, while its overexpression reduces proinflammatory cytokine production, supporting TRIML2’s role as a regulatory inhibitor of the inflammatory pathways in trophoblasts. TRIML2’s potential virus-interacting PRY/SPRY domain shows significant signature of selection, suggesting its contribution to the evolution of eutherian-specific inflammation regulation during placentation.
Affiliation(s)
- Xuzhe Zhang
  - Division of Human Genetics, Center for Prevention of Preterm Birth, Perinatal Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH; March of Dimes Prematurity Research Center Ohio Collaborative, Cincinnati, OH
- Mihaela Pavlicev
  - Division of Human Genetics, Center for Prevention of Preterm Birth, Perinatal Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH; March of Dimes Prematurity Research Center Ohio Collaborative, Cincinnati, OH
- Helen N Jones
  - Division of Pediatric Surgery, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Louis J Muglia
  - Division of Human Genetics, Center for Prevention of Preterm Birth, Perinatal Institute, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH; March of Dimes Prematurity Research Center Ohio Collaborative, Cincinnati, OH
18
Kim TH, Zhou X, Chen M. Demystifying "drop-outs" in single-cell UMI data. Genome Biol 2020; 21:196. [PMID: 32762710 PMCID: PMC7412673 DOI: 10.1186/s13059-020-02096-y] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 04/21/2020] [Accepted: 07/08/2020] [Indexed: 01/10/2023]
Abstract
Many existing pipelines for scRNA-seq data apply pre-processing steps such as normalization or imputation to account for excessive zeros or "drop-outs." Here, we extensively analyze diverse UMI data sets to show that clustering should be the foremost step of the workflow. We observe that most drop-outs disappear once cell-type heterogeneity is resolved, while imputing or normalizing heterogeneous data can introduce unwanted noise. We propose a novel framework HIPPO (Heterogeneity-Inspired Pre-Processing tOol) that leverages zero proportions to explain cellular heterogeneity and integrates feature selection with iterative clustering. HIPPO leads to downstream analysis with greater flexibility and interpretability compared to alternatives.
Affiliation(s)
- Tae Hyun Kim
  - Department of Statistics, University of Chicago, Chicago, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, USA
- Mengjie Chen
  - Department of Human Genetics and Department of Medicine, University of Chicago, Chicago, USA
19
Abstract
Darwin's theory of evolution emphasized that positive selection of functional proficiency provides the fitness that ultimately determines the structure of life, a view that has dominated biochemical thinking of enzymes as perfectly optimized for their specific functions. The 20th-century modern synthesis, structural biology, and the central dogma explained the machinery of evolution, and nearly neutral theory explained how selection competes with random fixation dynamics that produce molecular clocks essential e.g. for dating evolutionary histories. However, quantitative proteomics revealed that selection pressures not relating to optimal function play much larger roles than previously thought, acting perhaps most importantly via protein expression levels. This paper first summarizes recent progress in the 21st century toward recovering this universal selection pressure. Then, the paper argues that proteome cost minimization is the dominant, underlying 'non-function' selection pressure controlling most of the evolution of already functionally adapted living systems. A theory of proteome cost minimization is described and argued to have consequences for understanding evolutionary trade-offs, aging, cancer, and neurodegenerative protein-misfolding diseases.
20
Aligning functional network constraint to evolutionary outcomes. BMC Evol Biol 2020; 20:58. [PMID: 32448114 PMCID: PMC7245893 DOI: 10.1186/s12862-020-01613-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/01/2018] [Accepted: 04/15/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND: Functional constraint through genomic architecture is suggested to be an important dimension of genome evolution, but quantitative evidence for this idea is rare. In this contribution, existing evidence and discussions on genomic architecture as constraint for convergent evolution, rapid adaptation, and genic adaptation are summarized into alternative, testable hypotheses. Network architecture statistics from protein-protein interaction networks are then used to calculate differences in evolutionary outcomes on the example of genomic evolution in yeast, and the results are used to evaluate statistical support for these longstanding hypotheses. RESULTS: A discriminant function analysis lent statistical support to classifying the yeast interactome into hub, intermediate and peripheral nodes based on network neighborhood connectivity, betweenness centrality, and average shortest path length. Quantitative support for the existence of genomic architecture as a mechanistic basis for evolutionary constraint is then revealed through utilizing these statistical parameters of the protein-protein interaction network in combination with estimators of protein evolution. CONCLUSIONS: As functional genetic networks are becoming increasingly available, it will now be possible to evaluate functional genetic network constraint against variables describing complex phenotypes and environments, for better understanding of commonly observed deterministic patterns of evolution in non-model organisms. The hypothesis framework and methodological approach outlined herein may help to quantify the extrinsic versus intrinsic dimensions of evolutionary constraint, and result in a better understanding of how fast, effectively, or deterministically organisms adapt.
21
Alvarez-Ponce D, Aguilar-Rodríguez J, Fares MA. Molecular Chaperones Accelerate the Evolution of Their Protein Clients in Yeast. Genome Biol Evol 2020; 11:2360-2375. [PMID: 31297528 PMCID: PMC6735891 DOI: 10.1093/gbe/evz147] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Accepted: 07/05/2019] [Indexed: 12/23/2022]
Abstract
Protein stability is a major constraint on protein evolution. Molecular chaperones, also known as heat-shock proteins, can relax this constraint and promote protein evolution by diminishing the deleterious effect of mutations on protein stability and folding. This effect, however, has only been established for a few chaperones. Here, we use a comprehensive chaperone–protein interaction network to study the effect of all yeast chaperones on the evolution of their protein substrates, that is, their clients. In particular, we analyze how yeast chaperones affect the evolutionary rates of their clients at two very different evolutionary time scales. We first study the effect of chaperone-mediated folding on protein evolution over the evolutionary divergence of Saccharomyces cerevisiae and S. paradoxus. We then test whether yeast chaperones have left a similar signature on the patterns of standing genetic variation found in modern wild and domesticated strains of S. cerevisiae. We find that genes encoding chaperone clients have diverged faster than genes encoding non-client proteins when controlling for their number of protein–protein interactions. We also find that genes encoding client proteins have accumulated more intraspecific genetic diversity than those encoding non-client proteins. In a number of multivariate analyses, controlling for other well-known factors that affect protein evolution, we find that chaperone dependence explains the largest fraction of the observed variance in the rate of evolution at both evolutionary time scales. Chaperones affecting rates of protein evolution mostly belong to two major chaperone families: Hsp70s and Hsp90s. Our analyses show that protein chaperones, by virtue of their ability to buffer destabilizing mutations and their role in modulating protein genotype–phenotype maps, have a considerable accelerating effect on protein evolution.
Affiliation(s)
- David Alvarez-Ponce
  - Biology Department, University of Nevada, Reno; Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain
- José Aguilar-Rodríguez
  - Department of Biology, Stanford University, CA; Department of Chemical and Systems Biology, Stanford University School of Medicine, CA
- Mario A Fares
  - Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain; Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Ireland
22
Buonocore F, Gerdol M, Pallavicini A, Stocchi V, Randelli E, Belardinelli MC, Miccoli A, Saraceni PR, Secombes CJ, Scapigliati G, Wang T. Identification, molecular characterization and functional analysis of interleukin (IL)-2 and IL-2like (IL-2L) cytokines in sea bass (Dicentrarchus labrax L.). Cytokine 2019; 126:154898. [PMID: 31706201 DOI: 10.1016/j.cyto.2019.154898] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/29/2019] [Revised: 10/17/2019] [Accepted: 10/22/2019] [Indexed: 01/18/2023]
Abstract
In mammals, interleukin (IL)-2, initially known as a T-cell growth factor, is an immunomodulatory cytokine involved in the proliferation of T cells upon antigen activation. In bony fish, some IL-2 orthologs have been identified, but, recently, an additional IL-2like (IL-2L) gene has been found. In this paper, we report the presence of these two divergent IL-2 isoforms in sea bass (Dicentrarchus labrax L.). Genomic analyses revealed that they originated from a gene duplication event, as happened in most percomorphs. These two IL-2 paralogs show differences in the amino acid sequence and in the exon 4 size, and these features could be an indication that they bind preferentially to different specific IL-2 receptors. Sea bass IL-2 paralogs are highly expressed in gut and spleen, which are tissues and organs involved in fish T cell immune functions, and the two cytokines could be up-regulated by both PHA stimulation and vaccination with a bacterial vaccine, with IL-2L being more inducible. To investigate the functional activities of sea bass IL-2 and IL-2L we produced the corresponding recombinant molecules in E. coli and used them to stimulate HK and spleen leukocytes in vitro. IL-2L is able to up-regulate the expression of markers related to different T cell subsets (Th1, Th2 and Th17) and to Treg cells in HK, whereas it has little effect in spleen. IL-2 is not active on these markers in HK, but shows an effect on Th1 markers in spleen. Finally, the stimulation with recombinant IL-2 and IL-2L is also able to induce in vitro proliferation of HK- and spleen-derived leukocytes. In conclusion, we have demonstrated that sea bass possess two IL-2 paralogs that likely have an important role in regulating T cell development in this species and that show distinct bioactivities.
Affiliation(s)
- Francesco Buonocore
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Marco Gerdol
  - Department of Life Sciences, University of Trieste, Via Giorgieri 5, 34127 Trieste, TS, Italy
- Alberto Pallavicini
  - Department of Life Sciences, University of Trieste, Via Giorgieri 5, 34127 Trieste, TS, Italy
- Valentina Stocchi
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Elisa Randelli
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Maria Cristina Belardinelli
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Andrea Miccoli
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Paolo Roberto Saraceni
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Christopher J Secombes
  - Scottish Fish Immunology Research Centre, School of Biological Sciences, University of Aberdeen, Aberdeen AB24 2TZ, UK
- Giuseppe Scapigliati
  - Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Largo dell'Università snc, 05100 Viterbo, VT, Italy
- Tiehui Wang
  - Scottish Fish Immunology Research Centre, School of Biological Sciences, University of Aberdeen, Aberdeen AB24 2TZ, UK
23
Grandchamp A, Piégu B, Monget P. Genes Encoding Teleost Fish Ligands and Associated Receptors Remained in Duplicate More Frequently than the Rest of the Genome. Genome Biol Evol 2019; 11:1451-1462. [PMID: 31087101 PMCID: PMC6540934 DOI: 10.1093/gbe/evz078] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Accepted: 04/05/2019] [Indexed: 12/15/2022]
Abstract
Signaling through ligand/receptor interactions is a widespread mechanism across all living taxa. During evolution, however, there has been a diversification in multigene families and changes in their interaction patterns. Among the events that led to the creation of new genes is the whole-genome duplication, which made possible some major innovations. Teleost fishes descended from a common ancestor which underwent one such whole-genome duplication. In our study, we investigated the effect of complete genome duplication on the evolution of ligand–receptor pairs in teleosts. We selected ten teleost species and used bioinformatics programs and phylogenetic tools in order to study the evolution of the human ligands and receptors that have orthologous genes in fishes, as well as the rest of the fish genomes. We established that since the complete duplication of the fish genomes, the conservation in duplicate copy of ligand and receptor genes is higher than expected. However, the ligand/receptor pair partners did not necessarily evolve in the same way, and a lot of situations occurred in which one of the partners returned in singleton copy when the other one was maintained in duplicate. This suggests that changes in interaction partners may have taken place during the evolution of teleosts. Moreover, the fate of the ligands and receptor coding genes is partly congruent with the phylogeny of teleosts. However, some incongruences can be observed. We suggest that these incongruences are correlated to the environment.
Affiliation(s)
- Anna Grandchamp
  - PRC, UMR85, INRA, CNRS, IFCE, Université de Tours, Nouzilly, France
- Benoît Piégu
  - PRC, UMR85, INRA, CNRS, IFCE, Université de Tours, Nouzilly, France
- Philippe Monget
  - PRC, UMR85, INRA, CNRS, IFCE, Université de Tours, Nouzilly, France
24
Fang L, Zhou Y, Liu S, Jiang J, Bickhart DM, Null DJ, Li B, Schroeder SG, Rosen BD, Cole JB, Van Tassell CP, Ma L, Liu GE. Comparative analyses of sperm DNA methylomes among human, mouse and cattle provide insights into epigenomic evolution and complex traits. Epigenetics 2019; 14:260-276. [PMID: 30810461 PMCID: PMC6557555 DOI: 10.1080/15592294.2019.1582217] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/26/2022]
Abstract
Sperm DNA methylation is crucial for fertility and viability of offspring but epigenome evolution in mammals is largely understudied. By comparing sperm DNA methylomes and large-scale genome-wide association study (GWAS) signals between human and cattle, we aimed to examine the DNA methylome evolution and its associations with complex phenotypes in mammals. Our analysis revealed that genes with conserved non-methylated promoters (e.g., ANKS1A and WNT7A) among human and cattle were involved in common system and embryo development, and enriched for GWAS signals of body conformation traits in both species, while genes with conserved hypermethylated promoters (e.g., TCAP and CD80) were engaged in immune responses and highlighted by immune-related traits. On the other hand, genes with human-specific hypomethylated promoters (e.g., FOXP2 and HYDIN) were engaged in neuron system development and enriched for GWAS signals of brain-related traits, while genes with cattle-specific hypomethylated promoters (e.g., LDHB and DGAT2) mainly participated in lipid storage and metabolism. We validated our findings using sperm-retained nucleosome, preimplantation transcriptome, and adult tissue transcriptome data, as well as sequence evolutionary features, including motif binding sites, mutation rates, recombination rates and evolution signatures. In conclusion, our results demonstrate important roles of epigenome evolution in shaping the genetic architecture underlying complex phenotypes, hence enhance signal prioritization in GWAS and provide valuable information for human neurological disorders and livestock genetic improvement.
Collapse
Affiliation(s)
- Lingzhao Fang
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA; Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
| | - Yang Zhou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, Huazhong Agricultural University, Wuhan, Hubei, China
| | - Shuli Liu
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA; Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture & National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Jicai Jiang
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
| | - Derek M Bickhart
- Dairy Forage Research Center, Agricultural Research Service, USDA, Madison, WI, USA
| | - Daniel J Null
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - Bingjie Li
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - Steven G Schroeder
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - John B Cole
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - Curtis P Van Tassell
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| | - Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD, USA
| | - George E Liu
- Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD, USA
| |
|
25
|
Jain A, Perisa D, Fliedner F, von Haeseler A, Ebersberger I. The Evolutionary Traceability of a Protein. Genome Biol Evol 2019; 11:531-545. [PMID: 30649284 PMCID: PMC6394115 DOI: 10.1093/gbe/evz008] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Accepted: 01/11/2019] [Indexed: 12/12/2022]
Abstract
Orthologs document the evolution of genes and metabolic capacities encoded in extant and ancient genomes. However, the similarity between orthologs decays with time, and ultimately it becomes insufficient to infer common ancestry. This leaves ancient gene set reconstructions incomplete and distorted to an unknown extent. Here we introduce "evolutionary traceability" as a measure that quantifies, for each protein, the evolutionary distance beyond which the sensitivity of the ortholog search becomes limiting. Using yeast, we show that genes that were thought to date back to the last universal common ancestor are of high traceability. Their functions mostly involve catalysis, ion transport, and ribonucleoprotein complex assembly. In turn, the fraction of yeast genes whose traceability is not sufficient to infer their presence in the last universal common ancestor is enriched for regulatory functions. Computing the traceabilities of genes that have been experimentally characterized as being essential for a self-replicating cell reveals that many of the genes that lack orthologs outside bacteria have low traceability. This leaves open whether their orthologs in the eukaryotic and archaeal domains have been overlooked. Looking at the example of REC8, a protein essential for chromosome cohesion, we demonstrate how a traceability-informed adjustment of the search sensitivity identifies hitherto missed orthologs in the fast-evolving microsporidia. Taken together, the evolutionary traceability helps to differentiate between true absence and nondetection of orthologs, and thus improves our understanding of the evolutionary conservation of functional protein networks. "protTrace," a software tool for computing evolutionary traceability, is freely available at https://github.com/BIONF/protTrace.git; last accessed February 10, 2019.
Affiliation(s)
- Arpit Jain
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
| | - Dominik Perisa
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
| | - Fabian Fliedner
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
| | - Arndt von Haeseler
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna, Austria.,Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Austria
| | - Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany.,Senckenberg Biodiversity and Climate Research Center (BiK-F), Frankfurt, Germany.,LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
| |
|
26
|
Abstract
An attractive and long-standing hypothesis regarding the evolution of genes after duplication posits that the duplication event creates new evolutionary possibilities by releasing a copy of the gene from constraint. Apparent support was found in numerous analyses, particularly, the observation of higher rates of evolution in duplicated as compared with singleton genes. Could it, instead, be that more duplicable genes (owing to mutation, fixation, or retention biases) are intrinsically faster evolving? To uncouple the measurement of rates of evolution from the determination of duplicate or singleton status, we measure the rates of evolution in singleton genes in outgroup primate lineages but classify these genes as to whether they have duplicated or not in a crown group of great apes. We find that rates of evolution are higher in duplicable genes prior to the duplication event. In part this is owing to a negative correlation between coding sequence length and rate of evolution, coupled with a bias toward smaller genes being more duplicable. The effect is masked by difference in expression rate between duplicable genes and singletons. Additionally, in contradiction to the classical assumption, we find no convincing evidence for an increase in dN/dS after duplication, nor for rate asymmetry between duplicates. We conclude that high rates of evolution of duplicated genes are not solely a consequence of the duplication event, but are rather a predictor of duplicability. These results are consistent with a model in which successful gene duplication events in mammals are skewed toward events of minimal phenotypic impact.
Affiliation(s)
- Áine N O'Toole
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
| | - Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, Somerset, United Kingdom
| | - Aoife McLysaght
- Department of Genetics, Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
| |
|
27
|
Mao XF, Chen XP, Jin YB, Cui JH, Pan YM, Lai CY, Lin KR, Ling F, Luo W. The variations of TRBV genes usages in the peripheral blood of a healthy population are associated with their evolution and single nucleotide polymorphisms. Hum Immunol 2018; 80:195-203. [PMID: 30576702 DOI: 10.1016/j.humimm.2018.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2018] [Revised: 12/10/2018] [Accepted: 12/17/2018] [Indexed: 11/16/2022]
Abstract
T cell receptors (TCRs) are a class of T cell surface molecules that recognize the antigen-derived peptides presented by the major histocompatibility complex (MHC) and are able to trigger a series of immune responses. TCRs are important members of the adaptive immune system that arose in the jawed fish 500 million years ago. T cell receptor beta variable (TRBV) genes have been widely used to characterize TCR repertoires. Studying the evolution of TRBV may help us to better understand the adaptive immune system. To investigate TRBV evolution and its impact on TRBV gene usage in human populations, we compared the TRBV genes and their homologous sequences among humans, mice, rhesus monkeys and chimpanzees, analyzed the single-nucleotide polymorphisms (SNPs) located at TRBV loci, and sequenced TCR repertoires in the peripheral blood of 97 healthy donors. We found that functional TRBVs are more evolutionarily conserved but possess more SNPs in human populations than do nonfunctional (pseudo) TRBVs. Based on the conservation levels in the four species, we classified the functional TRBVs into 2 groups: old (conserved between mouse and humans) and new (conserved only in primates). The new TRBVs evolve faster and possess more SNPs than the old TRBVs. Variations in TRBV gene frequencies in the peripheral blood of healthy donors are negatively correlated with SNP density. These observations suggest that TRBV usage may be influenced by TCR-MHC co-evolution.
Affiliation(s)
- Xiao-Fan Mao
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China; Department of Molecular Biology, School of Bioengineering and Biotechnology, South China University of Technology, Guangzhou, China
| | - Xiang-Ping Chen
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Ya-Bin Jin
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Jin-Huan Cui
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Ying-Ming Pan
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Chun-Yan Lai
- Center of Health Management, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Kai-Rong Lin
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
| | - Fei Ling
- Department of Molecular Biology, School of Bioengineering and Biotechnology, South China University of Technology, Guangzhou, China.
| | - Wei Luo
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China.
| |
|
28
|
Aguilar-Rodríguez J, Wagner A. Metabolic Determinants of Enzyme Evolution in a Genome-Scale Bacterial Metabolic Network. Genome Biol Evol 2018; 10:3076-3088. [PMID: 30351420 PMCID: PMC6257574 DOI: 10.1093/gbe/evy234] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Accepted: 10/22/2018] [Indexed: 11/12/2022]
Abstract
Different genes and proteins evolve at very different rates. To identify the factors that explain these differences is an important aspect of research in molecular evolution. One such factor is the role a protein plays in a large molecular network. Here, we analyze the evolutionary rates of enzyme-coding genes in the genome-scale metabolic network of Escherichia coli to find the evolutionary constraints imposed by the structure and function of this complex metabolic system. Central and highly connected enzymes appear to evolve more slowly than less connected enzymes, but we find that they do so as a by-product of their high abundance, and not because of their position in the metabolic network. In contrast, enzymes catalyzing reactions with high metabolic flux (high substrate-to-product conversion rates) evolve slowly even after we account for their abundance. Moreover, enzymes catalyzing reactions that are difficult to by-pass through alternative pathways, such that they are essential in many different genetic backgrounds, also evolve more slowly. Our analyses show that an enzyme's role in the function of a metabolic network affects its evolution more than its place in the network's structure. They highlight the value of a system-level perspective for studies of molecular evolution.
Affiliation(s)
- José Aguilar-Rodríguez
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Biology, Stanford University, Stanford, CA and Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA
| | - Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- The Santa Fe Institute, Santa Fe, New Mexico
| |
|
29
|
Leanse LG, Harrington OD, Fang Y, Ahmed I, Goh XS, Dai T. Evaluating the Potential for Resistance Development to Antimicrobial Blue Light (at 405 nm) in Gram-Negative Bacteria: In vitro and in vivo Studies. Front Microbiol 2018; 9:2403. [PMID: 30459719 PMCID: PMC6232756 DOI: 10.3389/fmicb.2018.02403] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/26/2018] [Accepted: 09/19/2018] [Indexed: 11/17/2022]
Abstract
Antimicrobial resistance is a threat to public health that requires our immediate attention. With increasing numbers of microbes that are becoming resistant to routinely used antimicrobials, it is vital that we look to other, non-traditional therapies for the treatment of infections. Antimicrobial blue light (aBL) is an innovative approach that has demonstrated efficacy for the inactivation of an array of microbial pathogens. In the present study, we investigated the potential for resistance development to aBL in Gram-negative pathogenic bacteria by carrying out multiple aBL exposures on bacteria. In the first aBL exposure, clinical isolates of Pseudomonas aeruginosa, Acinetobacter baumannii, and uropathogenic Escherichia coli [10⁷ colony forming units/mL (CFU/mL)] were irradiated in phosphate-buffered saline with aBL at 405 nm until a >99.99% reduction in bacterial viability was achieved. Irradiation was then repeated for each bacterial species over 20 cycles of aBL exposure. The potential for resistance development to aBL was also investigated in vivo, in superficial mouse wounds infected with a bioluminescent strain of P. aeruginosa (PAO1; 10⁸ CFU) and irradiated with sub-curative radiant exposures of 108 or 216 J/cm² aBL over 5 cycles of treatment (over 5 days) prior to bacterial isolation from the animal tissue. PAO1 isolated from infected tissue were treated with aBL at 216 J/cm², in vitro, in parallel with unexposed PAO1 or PAO1 isolates from mouse wound infections not treated with aBL. No statistically significant correlation was found between the aBL-susceptibility of bacteria in vitro and the number of cycles of aBL exposure for any bacterial species (P ≥ 0.26). In addition, serial exposure of infected mouse wounds to aBL did not result in any change in the susceptibility to aBL of PAO1 (P = 0.97). In conclusion, it is unlikely that sequential exposure to aBL will result in aBL-resistance in Gram-negative bacteria. Also, multiple aBL treatments may potentially be administered to an infected wound without resistance development becoming a concern.
Affiliation(s)
- Leon G Leanse
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Olivia D Harrington
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Yanyan Fang
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Imran Ahmed
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Xueping Sharon Goh
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| | - Tianhong Dai
- Wellman Center for Photomedicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States
| |
|
30
|
Duan C, Huan Q, Chen X, Wu S, Carey LB, He X, Qian W. Reduced intrinsic DNA curvature leads to increased mutation rate. Genome Biol 2018; 19:132. [PMID: 30217230 PMCID: PMC6138893 DOI: 10.1186/s13059-018-1525-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2018] [Accepted: 09/05/2018] [Indexed: 01/24/2023]
Abstract
BACKGROUND: Mutation rates vary across the genome. Many trans factors that influence mutation rates have been identified, as have specific sequence motifs at the 1-7-bp scale, but cis elements remain poorly characterized. The lack of understanding regarding why different sequences have different mutation rates hampers our ability to identify positive selection in evolution and to identify driver mutations in tumorigenesis. RESULTS: Here, we use a combination of synthetic genes and sequences of thousands of isolated yeast colonies to show that intrinsic DNA curvature is a major cis determinant of mutation rate. Mutation rate negatively correlates with DNA curvature within genes, and a 10% decrease in curvature results in a 70% increase in mutation rate. Consistently, both yeast and humans accumulate mutations in regions with small curvature. We further show that this effect is due to differences in the intrinsic mutation rate, likely due to differences in mutagen sensitivity and not due to differences in the local activity of DNA repair. CONCLUSIONS: Our study establishes a framework for understanding the cis properties of DNA sequence in modulating the local mutation rate and identifies a novel causal source of non-uniform mutation rates across the genome.
Affiliation(s)
- Chaorui Duan
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.,Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Qing Huan
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.,Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xiaoshu Chen
- Human Genome Research Institute and Department of Medical Genetics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
| | - Shaohuan Wu
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.,Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Lucas B Carey
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Spain
| | - Xionglei He
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
| | - Wenfeng Qian
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China. .,Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China. .,University of Chinese Academy of Sciences, Beijing, 100049, China.
| |
|
31
|
Coetzer WG, Turner TR, Schmitt CA, Grobler JP. Adaptive genetic variation at three loci in South African vervet monkeys (Chlorocebus pygerythrus) and the role of selection within primates. PeerJ 2018; 6:e4953. [PMID: 29888138 PMCID: PMC5991302 DOI: 10.7717/peerj.4953] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/23/2018] [Accepted: 05/22/2018] [Indexed: 12/22/2022]
Abstract
Vervet monkeys (Chlorocebus pygerythrus) are one of the most widely distributed non-human primate species found in South Africa. They occur across all the South African provinces, inhabiting a large variety of habitats. These habitats vary sufficiently that it can be assumed that various factors such as pathogen diversity could influence populations in different ways. In turn, these factors could lead to varied levels of selection at specific fitness-linked loci. The Toll-like receptor (TLR) gene family, which plays an integral role in vertebrate innate immunity, is a group of fitness-linked loci which has been the focus of much research. In this study, we assessed the level of genetic variation at partial sequences of two TLR loci (TLR4 and 7) and a reproductively linked gene, acrosin (ACR), across the different habitat types within the vervet monkey distribution range. Gene variation and selection estimates were also made among 11-21 primate species. Low levels of genetic variation for all three gene regions were observed within vervet monkeys, with only two polymorphic sites identified for TLR4, three sites for TLR7 and one site for ACR. TLR7 variation was positively correlated with high mean annual rainfall, which was linked to increased pathogen abundance. The observed genetic variation at TLR4 might have been influenced by numerous factors including pathogens and climatic conditions. The ACR exonic regions showed no variation in vervet monkeys, which could point to the occurrence of a selective sweep. The TLR4 and TLR7 results for the among-primate analyses were mostly in line with previous studies, indicating a higher rate of evolution for TLR4. Within primates, ACR coding regions also showed signs of positive selection, which was congruent with previous reports on mammals. Important additional information to the already existing vervet monkey knowledge base was gained from this study, which can guide future research projects on this highly researched taxon as well as help conservation agencies with future management planning involving possible translocations of this species.
Affiliation(s)
- Willem G Coetzer
- Department of Genetics, University of the Free State, Bloemfontein, South Africa
| | - Trudy R Turner
- Department of Anthropology, University of Wisconsin-Milwaukee, Milwaukee, WI, USA
| | - J Paul Grobler
- Department of Genetics, University of the Free State, Bloemfontein, South Africa
| |
|
32
|
Singh P, Dass JFP. Nearly neutral evolution in IFNL3 gene retains the immune function to detect and clear the viral infection in HCV. Prog Biophys Mol Biol 2018; 140:107-116. [PMID: 29746888 DOI: 10.1016/j.pbiomolbio.2018.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2017] [Revised: 04/24/2018] [Accepted: 05/05/2018] [Indexed: 02/07/2023]
Abstract
The IFNL3 gene plays a crucial role in immune defense against viruses. It induces the interferon stimulated genes (ISGs) with antiviral properties by activating the JAK-STAT pathway. In this study, we investigated the evolutionary force involved in shaping the IFNL3 gene to perform its downstream function as a regulatory gene in HCV clearance. We selected 25 IFNL3 coding sequences, with the human gene as the reference sequence, and constructed a phylogeny. Furthermore, rate of variation, substitution saturation, phylogenetic informativeness and differential selection were also analysed. The codon evolution result suggests that nearly neutral mutation is the key pattern in shaping IFNL3 evolution. The results were validated by comparing the human IFNL3 protein variants with the native protein in a molecular dynamics simulation study. The molecular dynamics simulation shows the negative impact of the reported variants on the human IFNL3 protein. However, these detrimental mutations (R157Q and R157W) were shown to be negatively selected in the evolutionary study of the mammals. Hence, the variation revealed a mild impact on IFNL3 function and may be removed from the population through negative selection due to its high functional constraints. In a nutshell, our study may contribute to the overall evidence on phylotyping and on the structural transformation that takes place with non-synonymous substitutions in the IFNL3 protein. The theoretical knowledge obtained here will lay the path for experimental validation in HCV clearance.
Affiliation(s)
- Pratichi Singh
- Department of Integrative Biology, School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India
| | - J Febin Prabhu Dass
- Department of Integrative Biology, School of Biosciences and Technology, VIT University, Vellore, Tamil Nadu 632014, India.
| |
|
33
|
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023]
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution has been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network, a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and of the limitations (biases and errors) of interactomic data sets.
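Partial correlation, one of the methods named in the abstract above, can be illustrated with a minimal sketch. The toy data and the variable names (`centrality`, `rate`, `expression`) are hypothetical and are not taken from the study, and the first-order textbook formula shown here is an assumption about the general technique, not necessarily the exact procedure the authors used.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data: both variables share a common driver (expression),
# plus independent components, so their raw correlation is induced by it.
expression = [-1, -1, 1, 1]
centrality = [0, -2, 0, 2]
rate = [0, -2, 2, 0]

raw = pearson(centrality, rate)                       # close to 0.5
adjusted = partial_corr(centrality, rate, expression)  # close to 0
print(round(raw, 3), round(adjusted, 3))
```

In this constructed example the raw correlation disappears once the shared driver is controlled for, which is the kind of confounding the abstract says must be separated out before comparing the contributions of centrality and expression.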
|
34
|
Dilucca M, Cimini G, Giansanti A. Essentiality, conservation, evolutionary pressure and codon bias in bacterial genomes. Gene 2018; 663:178-188. [PMID: 29678658 DOI: 10.1016/j.gene.2018.04.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 10/31/2017] [Revised: 03/25/2018] [Accepted: 04/09/2018] [Indexed: 11/30/2022]
Abstract
Essential genes constitute the core of genes which cannot be mutated too much nor lost along the evolutionary history of a species. Natural selection is expected to be stricter on essential genes and on conserved (highly shared) genes, than on genes that are either nonessential or peculiar to a single or a few species. In order to further assess this expectation, we study here how essentiality of a gene is connected with its degree of conservation among several unrelated bacterial species, each one characterised by its own codon usage bias. Confirming previous results on E. coli, we show the existence of a universal exponential relation between gene essentiality and conservation in bacteria. Moreover, we show that, within each bacterial genome, there are at least two groups of functionally distinct genes, characterised by different levels of conservation and codon bias: i) a core of essential genes, mainly related to cellular information processing; ii) a set of less conserved nonessential genes with prevalent functions related to metabolism. In particular, the genes in the first group are more retained among species, are subject to a stronger purifying conservative selection and display a more limited repertoire of synonymous codons. The core of essential genes is close to the minimal bacterial genome, which is in the focus of recent studies in synthetic biology, though we confirm that orthologs of genes that are essential in one species are not necessarily essential in other species. We also list a set of highly shared genes which, reasonably, could constitute a reservoir of targets for new anti-microbial drugs.
Affiliation(s)
- Maddalena Dilucca
- Dipartimento di Fisica, "Sapienza" University of Rome, Rome 00185, Italy.
| | - Giulio Cimini
- IMT School for Advanced Studies, Lucca 55100, Italy; Istituto dei Sistemi Complessi (ISC)-CNR, Rome 00185, Italy
| | - Andrea Giansanti
- Dipartimento di Fisica, "Sapienza" University of Rome, Rome 00185, Italy; INFN Roma1 Unit, Rome 00185, Italy
| |
|
35
|
Schumacher J, Herlyn H. Correlates of evolutionary rates in the murine sperm proteome. BMC Evol Biol 2018; 18:35. [PMID: 29580206 PMCID: PMC5870804 DOI: 10.1186/s12862-018-1157-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/14/2017] [Accepted: 03/19/2018] [Indexed: 01/20/2023]
Abstract
Background: Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins. Results: Using partial rank correlations we detected several major rate indicators: phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes’ evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class. Conclusions: We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.
Affiliation(s)
- Julia Schumacher
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany.
| | - Holger Herlyn
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany.
| |
|
36
|
Feyertag F, Alvarez-Ponce D. Disulfide Bonds Enable Accelerated Protein Evolution. Mol Biol Evol 2018; 34:1833-1837. [PMID: 28431018 DOI: 10.1093/molbev/msx135] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023]
Abstract
The different proteins of any proteome evolve at enormously different rates. What factors contribute to this variability, and to what extent, is still a largely open question. We hypothesized that disulfide bonds, by increasing protein stability, should make proteins' structures relatively independent of their amino acid sequences, thus acting as buffers of deleterious mutations and enabling accelerated sequence evolution. In agreement with this hypothesis, we observed that membrane proteins with disulfide bonds evolved 88% faster than those without disulfide bonds, and that extracellular proteins with disulfide bonds evolved 49% faster than those without disulfide bonds. In addition, genes encoding proteins with disulfide bonds exhibit an increased likelihood of showing signatures of positive selection. Multivariate analyses indicate that the trend is independent of a number of potentially confounding factors. The effect, however, is not observed among the longest proteins, which can become stabilized by mechanisms other than disulfide bonds.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada-Reno, Reno, NV
37
Kenzaka T, Yasui M, Baba T, Nasu M, Tani K. Positive Selection in F-Box Domain (lpp0233) Encoded in Legionella pneumophila Strains. Biocontrol Sci 2018; 23:53-59. [DOI: 10.4265/bio.23.53] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/01/2022]
Affiliation(s)
- Takehiko Kenzaka
- Faculty of Pharmacy, Osaka Ohtani University
- Graduate School of Pharmaceutical Sciences, Osaka University
- Madoka Yasui
- Graduate School of Pharmaceutical Sciences, Osaka University
- Takashi Baba
- Graduate School of Pharmaceutical Sciences, Osaka University
- Masao Nasu
- Faculty of Pharmacy, Osaka Ohtani University
- Graduate School of Pharmaceutical Sciences, Osaka University
- Katsuji Tani
- Faculty of Pharmacy, Osaka Ohtani University
- Graduate School of Pharmaceutical Sciences, Osaka University
38
Biswas K, Acharya D, Podder S, Ghosh TC. Evolutionary rate heterogeneity between multi- and single-interface hubs across human housekeeping and tissue-specific protein interaction network: Insights from proteins' and its partners' properties. Genomics 2017; 110:283-290. [PMID: 29198610 DOI: 10.1016/j.ygeno.2017.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/23/2017] [Revised: 11/10/2017] [Accepted: 11/29/2017] [Indexed: 12/12/2022]
Abstract
Integrating gene expression into a protein-protein interaction network (PPIN) leads to the construction of tissue-specific (TS) and housekeeping (HK) sub-networks, with distinctive TS- and HK-hubs. All such hub proteins are divided into multi-interface (MI) hubs and single-interface (SI) hubs, where MI hubs evolve slower than SI hubs. Here we explored the evolutionary rate difference between MI and SI proteins within TS- and HK-PPIN and observed that this difference is present only in the TS class, but not in the HK class. Next, we explored whether a protein's own properties or its partners' properties are more influential in such evolutionary discrepancy. Statistical analyses revealed that this evolutionary rate correlates negatively with a protein's own properties, such as expression level, miRNA count, conformational diversity and functional properties, and with its partners' properties, such as protein disorder and tissue expression similarity. Moreover, partial correlation and regression analysis revealed that both a protein's and its partners' properties have independent effects on protein evolutionary rate.
Affiliation(s)
- Kakali Biswas
- Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Debarun Acharya
- Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Soumita Podder
- Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India; Department of Microbiology, Raiganj University, Raiganj, Uttar Dinajpur 733134, India
- Tapash Chandra Ghosh
- Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India.
39
Abstract
Gene essentiality is a founding concept of genetics with important implications in both fundamental and applied research. Multiple screens have been performed over the years in bacteria, yeasts, animals and more recently in human cells to identify essential genes. A mounting body of evidence suggests that gene essentiality, rather than being a static and binary property, is both context dependent and evolvable in all kingdoms of life. This concept of a non-absolute nature of gene essentiality changes our fundamental understanding of essential biological processes and could directly affect future treatment strategies for cancer and infectious diseases.
40
Ebel ER, Telis N, Venkataram S, Petrov DA, Enard D. High rate of adaptation of mammalian proteins that interact with Plasmodium and related parasites. PLoS Genet 2017; 13:e1007023. [PMID: 28957326 PMCID: PMC5634635 DOI: 10.1371/journal.pgen.1007023] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 07/11/2017] [Revised: 10/10/2017] [Accepted: 09/15/2017] [Indexed: 11/18/2022]
Abstract
Plasmodium parasites, along with their Piroplasm relatives, have caused malaria-like illnesses in terrestrial mammals for millions of years. Several Plasmodium-protective alleles have recently evolved in human populations, but little is known about host adaptation to blood parasites over deeper evolutionary timescales. In this work, we analyze mammalian adaptation in ~500 Plasmodium- or Piroplasm- interacting proteins (PPIPs) manually curated from the scientific literature. We show that (i) PPIPs are enriched for both immune functions and pleiotropy with other pathogens, and (ii) the rate of adaptation across mammals is significantly elevated in PPIPs, compared to carefully matched control proteins. PPIPs with high pathogen pleiotropy show the strongest signatures of adaptation, but this pattern is fully explained by their immune enrichment. Several pieces of evidence suggest that blood parasites specifically have imposed selection on PPIPs. First, even non-immune PPIPs that lack interactions with other pathogens have adapted at twice the rate of matched controls. Second, PPIP adaptation is linked to high expression in the liver, a critical organ in the parasite life cycle. Finally, our detailed investigation of alpha-spectrin, a major red blood cell membrane protein, shows that domains with particularly high rates of adaptation are those known to interact specifically with P. falciparum. Overall, we show that host proteins that interact with Plasmodium and Piroplasm parasites have experienced elevated rates of adaptation across mammals, and provide evidence that some of this adaptation has likely been driven by blood parasites.
Affiliation(s)
- Emily R. Ebel
- Department of Biology, Stanford University, Stanford, California, United States of America
- Natalie Telis
- Program in Biomedical Informatics, Stanford University, Stanford, California, United States of America
- Sandeep Venkataram
- Department of Biology, Stanford University, Stanford, California, United States of America
- Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
- David Enard
- Department of Biology, Stanford University, Stanford, California, United States of America
41
Mesbah-Uddin M, Guldbrandtsen B, Iso-Touru T, Vilkki J, De Koning DJ, Boichard D, Lund MS, Sahana G. Genome-wide mapping of large deletions and their population-genetic properties in dairy cattle. DNA Res 2017; 25:49-59. [PMID: 28985340 PMCID: PMC5824824 DOI: 10.1093/dnares/dsx037] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/08/2017] [Accepted: 08/18/2017] [Indexed: 01/10/2023]
Abstract
Large genomic deletions are potential candidates for loss-of-function, which could be lethal in homozygotes. Analysing whole genome data of 175 cattle, we report 8,480 large deletions (199 bp–773 KB) with an overall false discovery rate of 8.8%; 82% of which are novel compared with deletions in the dbVar database. Breakpoint sequence analyses revealed that the majority (24 of 29 tested) of the deletions contain microhomology/homology at the breakpoint and were therefore most likely generated by microhomology-mediated end joining. We observed higher differentiation among breeds for deletions in some genic regions, such as ABCA12, TTC1, VWA3B, TSHR, DST/BPAG1, and CD1D. The genes overlapping deletions are on average evolutionarily less conserved compared with known mouse lethal genes (P-value = 2.3 × 10⁻⁶). We report 167 natural gene knockouts in cattle that are apparently nonessential, as live homozygous individuals are observed. These genes are functionally enriched for immunoglobulin domains, olfactory receptors, and MHC classes (FDR = 2.06 × 10⁻²², 2.06 × 10⁻²², 7.01 × 10⁻⁶, respectively). We also demonstrate that deletions are enriched for health and fertility related quantitative trait loci (2- and 1.5-fold enrichment, Fisher’s P-value = 8.91 × 10⁻¹⁰ and 7.4 × 10⁻¹¹, respectively). Finally, we identified and confirmed the breakpoint of a ∼525 KB deletion on Chr23:12,291,761-12,817,087 (overlapping BTBD9, GLO1 and DNAH8), causing stillbirth in Nordic Red Cattle.
Affiliation(s)
- Md Mesbah-Uddin
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830 Tjele, Denmark; Animal Genetics and Integrative Biology, UMR 1313 GABI, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- Bernt Guldbrandtsen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830 Tjele, Denmark
- Terhi Iso-Touru
- Green Technology, Natural Resources Institute Finland, FI-31600 Jokioinen, Finland
- Johanna Vilkki
- Green Technology, Natural Resources Institute Finland, FI-31600 Jokioinen, Finland
- Dirk-Jan De Koning
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, SE-750 07 Uppsala, Sweden
- Didier Boichard
- Animal Genetics and Integrative Biology, UMR 1313 GABI, INRA, AgroParisTech, Université Paris-Saclay, 78350 Jouy-en-Josas, France
- Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830 Tjele, Denmark
- Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, 8830 Tjele, Denmark
42
Feyertag F, Berninsone PM, Alvarez-Ponce D. Secreted Proteins Defy the Expression Level-Evolutionary Rate Anticorrelation. Mol Biol Evol 2017; 34:692-706. [PMID: 28007979 DOI: 10.1093/molbev/msw268] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/15/2022]
Abstract
The rates of evolution of the proteins of any organism vary across orders of magnitude. A primary factor influencing rates of protein evolution is expression. A strong negative correlation between expression levels and evolutionary rates (the so-called E-R anticorrelation) has been observed in virtually all studied organisms. This effect is currently attributed to the abundance-dependent fitness costs of misfolding and unspecific protein-protein interactions, among other factors. Secreted proteins are folded in the endoplasmic reticulum, a compartment where chaperones, folding catalysts, and stringent quality control mechanisms promote their correct folding and may reduce the fitness costs of misfolding. In addition, confinement of secreted proteins to the extracellular space may reduce misinteractions and their deleterious effects. We hypothesize that each of these factors (the secretory pathway quality control and extracellular location) may reduce the strength of the E-R anticorrelation. Indeed, here we show that among human proteins that are secreted to the extracellular space, rates of evolution do not correlate with protein abundances. This trend is robust to controlling for several potentially confounding factors and is also observed when analyzing protein abundance data for 6 human tissues. In addition, analysis of mRNA abundance data for 32 human tissues shows that the E-R correlation is always less negative, and sometimes nonsignificant, in secreted proteins. Similar observations were made in Caenorhabditis elegans and in Escherichia coli, and to a lesser extent in Drosophila melanogaster, Saccharomyces cerevisiae and Arabidopsis thaliana. Our observations contribute to understand the causes of the E-R anticorrelation.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada, Reno, Reno, NV
43
An Essential Regulatory System Originating from Polygenic Transcriptional Rewiring of PhoP-PhoQ of Xanthomonas campestris. Genetics 2017; 206:2207-2223. [PMID: 28550013 DOI: 10.1534/genetics.117.200204] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/13/2017] [Accepted: 05/22/2017] [Indexed: 01/06/2023]
Abstract
How essential, regulatory genes originate and evolve is intriguing because mutations of these genes not only lead to lethality in organisms, but also have pleiotropic effects since they control the expression of multiple downstream genes. Therefore, the evolution of essential, regulatory genes is not only determined by genetic variations of their own sequences, but also by the biological function of downstream genes and molecular mechanisms of regulation. To understand the origin of essential, regulatory genes, experimental dissection of the complete regulatory cascade is needed. Here, we provide genetic evidences to reveal that PhoP-PhoQ is an essential two-component signal transduction system in the gram-negative bacterium Xanthomonas campestris, but that its orthologs in other bacteria belonging to Proteobacteria are nonessential. Mutational, biochemical, and chromatin immunoprecipitation together with high-throughput sequencing analyses revealed that phoP and phoQ of X. campestris and its close relative Pseudomonas aeruginosa are replaceable, and that the consensus binding motifs of the transcription factor PhoP are also highly conserved. PhoP Xcc in X. campestris regulates the transcription of a number of essential, structural genes by directly binding to cis-regulatory elements (CREs); however, these CREs are lacking in the orthologous essential, structural genes in P. aeruginosa, and thus the regulatory relationships between PhoP Pae and these downstream essential genes are disassociated. Our findings suggested that the recruitment of regulatory proteins by critical structural genes via transcription factor-CRE rewiring is a driving force in the origin and functional divergence of essential, regulatory genes.
44
Effects of different kinds of essentiality on sequence evolution of human testis proteins. Sci Rep 2017; 7:43534. [PMID: 28272493 PMCID: PMC5341092 DOI: 10.1038/srep43534] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/25/2016] [Accepted: 01/25/2017] [Indexed: 11/17/2022]
Abstract
We asked if essentiality for either fertility or viability differentially affects sequence evolution of human testis proteins. Based on murine knockout data, we classified a set of 965 proteins expressed in human seminiferous tubules into three categories: proteins essential for prepubertal survival (“lethality proteins”), associated with male sub- or infertility (“male sub-/infertility proteins”), and nonessential proteins. In our testis protein dataset, lethality genes evolved significantly slower than nonessential and male sub-/infertility genes, which is in line with other authors’ findings. Using tissue specificity, connectivity in the protein-protein interaction (PPI) network, and multifunctionality as proxies for evolutionary constraints, we found that of the three categories, proteins linked to male sub- or infertility are least constrained. Lethality proteins, on the other hand, are characterized by broad expression, many PPI partners, and high multifunctionality, all of which points to strong evolutionary constraints. We conclude that compared with lethality proteins, those linked to male sub- or infertility are nonetheless indispensable, but evolve under more relaxed constraints. Finally, adaptive evolution in response to postmating sexual selection could further accelerate evolutionary rates of male sub- or infertility proteins expressed in human testis. These findings may become useful for in silico detection of human sub-/infertility genes.
45
Stanley CE, Kulathinal RJ. Neurogenomics and the role of a large mutational target on rapid behavioral change. Biol Direct 2016; 11:60. [PMID: 27825385 PMCID: PMC5101817 DOI: 10.1186/s13062-016-0162-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/09/2016] [Accepted: 10/24/2016] [Indexed: 01/06/2023]
Abstract
BACKGROUND: Behavior, while complex and dynamic, is among the most diverse, derived, and rapidly evolving traits in animals. The highly labile nature of heritable behavioral change is observed in such evolutionary phenomena as the emergence of converged behaviors in domesticated animals, the rapid evolution of preferences, and the routine development of ethological isolation between diverging populations and species. In fact, it is believed that nervous system development and its potential to evolve a seemingly infinite array of behavioral innovations played a major role in the successful diversification of metazoans, including our own human lineage. However, unlike other rapidly evolving functional systems such as sperm-egg interactions and immune defense, the genetic basis of rapid behavioral change remains elusive.
PRESENTATION OF THE HYPOTHESIS: Here we propose that the rapid divergence and widespread novelty of innate and adaptive behavior is primarily a function of its genomic architecture. Specifically, we hypothesize that the broad diversity of behavioral phenotypes present at micro- and macroevolutionary scales is promoted by a disproportionately large mutational target of neurogenic genes. We present evidence that these large neuro-behavioral targets are significant and ubiquitous in animal genomes and suggest that behavior's novelty and rapid emergence are driven by a number of factors including more selection on a larger pool of variants, a greater role of phenotypic plasticity, and/or unique molecular features present in large genes. We briefly discuss the origins of these large neurogenic genes, as they relate to the remarkable diversity of metazoan behaviors, and highlight key consequences on both behavioral traits and neurogenic disease across, respectively, evolutionary and ontogenetic time scales.
TESTING THE HYPOTHESIS: Current approaches to studying the genetic mechanisms underlying rapid phenotypic change primarily focus on identifying signatures of Darwinian selection in protein-coding regions. In contrast, the large mutational target hypothesis places genomic architecture and a larger allelic pool at the forefront of rapid evolutionary change, particularly in genetic systems that are polygenic and regulatory in nature. Genomic data from brain and neural tissues in mammals as well as a preliminary survey of neurogenic genes from comparative genomic data support this hypothesis while rejecting both positive and relaxed selection on proteins or higher mutation rates. In mammals and invertebrates, neurogenic genes harbor larger protein-coding regions and possess a richer regulatory repertoire of miRNA targets and transcription factor binding sites. Overall, neurogenic genes cover a disproportionately large genomic fraction, providing a sizeable substrate for evolutionary, genetic, and molecular mechanisms to act upon. Readily available comparative and functional genomic data provide unexplored opportunities to test whether a distinct neurogenomic architecture can promote rapid behavioral change via several mechanisms unique to large genes, and which components of this large footprint are uniquely metazoan.
IMPLICATIONS OF THE HYPOTHESIS: The large mutational target hypothesis highlights the eminent roles of mutation and functional genomic architecture in generating rapid developmental and evolutionary change. It has broad implications on our understanding of the genetics of complex adaptive traits such as behavior by focusing on the importance of mutational input, from SNPs to alternative transcripts to transposable elements, on driving evolutionary rates of functional systems. Such functional divergence has important implications in promoting behavioral isolation across short- and long-term timescales. Due to genome-scaled polygenic adaptation, the large target effect also contributes to our inability to identify adapted behavioral candidate genes. The presence of large neurogenic genes, particularly in the mammalian brain and other neural tissues, further offers emerging insight into the etiology of neurodevelopmental and neurodegenerative diseases. The well-known correlation between neurological spectrum disorders in children and paternal age may simply be a direct result of aging fathers accumulating mutations across these large neurodevelopmental genes. The large mutational target hypothesis can also explain the rapid evolution of other functional systems covering a large genomic fraction such as male fertility and its preferential association with hybrid male sterility among closely related taxa. Overall, a focus on mutational potential may increase our power in understanding the genetic basis of complex phenotypes such as behavior while filling a general gap in understanding their evolution.
Affiliation(s)
- Craig E. Stanley
- Department of Biology, Temple University, Philadelphia, PA 19122 USA
- Institute of Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122 USA
- Rob J. Kulathinal
- Department of Biology, Temple University, Philadelphia, PA 19122 USA
- Institute of Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122 USA
46
Alvarez-Ponce D, Sabater-Muñoz B, Toft C, Ruiz-González MX, Fares MA. Essentiality Is a Strong Determinant of Protein Rates of Evolution during Mutation Accumulation Experiments in Escherichia coli. Genome Biol Evol 2016; 8:2914-2927. [PMID: 27566759 PMCID: PMC5630975 DOI: 10.1093/gbe/evw205] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Abstract
The Neutral Theory of Molecular Evolution is considered the most powerful theory to understand the evolutionary behavior of proteins. One of the main predictions of this theory is that essential proteins should evolve slower than dispensable ones owing to increased selective constraints. Comparison of genomes of different species, however, has revealed only small differences between the rates of evolution of essential and nonessential proteins. In some analyses, these differences vanish once confounding factors are controlled for, whereas in other cases essentiality seems to have an independent, albeit small, effect. It has been argued that comparing relatively distant genomes may entail a number of limitations. For instance, many of the genes that are dispensable in controlled lab conditions may be essential in some of the conditions faced in nature. Moreover, essentiality can change during evolution, and rates of protein evolution are simultaneously shaped by a variety of factors, whose individual effects are difficult to isolate. Here, we conducted two parallel mutation accumulation experiments in Escherichia coli, during 5,500–5,750 generations, and compared the genomes at different points of the experiments. Our approach (a short-term experiment, under highly controlled conditions) enabled us to overcome many of the limitations of previous studies. We observed that essential proteins evolved substantially slower than nonessential ones during our experiments. Strikingly, rates of protein evolution were only moderately affected by expression level and protein length.
Affiliation(s)
- Beatriz Sabater-Muñoz
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
- Christina Toft
- Department of Genetics, University of Valencia, Valencia, Spain; Departamento de Biotecnología, Instituto de Agroquímica y Tecnología de los Alimentos (CSIC), Valencia, Spain
- Mario X Ruiz-González
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Current Address: Secretaría de Educación Superior, Ciencia, Tecnología e Innovación, Proyecto Prometeo; Departamento de Ciencias Biológicas, Universidad Técnica Particular de Loja, Loja, Ecuador
- Mario A Fares
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
47
Mannakee BK, Gutenkunst RN. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution. PLoS Genet 2016; 12:e1006132. [PMID: 27380265 PMCID: PMC4933380 DOI: 10.1371/journal.pgen.1006132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/05/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022]
Abstract
The long-held principle that functionally important proteins evolve slowly has recently been challenged by studies in mice and yeast showing that the severity of a protein knockout only weakly predicts that protein's rate of evolution. However, the relevance of these studies to evolutionary changes within proteins is unknown, because amino acid substitutions, unlike knockouts, often only slightly perturb protein activity. To quantify the phenotypic effect of small biochemical perturbations, we developed an approach to use computational systems biology models to measure the influence of individual reaction rate constants on network dynamics. We show that this dynamical influence is predictive of protein domain evolutionary rate within networks in vertebrates and yeast, even after controlling for expression level and breadth, network topology, and knockout effect. Thus, our results not only demonstrate the importance of protein domain function in determining evolutionary rate, but also the power of systems biology modeling to uncover unanticipated evolutionary forces.
Affiliation(s)
- Brian K. Mannakee
- Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona, United States of America
- Ryan N. Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, United States of America
48
Biswas K, Chakraborty S, Podder S, Ghosh TC. Insights into the dN/dS ratio heterogeneity between brain specific genes and widely expressed genes in species of different complexity. Genomics 2016; 108:11-7. [DOI: 10.1016/j.ygeno.2016.04.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/04/2015] [Revised: 04/22/2016] [Accepted: 04/23/2016] [Indexed: 01/07/2023]
49
Gu X, Tang W. Model parameters of molecular evolution explain genomic correlations. Brief Bioinform 2015; 18:37-42. [PMID: 26628558 DOI: 10.1093/bib/bbv098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2015] [Revised: 10/01/2015] [Indexed: 11/13/2022]
Abstract
One long-standing research focus in evolutionary genomics is trying to resolve how biological variables (expression, essentiality, protein-protein interaction, structural stability, etc.) determine the rate of protein evolution. While these studies have considerably deepened our understanding of molecular evolution, many issues remain unsolved. In this opinion article, after having a brief survey of literatures, we establish relationships between model parameters of molecular evolution and genomic variables, based on which, most-observed genomic correlations and confounds can be explained by model parameter combinations under different conditions, which include the strength of stabilizing selection, mutational variance, expression sufficiency, gene pleiotropy, as well as the effective population size. We suggest that the problem to discern biological variable(s) that may determine the rate of protein evolution can be tackled at two levels. The first level, as discussed here, is to demonstrate how the model of molecular evolution can predict potential genomic correlations under various conditions. And the second level is to estimate genome-wide variations of model parameters (or combinations) that help to identify canonical biological variables that may underlie the rate variation among genes that ranges up to at least three magnitudes.
50
Karn RC, Laukaitis CM. Comparative Proteomics of Mouse Tears and Saliva: Evidence from Large Protein Families for Functional Adaptation. Proteomes 2015; 3:283-297. [PMID: 28248272 PMCID: PMC5217377 DOI: 10.3390/proteomes3030283] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/18/2015] [Revised: 07/29/2015] [Accepted: 08/27/2015] [Indexed: 12/27/2022]
Abstract
We produced a tear proteome of the genome mouse, C57BL/6, that contained 139 different protein identifications: 110 from a two-dimensional (2D) gel with subsequent trypsin digestion, 19 from a one-dimensional (1D) gel with subsequent trypsin digestion and ten from a 1D gel with subsequent Asp-N digestion. We compared this tear proteome with a C57BL/6 mouse saliva proteome produced previously. Sixteen of the 139 tear proteins are shared between the two proteomes, including six proteins that combat microbial growth. Among the 123 other tear proteins, were members of four large protein families that have no counterparts in humans: Androgen-binding proteins (ABPs) with different members expressed in the two proteomes, Exocrine secreted peptides (ESPs) expressed exclusively in the tear proteome, major urinary proteins (MUPs) expressed in one or both proteomes and the mouse-specific Kallikreins (subfamily b KLKs) expressed exclusively in the saliva proteome. All four families have members with suggested roles in mouse communication, which may influence some aspect of reproductive behavior. We discuss this in the context of functional adaptation involving tear and saliva proteins in the secretions of mouse lacrimal and salivary glands, respectively.
Affiliation(s)
- Robert C Karn
- College of Medicine, University of Arizona, Tucson, AZ 85724, USA.