51
|
Durand É, Gagnon-Arsenault I, Hallin J, Hatin I, Dubé AK, Nielly-Thibault L, Namy O, Landry CR. Turnover of ribosome-associated transcripts from de novo ORFs produces gene-like characteristics available for de novo gene emergence in wild yeast populations. Genome Res 2019; 29:932-943. [PMID: 31152050 PMCID: PMC6581059 DOI: 10.1101/gr.239822.118] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 05/25/2018] [Accepted: 05/13/2019] [Indexed: 12/17/2022]
Abstract
Little is known about the rate of emergence of de novo genes, what their initial properties are, and how they spread in populations. We examined wild yeast populations (Saccharomyces paradoxus) to characterize the diversity and turnover of intergenic ORFs over short evolutionary timescales. We find that hundreds of intergenic ORFs show translation signatures similar to canonical genes, and we experimentally confirmed the translation of many of these ORFs in laboratory conditions using a reporter assay. Compared with canonical genes, intergenic ORFs have lower translation efficiency, which could imply a lack of optimization for translation or a mechanism to reduce their production cost. Translated intergenic ORFs also tend to have sequence properties that are generally close to those of random intergenic sequences. However, some of the very recent translated intergenic ORFs, which appeared <110 kya, already show gene-like characteristics, suggesting that the raw material for functional innovations could appear over short evolutionary timescales.
Affiliation(s)
- Éléonore Durand
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Gagnon-Arsenault
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada.,Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Johan Hallin
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada.,Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Isabelle Hatin
- Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Alexandre K Dubé
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada.,Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
- Lou Nielly-Thibault
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada
- Olivier Namy
- Institut de Biologie Intégrative de la Cellule (I2BC), CEA, CNRS, Université Paris-Sud, Université Paris-Saclay, 91190 Gif sur Yvette, France
- Christian R Landry
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Centre de Recherche en Données Massives de l'Université Laval, Pavillon Charles-Eugène-Marchand, Université Laval, G1V 0A6 Québec, Québec, Canada.,Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, G1V 0A6 Québec, Québec, Canada
|
52
|
Song H, Sun J, Yang G. Old and young duplicate genes reveal different responses to environmental changes in Arachis duranensis. Mol Genet Genomics 2019; 294:1199-1209. [PMID: 31076861 DOI: 10.1007/s00438-019-01574-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/05/2019] [Accepted: 05/03/2019] [Indexed: 11/24/2022]
Abstract
Old and young duplicate genes have been reported in some organisms. However, little is known about the properties of old and young duplicate genes in Arachis. Here, we have identified old and young duplicate genes in Arachis duranensis, and analyzed the evolution, gene complexity, gene expression pattern, and functional divergence between old and young duplicate genes. Our results showed different evolutionary, gene complexity and gene expression patterns, as well as differing correlations between old and young duplicate genes. Gene ontology results showed that old duplicate genes play a crucial role in lipid and amino acid biosynthesis and the oxidation-reduction process and that young duplicate genes are preferentially involved in photosynthesis and response to biotic stimulus. Transcriptome data sets revealed that most old and young duplicate genes had asymmetric function, and only a few duplicate genes exhibited symmetric function under drought and nematode stress. We found that old duplicate genes are preferentially involved in lipid and amino acid metabolism and response to abiotic stress, while young duplicate genes are likely to participate in photosynthesis and response to biotic stress. This work provides a better understanding of the evolution and functional divergence of old and young duplicate genes in A. duranensis.
Affiliation(s)
- Hui Song
- Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China.
- Juan Sun
- Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China
- Guofeng Yang
- Grassland Agri-husbandry Research Center, Qingdao Agricultural University, Qingdao, China.
|
53
|
Affiliation(s)
- Stephen Branden Van Oss
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
- Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America
|
54
|
Jain A, Perisa D, Fliedner F, von Haeseler A, Ebersberger I. The Evolutionary Traceability of a Protein. Genome Biol Evol 2019; 11:531-545. [PMID: 30649284 PMCID: PMC6394115 DOI: 10.1093/gbe/evz008] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Accepted: 01/11/2019] [Indexed: 12/12/2022]
Abstract
Orthologs document the evolution of genes and metabolic capacities encoded in extant and ancient genomes. However, the similarity between orthologs decays with time, and ultimately it becomes insufficient to infer common ancestry. This leaves ancient gene set reconstructions incomplete and distorted to an unknown extent. Here we introduce the “evolutionary traceability” as a measure that quantifies, for each protein, the evolutionary distance beyond which the sensitivity of the ortholog search becomes limiting. Using yeast, we show that genes that were thought to date back to the last universal common ancestor are of high traceability. Their functions mostly involve catalysis, ion transport, and ribonucleoprotein complex assembly. In turn, the fraction of yeast genes whose traceability is not sufficient to infer their presence in last universal common ancestor is enriched for regulatory functions. Computing the traceabilities of genes that have been experimentally characterized as being essential for a self-replicating cell reveals that many of the genes that lack orthologs outside bacteria have low traceability. This leaves open whether their orthologs in the eukaryotic and archaeal domains have been overlooked. Looking at the example of REC8, a protein essential for chromosome cohesion, we demonstrate how a traceability-informed adjustment of the search sensitivity identifies hitherto missed orthologs in the fast-evolving microsporidia. Taken together, the evolutionary traceability helps to differentiate between true absence and nondetection of orthologs, and thus improves our understanding about the evolutionary conservation of functional protein networks. “protTrace,” a software tool for computing evolutionary traceability, is freely available at https://github.com/BIONF/protTrace.git; last accessed February 10, 2019.
Affiliation(s)
- Arpit Jain
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Dominik Perisa
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Fabian Fliedner
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Arndt von Haeseler
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna, Austria.,Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Austria
- Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany.,Senckenberg Biodiversity and Climate Research Center (BiK-F), Frankfurt, Germany.,LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
|
55
|
Gauthier L, Di Franco R, Serohijos AWR. SodaPop: a forward simulation suite for the evolutionary dynamics of asexual populations on protein fitness landscapes. Bioinformatics 2019; 35:4053-4062. [DOI: 10.1093/bioinformatics/btz175] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/28/2018] [Revised: 01/21/2019] [Accepted: 03/12/2019] [Indexed: 11/14/2022]
Abstract
Motivation
Protein evolution is determined by forces at multiple levels of biological organization. Random mutations have an immediate effect on the biophysical properties, structure and function of proteins. These same mutations also affect the fitness of the organism. However, the evolutionary fate of mutations, whether they reach fixation or are purged, also depends on population size and dynamics. There is an emerging interest, both theoretical and experimental, in integrating these two factors in protein evolution. Although there are several tools available for simulating protein evolution, most of them focus on either the biophysical or the population-level determinants, but not both. Hence, there is a need for a publicly available computational tool to explore the effects of both protein biophysics and population dynamics on protein evolution.
Results
To address this need, we developed SodaPop, a computational suite to simulate protein evolution in the context of the population dynamics of asexual populations. SodaPop accepts as input several fitness landscapes based on protein biochemistry or other user-defined fitness functions. The user can also provide as input experimental fitness landscapes derived from deep mutational scanning approaches or theoretical landscapes derived from physical force field estimates. Here, we demonstrate the broad utility of SodaPop with different applications describing the interplay of selection for protein properties and population dynamics. SodaPop is designed such that population geneticists can explore the influence of protein biochemistry on patterns of genetic variation, and that biochemists and biophysicists can explore the role of population size and demography on protein evolution.
Availability and implementation
Source code and binaries are freely available at https://github.com/louisgt/SodaPop under the GNU GPLv3 license. The software is implemented in C++ and supported on Linux, Mac OS/X and Windows.
Supplementary information
Supplementary data are available at Bioinformatics online.
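As an illustration of the kind of forward simulation the abstract describes, the following minimal Python sketch advances an asexual population by one generation, sampling offspring in proportion to genotype abundance times fitness. It is not SodaPop's code or interface (SodaPop is a C++ suite); the Wright-Fisher sampling scheme is an assumed stand-in for the population model, and the function name `wright_fisher_step`, the genotype counts, and the fitness values are all hypothetical.

```python
import random

def wright_fisher_step(counts, fitness, rng):
    """Return genotype counts after one generation of selection and drift.

    counts  -- current number of individuals per genotype (hypothetical input)
    fitness -- relative fitness per genotype, e.g. read from a fitness landscape
    rng     -- a random.Random instance, so runs are reproducible
    """
    pop_size = sum(counts)
    # Each offspring picks a parent genotype with probability proportional to
    # (number of parents of that genotype) x (fitness of that genotype).
    weights = [c * w for c, w in zip(counts, fitness)]
    parents = rng.choices(range(len(counts)), weights=weights, k=pop_size)
    next_counts = [0] * len(counts)
    for genotype in parents:
        next_counts[genotype] += 1
    return next_counts

rng = random.Random(42)
counts = [90, 10]        # wild type, mutant (made-up numbers)
fitness = [1.0, 1.05]    # mutant assumed to be 5% fitter
for _ in range(50):
    counts = wright_fisher_step(counts, fitness, rng)
print(counts)  # population size stays at 100; the split depends on drift
```

Keeping the population size fixed and resampling every generation is the simplest way to combine selection (the fitness weights) with drift (the random draw); a tool of the kind described would replace the hard-coded fitness list with values derived from protein biophysics or deep mutational scanning data.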
Affiliation(s)
- Louis Gauthier
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
- Rémicia Di Franco
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
- Enseirb-Matmeca, Bordeaux Institute of Technology, Talence, France
- Adrian W R Serohijos
- Département de Biochimie, Université de Montréal, Montréal, QC, Canada
- Centre Robert-Cedergren en Bioinformatique et Génomique, Université de Montréal, Montréal, QC, Canada
|
56
|
Salvador-Martínez I, Coronado-Zamora M, Castellano D, Barbadilla A, Salazar-Ciudad I. Mapping Selection within Drosophila melanogaster Embryo's Anatomy. Mol Biol Evol 2019; 35:66-79. [PMID: 29040697 DOI: 10.1093/molbev/msx266] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022]
Abstract
We present a survey of selection across Drosophila melanogaster embryonic anatomy. Our approach integrates genomic variation, spatial gene expression patterns, and development with the aim of mapping adaptation over the entire embryo's anatomy. Our adaptation map is based on analyzing spatial gene expression information for 5,969 genes (from text-based annotations of in situ hybridization data directly from the BDGP database, Tomancak et al. 2007) and the polymorphism and divergence in these genes (from the project DGRP, Mackay et al. 2012). The proportions of nonsynonymous substitutions that are adaptive, neutral, or slightly deleterious are estimated for the set of genes expressed in each embryonic anatomical structure using the distribution of fitness effects-alpha method (Eyre-Walker and Keightley 2009). This method is a robust derivative of the McDonald and Kreitman test (McDonald and Kreitman 1991). We also explore whether different anatomical structures differ in the phylogenetic age, codon usage, or expression bias of the genes they express and whether genes expressed in many anatomical structures show more adaptive substitutions than other genes. We found that: 1) most of the digestive system and ectoderm-derived structures are under selective constraint, 2) the germ line and some specific mesoderm-derived structures show high rates of adaptive substitution, and 3) the genes that are expressed in a small number of anatomical structures show higher expression bias, lower phylogenetic ages, and less constraint.
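The McDonald and Kreitman framework that this abstract builds on can be illustrated with its simplest point estimate of the adaptive fraction, `alpha = 1 - (Ds * Pn) / (Dn * Ps)`. The sketch below is this basic estimator only, not the distribution of fitness effects-alpha method used in the study, which additionally accounts for slightly deleterious mutations; the function name `mk_alpha` and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Basic McDonald-Kreitman estimate of the adaptive fraction alpha.

    dn, ds -- nonsynonymous / synonymous fixed differences (divergence)
    pn, ps -- nonsynonymous / synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    # Under neutrality Dn/Ds equals Pn/Ps, so the ratio below is 1 and alpha is 0.
    return 1.0 - (ds * pn) / (dn * ps)

# Made-up counts for the genes expressed in one anatomical structure.
alpha = mk_alpha(dn=40, ds=100, pn=20, ps=100)
print(alpha)  # 0.5
```

With these illustrative counts, half of the nonsynonymous substitutions would be attributed to adaptation; a negative value would instead point to an excess of nonsynonymous polymorphism, which is the situation the more robust derivative is designed to handle.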
Affiliation(s)
- Irepan Salvador-Martínez
- Evo-devo Helsinki Community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Helsinki, Finland
- Marta Coronado-Zamora
- Departament de Genètica i de Microbiologia, Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- David Castellano
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Antonio Barbadilla
- Departament de Genètica i de Microbiologia, Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Isaac Salazar-Ciudad
- Evo-devo Helsinki Community, Centre of Excellence in Experimental and Computational Developmental Biology, Institute of Biotechnology, University of Helsinki, Helsinki, Finland.,Departament de Genètica i de Microbiologia, Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
|
57
|
Abstract
Several studies have pointed out that the tight correlation between genes' evolutionary rates is better explained by a model known as the Universal PaceMaker (UPM) than by the simple rate constancy posited by the classical molecular clock (MC) hypothesis. Under UPM, each gene is associated with a single pacemaker (PM) and varies its evolutionary rate according to the ticks of this PM. Hence, the relative rates of all genes associated with the same PM remain nearly constant, whereas the absolute rates can change arbitrarily according to the PM ticks. A natural follow-up question is how to find the gene-PM association from the gene sequence data alone. This, however, turns out to be a nontrivial task that is affected by the number of variables, their random noise, and the amount of available information. To this end, a clustering heuristic was devised that exploits the correlation between corresponding edge lengths across thousands of gene trees. Nevertheless, no theoretical study of the relationship between these parameters had been done. Here we study this question by providing theoretical bounds, expressed in terms of the system parameters, on the probabilities of positive and negative results. We corroborate these results with a simulation study that reveals the critical role of the variances.
Affiliation(s)
- Sagi Snir
- The Department of Evolutionary and Environmental Biology, University of Haifa, Haifa, Israel
|
58
|
Mao XF, Chen XP, Jin YB, Cui JH, Pan YM, Lai CY, Lin KR, Ling F, Luo W. The variations of TRBV genes usages in the peripheral blood of a healthy population are associated with their evolution and single nucleotide polymorphisms. Hum Immunol 2018; 80:195-203. [PMID: 30576702 DOI: 10.1016/j.humimm.2018.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2018] [Revised: 12/10/2018] [Accepted: 12/17/2018] [Indexed: 11/16/2022]
Abstract
T cell receptors (TCRs) are a class of T cell surface molecules that recognize the antigen-derived peptides presented by the major histocompatibility complex (MHC) and are able to trigger a series of immune responses. TCRs are important members of the adaptive immune system that arose in the jawed fish 500 million years ago. T cell receptor beta variable (TRBV) genes have been widely used to characterize TCR repertoires. Studying the evolution of TRBV may help us to better understand the adaptive immune system. To investigate TRBV evolution and its impact on the usage of TRBV genes in human populations, we compared the TRBV genes and their homologous sequences among human, mouse, rhesus and chimpanzee, analyzed the single-nucleotide polymorphisms (SNPs) located at TRBV loci, and sequenced TCR repertoires in the peripheral blood of 97 healthy donors. We found that functional TRBVs are more evolutionarily conserved but possess more SNPs in human populations than do nonfunctional (pseudo) TRBVs. Based on the conservation levels in the four species, we classified the functional TRBVs into 2 groups: old (conserved between mouse and humans) and new (conserved only in primates). The new TRBVs evolve faster and possess more SNPs than the old TRBVs. The variations in TRBV gene frequencies in the peripheral blood of healthy donors are negatively correlated with SNP density. These observations suggest that TRBV usage may be influenced by TCR-MHC co-evolution.
Affiliation(s)
- Xiao-Fan Mao
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China; Department of Molecular Biology, School of Bioengineering and Biotechnology, South China University of Technology, Guangzhou, China
- Xiang-Ping Chen
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Ya-Bin Jin
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Jin-Huan Cui
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Ying-Ming Pan
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Chun-Yan Lai
- Center of Health Management, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Kai-Rong Lin
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China
- Fei Ling
- Department of Molecular Biology, School of Bioengineering and Biotechnology, South China University of Technology, Guangzhou, China.
- Wei Luo
- Clinical Research Institute, Sun Yat-Sen University Foshan Hospital, Foshan, China.
|
59
|
Petitjean C, Makarova KS, Wolf YI, Koonin EV. Extreme Deviations from Expected Evolutionary Rates in Archaeal Protein Families. Genome Biol Evol 2018; 9:2791-2811. [PMID: 28985292 PMCID: PMC5737733 DOI: 10.1093/gbe/evx189] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Accepted: 09/12/2017] [Indexed: 02/07/2023]
Abstract
The origin of new biological functions is a complex phenomenon ranging from single-nucleotide substitutions to the gain of new genes via horizontal gene transfer or duplication. Neofunctionalization and subfunctionalization of proteins are often attributed to the emergence of paralogs that are subject to relaxed purifying selection or positive selection and thus evolve at accelerated rates. Such phenomena potentially could be detected as anomalies in the phylogenies of the respective gene families. We developed a computational pipeline to search for such anomalies in 1,834 orthologous clusters of archaeal genes, focusing on lineage-specific subfamilies that significantly deviate from the expected rate of evolution. Multiple potential cases of neofunctionalization and subfunctionalization were identified, including some ancient, house-keeping gene families, such as ribosomal protein S10, general transcription factor TFIIB and chaperone Hsp20. As expected, many cases of apparent acceleration of evolution are associated with lineage-specific gene duplication. On other occasions, long branches in phylogenetic trees correspond to horizontal gene transfer across long evolutionary distances. Significant deceleration of evolution is less common than acceleration, and the underlying causes are not well understood; functional shifts accompanied by increased constraints could be involved. Many gene families appear to be “highly evolvable,” that is, include both long and short branches. Even in the absence of precise functional predictions, this approach allows one to select targets for experimentation in search of new biology.
Affiliation(s)
- Celine Petitjean
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
- Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
|
60
|
Lu GA, Zhao Y, Yang H, Lan A, Shi S, Liufu Z, Huang Y, Tang T, Xu J, Shen X, Wu CI. Death of new microRNA genes in Drosophila via gradual loss of fitness advantages. Genome Res 2018; 28:1309-1318. [PMID: 30049791 PMCID: PMC6120634 DOI: 10.1101/gr.233809.117] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/19/2017] [Accepted: 07/20/2018] [Indexed: 01/23/2023]
Abstract
The prevalence of de novo coding genes is controversial due to length and coding constraints. Noncoding genes, especially small ones, are freer to evolve de novo by comparison. The best examples are microRNAs (miRNAs), a large class of regulatory molecules ∼22 nt in length. Here, we study six de novo miRNAs in Drosophila, which, like most new genes, are testis-specific. We ask how and why de novo genes die because gene death must be sufficiently frequent to balance the many new births. By knocking out each miRNA gene, we analyzed their contributions to the nine components of male fitness (sperm production, length, and competitiveness, among others). To our surprise, the knockout mutants often perform better than the wild type in some components, and slightly worse in others. When two of the younger miRNAs are assayed in long-term laboratory populations, their total fitness contributions are found to be essentially zero. These results collectively suggest that adaptive de novo genes die regularly, not due to the loss of functionality, but due to the canceling out of positive and negative fitness effects, which may be characterized as "quasi-neutrality." Since de novo genes often emerge adaptively and become lost later, they reveal ongoing period-specific adaptations, reminiscent of the "Red-Queen" metaphor for long-term evolution.
Affiliation(s)
- Guang-An Lu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yixin Zhao
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Hao Yang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Ao Lan
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Suhua Shi
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Zhongqi Liufu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Yumei Huang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Tian Tang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Jin Xu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, California 94305, USA
- Xu Shen
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Chung-I Wu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, Guangdong, China
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA
|
61
|
Sankoff D. Evolutionary Model for the Statistical Divergence of Paralogous and Orthologous Gene Pairs Generated by Whole Genome Duplication and Speciation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1579-1584. [PMID: 28715335 DOI: 10.1109/tcbb.2017.2712695] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/07/2023]
Abstract
We outline a principled approach to the analysis of duplicate gene similarity distributions, based on a model integrating sequence divergence and the process of fractionation of duplicate genes resulting from whole genome duplication (WGD). This model allows us to predict duplicate gene similarity distributions for a series of two or three WGD, for whole genome triplication followed by a WGD, and for triplication, followed by speciation, followed by WGD. We calculate the probabilities of all possible fates of a gene pair as its two members proliferate or are lost, predicting the number of surviving pairs from each event. We discuss how to calculate maximum likelihood estimators for the parameters of these models, illustrating with an analysis of the distribution of paralog similarities in the poplar genome.
|
62
|
An analysis of aging-related genes derived from the Genotype-Tissue Expression project (GTEx). Cell Death Discov 2018; 4:26. [PMID: 30155276 PMCID: PMC6102484 DOI: 10.1038/s41420-018-0093-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 05/30/2018] [Revised: 06/26/2018] [Accepted: 07/26/2018] [Indexed: 01/30/2023]
Abstract
Aging is a complex biological process that is far from being completely understood. Analyzing transcriptional differences across age might help uncover genetic bases of aging. In this study, 1573 differentially expressed genes, related to chronological age, from the Genotype-Tissue Expression (GTEx) project, were categorized as upregulated age-associated genes (UAGs) and downregulated age-associated genes (DAGs). Characteristics in evolution, expression, function and molecular networks were comprehensively described and compared for UAGs, DAGs and other genes. Analyses revealed that UAGs are more clustered, more quickly evolving, more tissue specific and have accumulated more single-nucleotide polymorphisms (SNPs) and disease genes than DAGs. DAGs were found with a lower evolutionary rate, higher expression level, greater homologous gene number, smaller phyletic age and earlier expression in body development. UAGs are more likely to be located in the extracellular region and to occur in both immune-relevant processes and cancer-related pathways. By contrast, DAGs are more likely to be located intracellularly and to be enriched in catabolic and metabolic processes. Moreover, DAGs are also critical in a protein–protein interaction (PPI) network, whereas UAGs have more influence on a signaling network. This study highlights characteristics of the aging transcriptional landscape in a healthy population, which may benefit future studies on the aging process and provide a broader horizon for age-dependent precision medicine.
|
63
|
Moyers BA, Zhang J. Toward Reducing Phylostratigraphic Errors and Biases. Genome Biol Evol 2018; 10:2037-2048. [PMID: 30060201 PMCID: PMC6105108 DOI: 10.1093/gbe/evy161] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Accepted: 07/28/2018] [Indexed: 01/03/2023]
Abstract
Phylostratigraphy is a method for estimating gene age, usually applied to large numbers of genes in order to detect nonrandom age-distributions of gene properties that could shed light on mechanisms of gene origination and evolution. However, phylostratigraphy underestimates gene age with a nonnegligible probability. The underestimation is severer for genes with certain properties, creating spurious age distributions of these properties and those correlated with these properties. Here we explore three strategies to reduce phylostratigraphic error/bias. First, we test several alternative homology detection methods (PSIBLAST, HMMER, PHMMER, OMA, and GLAM2Scan) in phylostratigraphy, but fail to find any that noticeably outperforms the commonly used BLASTP. Second, using machine learning, we look for predictors of error-prone genes to exclude from phylostratigraphy, but cannot identify reliable predictors. Finally, we remove from phylostratigraphic analysis genes exhibiting errors in simulation, which by definition minimizes error/bias if the simulation is sufficiently realistic. Using this last approach, we show that some previously reported phylostratigraphic trends (e.g., younger proteins tend to evolve more rapidly and be shorter) disappear or even reverse, reconfirming the necessity of controlling phylostratigraphic error/bias. Taken together, our analyses demonstrate that phylostratigraphic errors/biases are refractory to several potential solutions but can be controlled at least partially by the exclusion of error-prone genes identified via realistic simulations. These results are expected to stimulate the judicious use of error-aware phylostratigraphy and reevaluation of previous phylostratigraphic findings.
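The core assignment step of phylostratigraphy, as summarized in this abstract, can be sketched as follows: a gene is dated to the oldest phylostratum in which a homolog is detected. The stratum names, the `assign_phylostratum` helper, and the hit sets are hypothetical; real analyses obtain the hits from a homology search such as BLASTP. The second call shows how a missed distant homolog makes a gene look younger, which is the underestimation the abstract discusses.

```python
# Phylostrata ordered from oldest to youngest (illustrative names only).
STRATA = ["cellular organisms", "Eukaryota", "Fungi", "Saccharomyces"]

def assign_phylostratum(strata_with_hits, strata=STRATA):
    """Return the oldest stratum in which a homolog was detected.

    A gene with no detected homolog in any stratum is assigned to the
    youngest stratum, i.e. it is treated as lineage-specific.
    """
    for stratum in strata:  # scan from oldest to youngest
        if stratum in strata_with_hits:
            return stratum
    return strata[-1]

true_hits = {"Eukaryota", "Fungi", "Saccharomyces"}
detected_hits = true_hits - {"Eukaryota"}  # distant homolog missed by the search

print(assign_phylostratum(true_hits))      # Eukaryota
print(assign_phylostratum(detected_hits))  # Fungi
```

Because the assignment depends only on the oldest detected hit, a single false negative at a deep stratum shifts the estimated age, which is why the abstract's third strategy excludes genes that show such errors in simulation.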
Affiliation(s)
- Bryan A Moyers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan
64
Zhang Q, Wang S, Pan Y, Su D, Lu Q, Zuo Y, Yang L. Characterization of proteins in different subcellular localizations for Escherichia coli K12. Genomics 2018; 111:1134-1141. [PMID: 30026105 DOI: 10.1016/j.ygeno.2018.07.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 06/02/2018] [Revised: 07/07/2018] [Accepted: 07/11/2018] [Indexed: 10/28/2022]
Abstract
Comprehensive knowledge of protein subcellular localization is an important step toward understanding protein function. Recent advances in systems biology have allowed us to develop more accurate methods for characterizing proteins at the subcellular localization level. In this study, an analysis method was developed to characterize the topological and biological properties of the cytoplasmic, inner membrane, outer membrane, and periplasmic proteins in Escherichia coli (E. coli). Statistically significant differences were found in all topological and biological properties among proteins in different subcellular localizations. In addition, we analyzed differences in the composition of the 20 amino acids across the four protein categories and found significant differences for all 20. These findings may be helpful for understanding the relationship between protein subcellular localization and biological function.
Affiliation(s)
- Qi Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yi Pan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Dongqing Su
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Qianzi Lu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yongchun Zuo
- The State Key Laboratory of Reproductive Regulation, Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
65
Banerjee S, Chakraborty S. Protein intrinsic disorder negatively associates with gene age in different eukaryotic lineages. Molecular BioSystems 2018; 13:2044-2055. [PMID: 28783193 DOI: 10.1039/c7mb00230k] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
The emergence of new protein-coding genes in a specific lineage or species provides raw materials for evolutionary adaptations. Until recently, the biology of new genes emerging particularly from non-genic sequences remained unexplored. Although the new genes are subjected to variable selection pressure and face rapid deletion, some of them become functional and are retained in the gene pool. To acquire functional novelties, new genes often get integrated into the pre-existing ancestral networks. However, the mechanism by which young proteins acquire novel interactions remains unanswered till date. Since structural orientation contributes hugely to the mode of proteins' physical interactions, in this regard, we put forward an interesting question - Do new genes encode proteins with stable folds? Addressing the question, we demonstrated that the intrinsic disorder inversely correlates with the evolutionary gene ages - i.e. young proteins are richer in intrinsic disorder than the ancient ones. We further noted that young proteins, which are initially poorly connected hubs, prefer to be structurally more disordered than well-connected ancient proteins. The phenomenon strikingly defies the usual trend of well-connected proteins being highly disordered in structure. We justified that structural disorder might help poorly connected young proteins to undergo promiscuous interactions, which provides the foundation for novel protein interactions. The study focuses on the evolutionary perspectives of young proteins in the light of structural adaptations.
Affiliation(s)
- Sanghita Banerjee
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
66
Schumacher J, Herlyn H. Correlates of evolutionary rates in the murine sperm proteome. BMC Evol Biol 2018; 18:35. [PMID: 29580206 PMCID: PMC5870804 DOI: 10.1186/s12862-018-1157-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/14/2017] [Accepted: 03/19/2018] [Indexed: 01/20/2023]
Abstract
Background: Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins. Results: Using partial rank correlations we detected several major rate indicators: Phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes’ evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class. Conclusions: We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.
Affiliation(s)
- Julia Schumacher
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany
- Holger Herlyn
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany
67
Dabbagh N, Preisfeld A. Intrageneric Variability Between the Chloroplast Genomes of Trachelomonas grandis and Trachelomonas volvocina and Phylogenomic Analysis of Phototrophic Euglenoids. J Eukaryot Microbiol 2018; 65:648-660. [PMID: 29418041 DOI: 10.1111/jeu.12510] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/06/2017] [Revised: 01/09/2018] [Accepted: 01/25/2018] [Indexed: 11/29/2022]
Abstract
The latest studies of chloroplast genomes of phototrophic euglenoids yielded different results according to intrageneric variability such as cluster arrangement or diversity of introns. Although the genera Euglena and Monomorphina in those studies show high syntenic arrangements at the intrageneric level, the two investigated Eutreptiella species comprise low synteny. Furthermore Trachelomonas volvocina show low synteny to the chloroplast genomes of the sister genera Monomorphina aenigmatica, M. parapyrum, Cryptoglena skujae, Euglenaria anabaena, Strombomonas acuminata, all of which were highly syntenic. Consequently, this study aims at the analysis of the cpGenome of Trachelomonas grandis and a comparative examination of T. volvocina to investigate whether the cpGenomes are of such resemblance as could be expected for a genus within the Euglenaceae. Although these analyses resulted in almost identical gene content to other Euglenaceae, the chloroplast genome showed significant novelties: In the rRNA operon, we detected group II introns, not yet found in any other cpGenome of Euglenaceae and a substantially heterogeneous cluster arrangement in the genus Trachelomonas. The phylogenomic analysis with 84 genes of 19 phototrophic euglenoids and 18 cpGenome sequences from Chlorophyta and Streptophyta resulted in a well-supported cpGenome phylogeny, which is in accordance to former phylogenetic analyses.
Affiliation(s)
- Nadja Dabbagh
- Faculty of Mathematics and Natural Sciences, Zoology and Didactics of Biology, Bergische University Wuppertal, Gaussstraße 20, 42119 Wuppertal, Germany
- Angelika Preisfeld
- Faculty of Mathematics and Natural Sciences, Zoology and Didactics of Biology, Bergische University Wuppertal, Gaussstraße 20, 42119 Wuppertal, Germany
68
Hanada K, Tezuka A, Nozawa M, Suzuki Y, Sugano S, Nagano AJ, Ito M, Morinaga SI. Functional divergence of duplicate genes several million years after gene duplication in Arabidopsis. DNA Res 2018; 25:4898128. [PMID: 29481587 PMCID: PMC6014284 DOI: 10.1093/dnares/dsy005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/25/2017] [Accepted: 02/02/2018] [Indexed: 12/02/2022]
Abstract
Lineage-specific duplicated genes likely contribute to the phenotypic divergence in closely related species. However, neither the frequency of duplication events nor the degree of selection pressures immediately after gene duplication is clear in the speciation process. Here, using Illumina DNA-sequencing reads from Arabidopsis halleri, which has multiple closely related species with high-quality genome assemblies (A. thaliana and A. lyrata), we succeeded in generating orthologous gene groups in Brassicaceae. The duplication frequency of retained genes in the Arabidopsis lineage was ∼10 times higher than the duplication frequency inferred by comparative genomics of Arabidopsis, poplar, rice and moss (Physcomitrella patens). The difference of duplication frequencies can be explained by a rapid decay of anciently duplicated genes. To examine the degree of selection pressure on genes duplicated in either the A. halleri-lyrata or the A. halleri lineage, we examined positive and purifying selection in the A. halleri-lyrata and A. halleri lineages throughout the ratios of nonsynonymous to synonymous substitution rates (KA/KS). Duplicate genes tended to have a higher proportion of positive selection compared with non-duplicated genes. Interestingly, we found that functional divergence of duplicated genes was accelerated several million years after gene duplication compared with immediately after gene duplication.
Affiliation(s)
- Kousuke Hanada
- Department of Bioscience and Bioinformatics, Frontier Research Academy for Young Researchers, Kyusyu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
- RIKEN Center for Sustainable Resource Science, RIKEN, Yokohama, Kanagawa 230-0045, Japan
- CREST, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan
- Ayumi Tezuka
- Department of Bioscience and Bioinformatics, Frontier Research Academy for Young Researchers, Kyusyu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
- Masafumi Nozawa
- Center for Information Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Department of Genetics, SOKENDAI, Mishima, Shizuoka 411-8540, Japan
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Tokyo 192-0397, Japan
- Yutaka Suzuki
- Graduate School of Frontier Science, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
- Sumio Sugano
- Graduate School of Frontier Science, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan
- Atsushi J Nagano
- CREST, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan
- Center of Ecological Research, Kyoto University, Hirano, Otsu, Shiga 520-2113, Japan
- Motomi Ito
- Graduate School of Arts and Sciences, The University of Tokyo, Tokyo 153-8902, Japan
- Shin-Ichi Morinaga
- CREST, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan
- Graduate School of Arts and Sciences, The University of Tokyo, Tokyo 153-8902, Japan
- College of Bioresource Sciences, Nihon University, Fujisawa, Kanagawa 252-0880, Japan
69
Marnetto D, Mantica F, Molineris I, Grassi E, Pesando I, Provero P. Evolutionary Rewiring of Human Regulatory Networks by Waves of Genome Expansion. Am J Hum Genet 2018; 102:207-218. [PMID: 29357977 DOI: 10.1016/j.ajhg.2017.12.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/17/2017] [Accepted: 12/15/2017] [Indexed: 01/09/2023]
Abstract
Genome expansion is believed to be an important driver of the evolution of gene regulation. To investigate the role of a newly arising sequence in rewiring regulatory networks, we estimated the age of each region of the human genome by applying maximum parsimony to genome-wide alignments with 100 vertebrates. We then studied the age distribution of several types of functional regions, with a focus on regulatory elements. The age distribution of regulatory elements reveals the extensive use of newly formed genomic sequence in the evolution of regulatory interactions. Many transcription factors have expanded their repertoire of targets through waves of genomic expansions that can be traced to specific evolutionary times. Repeated elements contributed a major part of such expansion: many classes of such elements are enriched in binding sites of one or a few specific transcription factors, whose binding sites are localized in specific portions of the element and characterized by distinctive motif words. These features suggest that the binding sites were available as soon as the new sequence entered the genome, rather than being created later by accumulation of point mutations. By comparing the age of regulatory regions to the evolutionary shift in expression of nearby genes, we show that rewiring through genome expansion played an important role in shaping human regulatory networks.
70
Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza. Nat Genet 2018; 50:285-296. [DOI: 10.1038/s41588-018-0040-0] [Citation(s) in RCA: 289] [Impact Index Per Article: 48.2] [Received: 10/29/2017] [Accepted: 12/18/2017] [Indexed: 11/08/2022]
71
Barroso GV, Puzovic N, Dutheil JY. The Evolution of Gene-Specific Transcriptional Noise Is Driven by Selection at the Pathway Level. Genetics 2018; 208:173-189. [PMID: 29097405 PMCID: PMC5753856 DOI: 10.1534/genetics.117.300467] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 05/24/2017] [Accepted: 10/13/2017] [Indexed: 11/18/2022]
Abstract
Biochemical reactions within individual cells result from the interactions of molecules, typically in small numbers. Consequently, the inherent stochasticity of binding and diffusion processes generates noise along the cascade that leads to the synthesis of a protein from its encoding gene. As a result, isogenic cell populations display phenotypic variability even in homogeneous environments. The extent and consequences of this stochastic gene expression have only recently been assessed on a genome-wide scale, owing, in particular, to the advent of single-cell transcriptomics. However, the evolutionary forces shaping this stochasticity have yet to be unraveled. Here, we take advantage of two recently published data sets for the single-cell transcriptome of the domestic mouse Mus musculus to characterize the effect of natural selection on gene-specific transcriptional stochasticity. We show that noise levels in the mRNA distributions (also known as transcriptional noise) significantly correlate with three-dimensional nuclear domain organization, evolutionary constraints on the encoded protein, and gene age. However, the position of the encoded protein in a biological pathway is the main factor that explains observed levels of transcriptional noise, in agreement with models of noise propagation within gene networks. Because transcriptional noise is under widespread selection, we argue that it constitutes an important component of the phenotype and that variance of expression is a potential target of adaptation. Stochastic gene expression should therefore be considered together with the mean expression level in functional and evolutionary studies of gene expression.
Affiliation(s)
- Gustavo Valadares Barroso
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Natasa Puzovic
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Julien Y Dutheil
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Unité mixte de recherche 5554, Institut des Sciences de l'Évolution, Université de Montpellier, 34095, France
72
Popadin K, Peischl S, Garieri M, Sailani MR, Letourneau A, Santoni F, Lukowski SW, Bazykin GA, Nikolaev S, Meyer D, Excoffier L, Reymond A, Antonarakis SE. Slightly deleterious genomic variants and transcriptome perturbations in Down syndrome embryonic selection. Genome Res 2017; 28:1-10. [PMID: 29237728 PMCID: PMC5749173 DOI: 10.1101/gr.228411.117] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/29/2017] [Accepted: 11/20/2017] [Indexed: 12/13/2022]
Abstract
The majority of aneuploid fetuses are spontaneously miscarried. Nevertheless, some aneuploid individuals survive despite the strong genetic insult. Here, we investigate if the survival probability of aneuploid fetuses is affected by the genome-wide burden of slightly deleterious variants. We analyzed two cohorts of live-born Down syndrome individuals (388 genotyped samples and 16 fibroblast transcriptomes) and observed a deficit of slightly deleterious variants on Chromosome 21 and decreased transcriptome-wide variation in the expression level of highly constrained genes. We interpret these results as signatures of embryonic selection, and propose a genetic handicap model whereby an individual bearing an extremely severe deleterious variant (such as aneuploidy) could escape embryonic lethality if the genome-wide burden of slightly deleterious variants is sufficiently low. This approach can be used to study the composition and effect of the numerous slightly deleterious variants in humans and model organisms.
Affiliation(s)
- Konstantin Popadin
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland; Center for Integrative Genomics, University of Lausanne, CH-1015 Lausanne, Switzerland; Immanuel Kant Baltic Federal University, Kaliningrad, 236041, Russia; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Stephan Peischl
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Interfaculty Bioinformatics Unit, University of Bern, 3012 Bern, Switzerland
- Marco Garieri
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- M Reza Sailani
- Stanford School of Medicine, Stanford University, Stanford, California 94305, USA
- Audrey Letourneau
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Federico Santoni
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Samuel W Lukowski
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Georgii A Bazykin
- Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences, Moscow, 127051, Russia; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, Skolkovo, 143026, Russia
- Sergey Nikolaev
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Diogo Meyer
- Department of Genetics and Evolutionary Biology, University of Sao Paulo, 05508-090, Sao Paulo, Brazil
- Laurent Excoffier
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Institute for Ecology and Evolution, University of Bern, CH-3012 Bern, Switzerland
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, CH-1015 Lausanne, Switzerland
- Stylianos E Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
73
Begum T, Ghosh TC, Basak S. Systematic Analyses and Prediction of Human Drug Side Effect Associated Proteins from the Perspective of Protein Evolution. Genome Biol Evol 2017; 9:337-350. [PMID: 28391292 PMCID: PMC5499873 DOI: 10.1093/gbe/evw301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Accepted: 02/16/2017] [Indexed: 12/20/2022]
Abstract
Identification of various factors involved in adverse drug reactions in target proteins to develop therapeutic drugs with minimal/no side effect is very important. In this context, we have performed a comparative evolutionary rate analyses between the genes exhibiting drug side-effect(s) (SET) and genes showing no side effect (NSET) with an aim to increase the prediction accuracy of SET/NSET proteins using evolutionary rate determinants. We found that SET proteins are more conserved than the NSET proteins. The rates of evolution between SET and NSET protein primarily depend upon their noncomplex (protein complex association number = 0) forming nature, phylogenetic age, multifunctionality, membrane localization, and transmembrane helix content irrespective of their essentiality, total druggability (total number of drugs/target), m-RNA expression level, and tissue expression breadth. We also introduced two novel terms—killer druggability (number of drugs with killing side effect(s)/target), essential druggability (number of drugs targeting essential proteins/target) to explain the evolutionary rate variation between SET and NSET proteins. Interestingly, we noticed that SET proteins are younger than NSET proteins and multifunctional younger SET proteins are candidates of acquiring killing side effects. We provide evidence that higher killer druggability, multifunctionality, and transmembrane helices support the conservation of SET proteins over NSET proteins in spite of their recent origin. By employing all these entities, our Support Vector Machine model predicts human SET/NSET proteins to a high degree of accuracy (∼86%).
Affiliation(s)
- Tina Begum
- Bioinformatics Centre, Tripura University, Suryamaninagar, Tripura, India
- Surajit Basak
- Bioinformatics Centre, Tripura University, Suryamaninagar, Tripura, India; Department of Molecular Biology & Bioinformatics, Tripura University, Suryamaninagar, Tripura, India
74
Wang T, Tang H. The physical characteristics of human proteins in different biological functions. PLoS One 2017; 12:e0176234. [PMID: 28459865 PMCID: PMC5411090 DOI: 10.1371/journal.pone.0176234] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/14/2016] [Accepted: 04/08/2017] [Indexed: 01/24/2023]
Abstract
The physical properties of gene products are the foundation of their biological functions. In this study, we systematically explored relationships between physical properties and biological functions. The physical properties include origin time, evolution pressure, mRNA and protein stability, molecular weight, hydrophobicity, acidity/alkalinity, amino acid compositions, and chromosome location. The biological functions are defined from 4 aspects: biological process, molecular function, cellular component and cell/tissue/organ expression. We found that the proteins associated with basic material and energy metabolism process originated earlier, while the proteins associated with immune, neurological system process etc. originated later. Tissues may have a strong influence on evolution pressure. The proteins associated with energy metabolism are double-stable. Immune and peripheral cell proteins tend to be mRNA stable/protein unstable. There are very few function items with double-unstable of mRNA and protein. The proteins involved in the cell adhesion tend to consist of large proteins with high proportion of small amino acids. The proteins of organic acid transport, neurological system process and amine transport have significantly high hydrophobicity. Interestingly, the proteins involved in olfactory receptor activity tend to have high frequency of aromatic, sulfuric and hydroxyl amino acids.
Affiliation(s)
- Tengjiao Wang
- Department of Bioinformatics, Second Military Medical University, Shanghai, P.R. China
- Hailin Tang
- Department of Biological Biodefense (Microbiology), Faculty of Tropical Medicine and Public Health, Second Military Medical University, Shanghai Key Laboratory of Medical Biodefense, Shanghai, P.R. China
75
França GS, Hinske LC, Galante PAF, Vibranovski MD. Unveiling the Impact of the Genomic Architecture on the Evolution of Vertebrate microRNAs. Front Genet 2017; 8:34. [PMID: 28377786 PMCID: PMC5359303 DOI: 10.3389/fgene.2017.00034] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/09/2016] [Accepted: 03/09/2017] [Indexed: 12/12/2022]
Abstract
Eukaryotic genomes frequently exhibit interdependency between transcriptional units, as evidenced by regions of high gene density. It is well recognized that vertebrate microRNAs (miRNAs) are usually embedded in those regions. Recent work has shown that the genomic context is of utmost importance to determine miRNA expression in time and space, thus affecting their evolutionary fates over long and short terms. Consequently, understanding the inter- and intraspecific changes on miRNA genomic architecture may bring novel insights on the basic cellular processes regulated by miRNAs, as well as phenotypic evolution and disease-related mechanisms.
Affiliation(s)
- Gustavo S França
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, Brazil
- Ludwig C Hinske
- Department of Anesthesiology, Clinic of the University of Munich, Ludwig Maximilian University of Munich, Munich, Germany
- Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
- Maria D Vibranovski
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, São Paulo, Brazil
76
Badet T, Peyraud R, Mbengue M, Navaud O, Derbyshire M, Oliver RP, Barbacci A, Raffaele S. Codon optimization underpins generalist parasitism in fungi. eLife 2017; 6:e22472. [PMID: 28157073 PMCID: PMC5315462 DOI: 10.7554/elife.22472] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 10/18/2016] [Accepted: 01/28/2017] [Indexed: 01/04/2023]
Abstract
The range of hosts that parasites can infect is a key determinant of the emergence and spread of disease. Yet, the impact of host range variation on the evolution of parasite genomes remains unknown. Here, we show that codon optimization underlies genome adaptation in broad host range parasites. We found that the longer proteins encoded by broad host range fungi likely increase natural selection on codon optimization in these species. Accordingly, codon optimization correlates with host range across the fungal kingdom. At the species level, biased patterns of synonymous substitutions underpin increased codon optimization in a generalist but not a specialist fungal pathogen. Virulence genes were consistently enriched in highly codon-optimized genes of generalist but not specialist species. We conclude that codon optimization is related to the capacity of parasites to colonize multiple hosts. Our results link genome evolution and translational regulation to the long-term persistence of generalist parasitism.
Affiliation(s)
- Thomas Badet
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Remi Peyraud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Malick Mbengue
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Olivier Navaud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Mark Derbyshire
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Richard P Oliver
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
- Adelin Barbacci
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Sylvain Raffaele
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
77
Two fundamentally different classes of microbial genes. Nat Microbiol 2016; 2:16208. [PMID: 27819663 DOI: 10.1038/nmicrobiol.2016.208] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 06/02/2016] [Accepted: 09/20/2016] [Indexed: 01/15/2023]
Abstract
The evolution of bacterial and archaeal genomes is highly dynamic and involves extensive horizontal gene transfer and gene loss [1-4]. Furthermore, many microbial species appear to have open pangenomes, where each newly sequenced genome contains more than 10% ORFans, that is, genes without detectable homologues in other species [5,6]. Here, we report a quantitative analysis of microbial genome evolution by fitting the parameters of a simple, steady-state evolutionary model to the comparative genomic data on the gene content and gene order similarity between archaeal genomes. The results reveal two sharply distinct classes of microbial genes, one of which is characterized by effectively instantaneous gene replacement, and the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of the size of the prokaryotic genomic universe, which appears to consist of at least a billion distinct genes. Furthermore, the same distribution of constraints is shown to govern the evolution of gene complement and gene order, without the need to invoke long-range conservation or the selfish operon concept [7].
|
78
|
Lopes KDP, Campos-Laborie FJ, Vialle RA, Ortega JM, De Las Rivas J. Evolutionary hallmarks of the human proteome: chasing the age and coregulation of protein-coding genes. BMC Genomics 2016; 17:725. [PMID: 27801289 PMCID: PMC5088522 DOI: 10.1186/s12864-016-3062-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023] Open
Abstract
Background: The development of large-scale technologies for quantitative transcriptomics has enabled comprehensive analysis of the gene expression profiles in complete genomes. RNA-Seq allows the measurement of gene expression levels in a manner far more precise and global than previous methods. Studies using this technology are altering our view of the extent and complexity of eukaryotic transcriptomes. In this respect, multiple efforts have been made to determine and analyse the gene expression patterns of human cell types in different conditions, in either normal or pathological states. However, until recently, little has been reported about the evolutionary marks present in human protein-coding genes, particularly from the combined perspective of gene expression and protein evolution. Results: We present a combined analysis of human protein-coding gene expression profiling and time-scale ancestry mapping that places the genes in taxonomic clades and reveals eight major evolutionary steps (“hallmarks”) that include clusters of functionally coherent proteins. The human expressed genes are analysed using an RNA-Seq dataset of 116 samples from 32 tissues. The evolutionary analysis of the human proteins is performed by combining information from: (i) a database of orthologous proteins (OMA), (ii) the taxonomy mapping of genes to lineage clades (from NCBI Taxonomy) and (iii) the evolutionary time-scale mapping provided by TimeTree (Timescale of Life). The human protein-coding genes are also placed in a relational context based on the construction of a robust gene coexpression network, which reveals tighter links between age-related protein-coding genes and finds functionally coherent gene modules. Conclusions: Understanding the relational landscape of the human protein-coding genes is essential for interpreting the functional elements and modules of our active genome. Moreover, decoding the evolutionary history of the human genes can provide valuable information about their origin and function.
Affiliation(s)
- Katia de Paiva Lopes
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain; Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
| | - Francisco José Campos-Laborie
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain
| | - Ricardo Assunção Vialle
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
| | - José Miguel Ortega
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas (ICB), Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brasil
| | - Javier De Las Rivas
- Bioinformatics and Functional Genomics Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL), Consejo Superior de Investigaciones Cientificas (CSIC), Salamanca, Spain
| |
|
79
|
Abstract
As genes originate at different evolutionary times, they harbor distinctive genomic signatures of evolutionary ages. Although previous studies have investigated different gene age-related signatures, what signatures dominantly associate with gene age remains unresolved. Here we address this question via a combined approach of comprehensive assignment of gene ages, gene family identification, and multivariate analyses. We first provide a comprehensive and improved gene age assignment by combining homolog clustering with phylogeny inference and categorize human genes into 26 age classes spanning the whole tree of life. We then explore the dominant age-related signatures based on a collection of 10 potential signatures (including gene composition, gene length, selection pressure, expression level, connectivity in protein–protein interaction network and DNA methylation). Our results show that GC content and connectivity in protein–protein interaction network (PPIN) associate dominantly with gene age. Furthermore, we investigate the heterogeneity of dominant signatures in duplicates and singletons. We find that GC content is a consistent primary factor of gene age in duplicates and singletons, whereas PPIN is more strongly associated with gene age in singletons than in duplicates. Taken together, GC content and PPIN are two dominant signatures in close association with gene age, exhibiting heterogeneity in duplicates and singletons and presumably reflecting complex differential interplays between natural selection and mutation.
Affiliation(s)
- Hongyan Yin
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Guangyu Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Lina Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
| | - Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta
| | - Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| |
|
80
|
Wei W, Jin YT, Du MZ, Wang J, Rao N, Guo FB. Genomic Complexity Places Less Restrictions on the Evolution of Young Coexpression Networks than Protein-Protein Interactions. Genome Biol Evol 2016; 8:2624-31. [PMID: 27521813 PMCID: PMC5010916 DOI: 10.1093/gbe/evw198] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022] Open
Abstract
The differences in evolutionary patterns of young protein–protein interactions (PPIs) among distinct species have long been a puzzle. However, based on our genome-wide analysis of available integrated experimental data, we confirm that young genes preferentially integrate into ancestral PPI networks, and that this pattern is consistent in all six model organisms studied, which differ widely in phenotypic complexity. We demonstrate that the level of restrictions placed on the evolution of biological networks declines with a decrease of phenotypic complexity. Compared with young PPI networks, new co-expression links have fewer evolutionary restrictions, so a young gene with a high possibility of being coexpressed with other young genes emerges relatively frequently in the four simpler genomes among the six studied. However, such young–young coexpression is unfavorable for a young gene evolving into a coexpression hub, so this coexpression pattern could gradually decline. To explain this apparent contradiction, we suggest that young genes that are initially peripheral to networks are temporarily coexpressed with other young genes, driving functional evolution because of low selective pressure. However, as the expression levels of genes increase and they gradually develop a greater effect on fitness, young genes start to be coexpressed more with members of ancestral networks and less with other young genes. Our findings provide new insights into the evolution of biological networks.
Affiliation(s)
- Wen Wei
- School of Life Sciences, Chongqing University, Chongqing, China; School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Yan-Ting Jin
- Key Laboratory for Neuroinformation of the Ministry of Education, Center of Bioinformatics, University of Electronic Science and Technology of China, Chengdu, China; Center for Information in Biomedicine, University of Electronic Science and Technology of China, Chengdu, China
| | - Meng-Ze Du
- Key Laboratory for Neuroinformation of the Ministry of Education, Center of Bioinformatics, University of Electronic Science and Technology of China, Chengdu, China; Center for Information in Biomedicine, University of Electronic Science and Technology of China, Chengdu, China
| | - Ju Wang
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Nini Rao
- Key Laboratory for Neuroinformation of the Ministry of Education, Center of Bioinformatics, University of Electronic Science and Technology of China, Chengdu, China; Center for Information in Biomedicine, University of Electronic Science and Technology of China, Chengdu, China
| | - Feng-Biao Guo
- Key Laboratory for Neuroinformation of the Ministry of Education, Center of Bioinformatics, University of Electronic Science and Technology of China, Chengdu, China; Center for Information in Biomedicine, University of Electronic Science and Technology of China, Chengdu, China
| |
|
81
|
Saber MM, Adeyemi Babarinde I, Hettiarachchi N, Saitou N. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences. Genome Biol Evol 2016; 8:2076-92. [PMID: 27289096 PMCID: PMC4987104 DOI: 10.1093/gbe/evw132] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/18/2022] Open
Abstract
Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions.
Affiliation(s)
- Morteza Mahmoudi Saber
- Department of Biological Sciences, Graduate School of Science, University of Tokyo; Division of Population Genetics, National Institute of Genetics, Mishima, Japan
| | - Isaac Adeyemi Babarinde
- Division of Population Genetics, National Institute of Genetics, Mishima, Japan; Department of Genetics, School of Life Science, Graduate University for Advanced Studies (SOKENDAI), Mishima, Japan
| | - Nilmini Hettiarachchi
- Division of Population Genetics, National Institute of Genetics, Mishima, Japan; Department of Genetics, School of Life Science, Graduate University for Advanced Studies (SOKENDAI), Mishima, Japan
| | - Naruya Saitou
- Department of Biological Sciences, Graduate School of Science, University of Tokyo; Division of Population Genetics, National Institute of Genetics, Mishima, Japan; Department of Genetics, School of Life Science, Graduate University for Advanced Studies (SOKENDAI), Mishima, Japan
| |
|
82
|
Abstract
Correctly estimating the age of a gene or gene family is important for a variety of fields, including molecular evolution, comparative genomics, and phylogenetics, and increasingly for systems biology and disease genetics. However, most studies use only a point estimate of a gene’s age, neglecting the substantial uncertainty involved in this estimation. Here, we characterize this uncertainty by investigating the effect of algorithm choice on gene-age inference and calculate consensus gene ages with attendant error distributions for a variety of model eukaryotes. We use 13 orthology inference algorithms to create gene-age datasets and then characterize the error around each age-call on a per-gene and per-algorithm basis. Systematic error was found to be a large factor in estimating gene age, suggesting that simple consensus algorithms are not enough to give a reliable point estimate. We also found that different sources of error can affect downstream analyses, such as gene ontology enrichment. Our consensus gene-age datasets, with associated error terms, are made fully available at geneages.org so that researchers can propagate this uncertainty through their analyses.
Affiliation(s)
- Benjamin J Liebeskind
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin; Center for Computational Biology and Bioinformatics, University of Texas at Austin
| | - Claire D McWhite
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin
| | - Edward M Marcotte
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin
| |
|
83
|
Boukouris AE, Zervopoulos SD, Michelakis ED. Metabolic Enzymes Moonlighting in the Nucleus: Metabolic Regulation of Gene Transcription. Trends Biochem Sci 2016; 41:712-730. [PMID: 27345518 DOI: 10.1016/j.tibs.2016.05.013] [Citation(s) in RCA: 191] [Impact Index Per Article: 23.9] [Received: 12/29/2015] [Revised: 04/30/2016] [Accepted: 05/25/2016] [Indexed: 12/15/2022]
Abstract
During evolution, cells acquired the ability to sense and adapt to varying environmental conditions, particularly in terms of fuel supply. Adaptation to fuel availability is crucial for major cell decisions and requires metabolic alterations and differential gene expression that are often epigenetically driven. A new mechanistic link between metabolic flux and regulation of gene expression is through moonlighting of metabolic enzymes in the nucleus. This facilitates delivery of membrane-impermeable or unstable metabolites to the nucleus, including key substrates for epigenetic mechanisms such as acetyl-CoA which is used in histone acetylation. This metabolism-epigenetics axis facilitates adaptation to a changing environment in normal (e.g., development, stem cell differentiation) and disease states (e.g., cancer), providing a potential novel therapeutic target.
|
84
|
Yin H, Ma L, Wang G, Li M, Zhang Z. Old genes experience stronger translational selection than young genes. Gene 2016; 590:29-34. [PMID: 27259662 DOI: 10.1016/j.gene.2016.05.041] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/17/2016] [Revised: 05/27/2016] [Accepted: 05/29/2016] [Indexed: 12/12/2022]
Abstract
Selection on synonymous codon usage for translation efficiency and/or accuracy has been identified as a widespread mechanism in many living organisms. However, it remains unknown whether translational selection associates closely with gene age and acts differentially on genes of different evolutionary ages. To address this issue, we investigate the strength of translational selection acting on human genes of different ages. Our results show that old genes present stronger translational selection than young genes, demonstrating that translational selection correlates positively with gene age. We further explore the difference in translational selection between duplicates and singletons and between housekeeping and tissue-specific genes. We find that translational selection acts comparably in old singletons and old duplicates, and that the stronger translational selection in old genes is contributed primarily by housekeeping genes. For young genes, in contrast, singletons experience stronger translational selection than duplicates, presumably due to the redundant function of duplicated genes during their early evolutionary stage. Taken together, our results indicate that translational selection acting on a gene is not constant across all stages of evolution but associates closely with gene age.
Affiliation(s)
- Hongyan Yin
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Lina Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing 100101, China
| | - Guangyu Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Mengwei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing 100101, China
| |
|
85
|
Zhang Q, Nogales-Cadenas R, Lin JR, Zhang W, Cai Y, Vijg J, Zhang ZD. Systems-level analysis of human aging genes shed new light on mechanisms of aging. Hum Mol Genet 2016; 25:2934-2947. [PMID: 27179790 DOI: 10.1093/hmg/ddw145] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/07/2016] [Revised: 04/07/2016] [Accepted: 05/09/2016] [Indexed: 11/13/2022] Open
Abstract
Although studies over the last decades have firmly connected a number of genes and molecular pathways to aging, the aging process as a whole still remains poorly understood. To gain novel insights into the mechanisms underlying aging, instead of considering aging genes individually, we studied their characteristics at the systems level in the context of biological networks. We calculated a comprehensive set of network characteristics for human aging-related genes from the GenAge database. By comparing them with other functional groups of genes, we identified a robust group of aging-specific network characteristics. To find the structural basis and the molecular mechanisms underlying this aging-related network specificity, we also analyzed protein domain interactions and gene expression patterns across different tissues. Our study revealed that aging genes not only tend to be network hubs, playing important roles in communication among different functional modules or pathways, but also are more likely to physically interact and be co-expressed with essential genes. The high expression of aging genes across a large number of tissue types also points to a high level of connectivity among aging genes. Unexpectedly, contrary to the depletion of interactions among hub genes in biological networks, we observed close interactions among aging hubs, which render the aging subnetworks vulnerable to random attacks and thus may contribute to the aging process. Comparison across species reveals the evolutionary process of the aging subnetwork. As organisms become more complex, the complexity of their aging mechanisms increases and their aging hub genes become more functionally connected.
Affiliation(s)
- Jan Vijg
- Department of Genetics; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, USA
| |
|
86
|
França GS, Vibranovski MD, Galante PAF. Host gene constraints and genomic context impact the expression and evolution of human microRNAs. Nat Commun 2016; 7:11438. [PMID: 27109497 PMCID: PMC4848552 DOI: 10.1038/ncomms11438] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/11/2015] [Accepted: 03/25/2016] [Indexed: 12/16/2022] Open
Abstract
Increasing evidence has shown that recent miRNAs tend to emerge within coding genes. Here we conjecture that human miRNA evolution is tightly influenced by the genomic context, especially by host genes. Our findings show a preferential emergence of intragenic miRNAs within old genes. We found that miRNAs within old host genes are significantly more broadly expressed than those within young ones. Young miRNAs within old genes are more broadly expressed than their intergenic counterparts, suggesting that young miRNAs have an initial advantage by residing in old genes, and benefit from their hosts' expression control and from the exposure to diverse cellular contexts and target genes. Our results demonstrate that host genes may provide stronger expression constraints to intragenic miRNAs in the long run. We also report associated functional implications, highlighting the genomic context and host genes as driving factors for the expression and evolution of human miRNAs. Recent miRNAs tend to emerge within coding genes. Here, by analysing miRNA expression data from six species and comparing genomes from 13 species, the authors report that host genes may provide stronger expression constraints to intragenic miRNAs in the long run.
Affiliation(s)
- Gustavo S França
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, Rua Daher Cutait 69, 01308-060 São Paulo, Brazil; Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, Av. Prof. Lineu Prestes 748, 05508-000 São Paulo, Brazil
| | - Maria D Vibranovski
- Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, Rua do Matao 277, 05508-090 São Paulo, Brazil
| | - Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, Rua Daher Cutait 69, 01308-060 São Paulo, Brazil
| |
|
87
|
Leenen FAD, Vernocchi S, Hunewald OE, Schmitz S, Molitor AM, Muller CP, Turner JD. Where does transcription start? 5'-RACE adapted to next-generation sequencing. Nucleic Acids Res 2016; 44:2628-45. [PMID: 26615195 PMCID: PMC4824077 DOI: 10.1093/nar/gkv1328] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 04/15/2015] [Revised: 11/11/2015] [Accepted: 11/13/2015] [Indexed: 01/27/2023] Open
Abstract
The variability and complexity of the transcription initiation process was examined by adapting RNA ligase-mediated rapid amplification of 5' cDNA ends (5'-RACE) to Next-Generation Sequencing (NGS). We oligo-labelled 5'-m(7)G-capped mRNA from two genes, the simple mono-exonic Beta-2-Adrenoceptor (ADRB2R) and the complex multi-exonic Glucocorticoid Receptor (GR, NR3C1), and detected a variability in TSS location that has received little attention up to now. Transcription was not initiated at a fixed TSS, but from loci of 4 to 10 adjacent nucleotides. Individual TSSs had frequencies from <0.001% to 38.5% of the total gene-specific 5' m(7)G-capped transcripts. ADRB2R used a single locus consisting of 4 adjacent TSSs. Unstimulated, the GR used a total of 358 TSSs distributed throughout 38 loci that were principally in the 5' UTRs and were spliced using established donor and acceptor sites. Complete demethylation of the epigenetically sensitive GR promoter with 5-azacytidine induced one new locus and 127 TSSs, 12 of which were unique. We induced GR transcription with dexamethasone and Interferon-γ, adding one new locus and 185 additional TSSs distributed throughout the promoter region. In vitro, the TSS microvariability regulated mRNA translation efficiency and the relative abundance of the different GR N-terminal protein isoforms.
Affiliation(s)
- Fleur A D Leenen
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg; Department of Immunology, Research Institute of Psychobiology, University of Trier, Trier D-54290, Germany
| | - Sara Vernocchi
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg; Department of Immunology, Research Institute of Psychobiology, University of Trier, Trier D-54290, Germany
| | - Oliver E Hunewald
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg
| | - Stephanie Schmitz
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg
| | - Anne M Molitor
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg
| | - Claude P Muller
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg; Department of Immunology, Research Institute of Psychobiology, University of Trier, Trier D-54290, Germany
| | - Jonathan D Turner
- Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette L-4354, Grand-Duchy of Luxembourg
| |
|
88
|
Singh A, Jethva M, Singla-Pareek SL, Pareek A, Kushwaha HR. Analyses of Old "Prokaryotic" Proteins Indicate Functional Diversification in Arabidopsis and Oryza sativa. Frontiers in Plant Science 2016; 7:304. [PMID: 27014324 PMCID: PMC4792156 DOI: 10.3389/fpls.2016.00304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2015] [Accepted: 02/26/2016] [Indexed: 06/05/2023]
Abstract
During evolution, processes such as duplication, divergence, recombination, and many other events lead to the evolution of new genes with novel functions. These evolutionary events thus significantly impact the evolution of cellular, physiological, morphological, and other phenotypic traits of organisms. While evolving, eukaryotes have acquired a large number of genes from earlier prokaryotes. This work focuses on the identification of old "prokaryotic" proteins in the Arabidopsis and Oryza sativa genomes, further highlighting their possible role(s) in the two genomes. Our results suggest that, relative to genome size, the fraction of old "prokaryotic" proteins is higher in Arabidopsis than in Oryza sativa. A large fraction of the genes encoding such proteins was found to be localized in various endosymbiotic organelles. The domain architecture of the old "prokaryotic" proteins revealed a similar distribution in both the Arabidopsis and Oryza sativa genomes, showing their conserved evolution. In Oryza sativa, the old "prokaryotic" proteins were more involved in developmental processes, which might be due to constant man-made selection pressure for better agronomic traits/productivity, whereas in Arabidopsis these proteins were involved in metabolic functions. Overall, the analysis indicates a distinct pattern of evolution of old "prokaryotic" proteins in Arabidopsis and Oryza sativa.
Affiliation(s)
- Anupama Singh
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
| | - Minesh Jethva
- International Center for Genetic Engineering and Biotechnology, New Delhi, India
| | - Sneh L. Singla-Pareek
- Plant Stress Biology, International Center for Genetic Engineering and Biotechnology, New Delhi, India
| | - Ashwani Pareek
- Stress Physiology and Molecular Biology Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
| | - Hemant R. Kushwaha
- International Center for Genetic Engineering and Biotechnology, New Delhi, India
| |
|
89
|
Binet M, Gascuel O, Scornavacca C, Douzery EJP, Pardi F. Fast and accurate branch lengths estimation for phylogenomic trees. BMC Bioinformatics 2016; 17:23. [PMID: 26744021 PMCID: PMC4705742 DOI: 10.1186/s12859-015-0821-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/19/2015] [Accepted: 11/02/2015] [Indexed: 01/26/2023] Open
Abstract
Background: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet part of the current methodology to reconstruct a phylogeny from genomic information, namely supertree methods, focuses on the topology or structure of the phylogenetic tree rather than the evolutionary divergences associated with it. Moreover, accurate methods to estimate branch lengths, typically based on probabilistic analysis of a concatenated alignment, are limited by large demands in memory and computing time, and may become impractical when the data sets are too large. Results: Here, we present a novel phylogenomic distance-based method, named ERaBLE (Evolutionary Rates and Branch Length Estimation), to estimate the branch lengths of a given reference topology and the relative evolutionary rates of the genes employed in the analysis. ERaBLE uses as input data a potentially very large collection of distance matrices, where each matrix is obtained from a different genomic region, either directly from its sequence alignment or indirectly from a gene tree inferred from the alignment. Our experiments show that ERaBLE is very fast and fairly accurate when compared to other possible approaches for the same tasks. Specifically, it efficiently and accurately deals with large data sets, such as the OrthoMaM v8 database, composed of 6,953 exons from up to 40 mammals. Conclusions: ERaBLE may be used as a complement to supertree methods, or as an efficient alternative to maximum likelihood analysis of concatenated alignments, to estimate branch lengths from phylogenomic data sets.
Affiliation(s)
- Manuel Binet
- Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France; Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Olivier Gascuel
- Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
- Celine Scornavacca
- Institut de Biologie Computationnelle, Montpellier, France; Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Emmanuel J P Douzery
- Institut des Sciences de l'Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, France
- Fabio Pardi
- Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS, Université de Montpellier, Montpellier, France; Institut de Biologie Computationnelle, Montpellier, France
90
Glastad KM, Goodisman MAD, Yi SV, Hunt BG. Effects of DNA Methylation and Chromatin State on Rates of Molecular Evolution in Insects. G3 (Bethesda) 2015; 6:357-63. [PMID: 26637432 PMCID: PMC4751555 DOI: 10.1534/g3.115.023499] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/06/2015] [Accepted: 11/30/2015] [Indexed: 01/03/2023]
Abstract
Epigenetic information is widely appreciated for its role in gene regulation in eukaryotic organisms. However, epigenetic information can also influence genome evolution. Here, we investigate the effects of epigenetic information on gene sequence evolution in two disparate insects: the fly Drosophila melanogaster, which lacks substantial DNA methylation, and the ant Camponotus floridanus, which possesses a functional DNA methylation system. We found that DNA methylation was positively correlated with the synonymous substitution rate in C. floridanus, suggesting a key effect of DNA methylation on patterns of gene evolution. However, our data suggest the link between DNA methylation and elevated rates of synonymous substitution was explained, in large part, by the targeting of DNA methylation to genes with signatures of transcriptionally active chromatin, rather than the mutational effect of DNA methylation itself. This phenomenon may be explained by an elevated mutation rate for genes residing in transcriptionally active chromatin, or by increased structural constraints on genes in inactive chromatin. This result highlights the importance of chromatin structure as the primary epigenetic driver of genome evolution in insects. Overall, our study demonstrates how different epigenetic systems contribute to variation in the rates of coding sequence evolution.
Affiliation(s)
- Karl M Glastad
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332
- Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332
- Brendan G Hunt
- Department of Entomology, University of Georgia, Griffin, Georgia 30223
91
Gu X, Tang W. Model parameters of molecular evolution explain genomic correlations. Brief Bioinform 2015; 18:37-42. [PMID: 26628558 DOI: 10.1093/bib/bbv098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2015] [Revised: 10/01/2015] [Indexed: 11/13/2022]
Abstract
One long-standing research focus in evolutionary genomics is resolving how biological variables (expression, essentiality, protein-protein interaction, structural stability, etc.) determine the rate of protein evolution. While these studies have considerably deepened our understanding of molecular evolution, many issues remain unsolved. In this opinion article, after a brief survey of the literature, we establish relationships between model parameters of molecular evolution and genomic variables. On this basis, most observed genomic correlations and confounds can be explained by combinations of model parameters under different conditions; these parameters include the strength of stabilizing selection, mutational variance, expression sufficiency, gene pleiotropy, and the effective population size. We suggest that the problem of discerning which biological variable(s) may determine the rate of protein evolution can be tackled at two levels. The first level, as discussed here, is to demonstrate how the model of molecular evolution can predict potential genomic correlations under various conditions. The second level is to estimate genome-wide variations of model parameters (or combinations), which helps to identify canonical biological variables that may underlie the rate variation among genes, a variation that spans at least three orders of magnitude.
92
Li J, Li R, Wang Y, Hu X, Zhao Y, Li L, Feng C, Gu X, Liang F, Lamont SJ, Hu S, Zhou H, Li N. Genome-wide DNA methylome variation in two genetically distinct chicken lines using MethylC-seq. BMC Genomics 2015; 16:851. [PMID: 26497311 PMCID: PMC4619007 DOI: 10.1186/s12864-015-2098-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 10/13/2014] [Accepted: 10/15/2015] [Indexed: 12/30/2022]
Abstract
Background: DNA cytosine methylation is an important epigenetic modification that has significant effects on a variety of biological processes in animals. Avian species hold a crucial position in evolutionary history. In this study, we used whole-genome bisulfite sequencing (MethylC-seq) to generate single-base methylation profiles of lungs in two genetically distinct and highly inbred chicken lines (Fayoumi and Leghorn) that differ in genetic resistance to multiple pathogens, and we explored the potential regulatory role of DNA methylation associated with immune response differences between the two chicken lines. Methods: MethylC-seq was used to generate single-base DNA methylation profiles of Fayoumi and Leghorn birds. In addition, transcriptome profiling using RNA-seq was obtained from the same chickens and tissues to interrogate how DNA methylation regulates gene transcription on a genome-wide scale. Results: The general DNA methylation pattern across different regions of genes was conserved compared to other species, except for hyper-methylation of repeat elements, which was not observed in chicken. The methylation level of miRNA and pseudogene promoters was high, which indicates that silencing of these genes may be partially due to promoter hyper-methylation. Interestingly, the promoter regions of more recently evolved genes tended to be more highly methylated, whereas the gene body regions of evolutionarily conserved genes were more highly methylated than those of more recently evolved genes. Immune-related GO (Gene Ontology) terms were significantly enriched among genes within the differentially methylated regions (DMR) between Fayoumi and Leghorn, which implicates DNA methylation as one of the regulatory mechanisms modulating immune response differences between these lines. Conclusions: This study establishes a single-base resolution DNA methylation profile of chicken lung and suggests a regulatory role of DNA methylation in controlling gene expression and maintaining genome transcription stability. Furthermore, profiling the DNA methylomes of two genetic lines that differ in disease resistance provides a unique opportunity to investigate the potential role of DNA methylation in host disease resistance. Our study provides a foundation for future studies on epigenetic modulation of host immune response to pathogens in chickens. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2098-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jinxiu Li
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Rujiao Li
- Core Genomic Facility, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Ying Wang
- Department of Animal Science, University of California, Davis, CA, 95616, USA
- Xiaoxiang Hu
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China
- Yiqiang Zhao
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China
- Li Li
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China
- Chungang Feng
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China
- Xiaorong Gu
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China
- Fang Liang
- Core Genomic Facility, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Susan J Lamont
- Department of Animal Science, Iowa State University, Ames, IA, 50011, USA
- Songnian Hu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Huaijun Zhou
- Department of Animal Science, University of California, Davis, CA, 95616, USA; Department of Poultry Science, Texas A&M University, College Station, TX, 77845, USA
- Ning Li
- The State Key Laboratory for Agro-biotechnology, China Agricultural University, Beijing, 100193, China; National Engineering Laboratory for Animal Breeding, China Agricultural University, Beijing, 100193, China; College of Animal Science and Technology, Yunnan Agricultural University, Kunming, Yunnan, China
93
Chain FJJ. Sex-Biased Expression of Young Genes in Silurana (Xenopus) tropicalis. Cytogenet Genome Res 2015; 145:265-77. [PMID: 26065714 DOI: 10.1159/000430942] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Abstract
Sex-biased gene expression can evolve from sex-specific selection and is often associated with sex-linked genes. Gene duplication is a particularly effective mechanism for the generation of sex-biased genes, in which a new copy can help resolve intralocus sexual conflicts. This study assesses sex-biased gene expression in an amphibian with homomorphic ZW sex chromosomes, the Western clawed frog Silurana (Xenopus) tropicalis. Previous work has shown that the sex chromosomes in this species are mainly undifferentiated and pseudoautosomal. Consistent with ongoing recombination between the sex chromosomes, this study detected little evidence for the general sexualization of sex-linked regions. A subset of genes closely linked to the sex-determining locus displays a tendency for male-biased expression and elevated rates of evolution relative to genes in other genomic locations. This may be a symptom of an early stage of sex chromosome differentiation driven by, for example, chromosomal degeneration or natural selection on genes in this portion of the Z chromosome. Alternatively, it could reflect variation between the sexes in allelic copy number coupled with a lack of dosage compensation. Irrespective of the genomic location, lineage-specific genes and recently duplicated genes had significantly high levels of sex-biased expression, offering insights into the early transcriptional differentiation of young genes.
94
Faure G, Koonin EV. Universal distribution of mutational effects on protein stability, uncoupling of protein robustness from sequence evolution and distinct evolutionary modes of prokaryotic and eukaryotic proteins. Phys Biol 2015; 12:035001. [PMID: 25927823 DOI: 10.1088/1478-3975/12/3/035001] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 01/04/2023]
Abstract
Robustness to the destabilizing effects of mutations is thought of as a key factor of protein evolution. The connections between two measures of robustness, the relative core size and the computationally estimated effect of mutations on protein stability (ΔΔG), protein abundance and the selection pressure on protein-coding genes (dN/dS) were analyzed for organisms with a large number of available protein structures, including four eukaryotes, two bacteria and one archaeon. The distribution of the effects of mutations in the core on protein stability is universal and indistinguishable in eukaryotes and bacteria, centered at slightly destabilizing amino acid replacements, and with a heavy tail of more strongly destabilizing replacements. The distribution of mutational effects in the hyperthermophilic archaeon Thermococcus gammatolerans is significantly shifted toward strongly destabilizing replacements, which is indicative of stronger constraints imposed on proteins in hyperthermophiles. The median effect of mutations is strongly, positively correlated with the relative core size, evidence of the congruence between the two measures of protein robustness. However, both measures show only limited correlations with the expression level and selection pressure on protein-coding genes. Thus, the degree of robustness reflected in the universal distribution of mutational effects appears to be a fundamental, ancient feature of globular protein folds, whereas the observed variations are largely neutral and uncoupled from short-term protein evolution. A weak anticorrelation between protein core size and selection pressure is observed only for surface residues in prokaryotes, but a stronger anticorrelation is observed for all residues in eukaryotic proteins. This substantial difference between proteins of prokaryotes and eukaryotes is likely to stem from the demonstrably higher compactness of prokaryotic proteins.
Affiliation(s)
- Guilhem Faure
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
95
Brion C, Pflieger D, Friedrich A, Schacherer J. Evolution of intraspecific transcriptomic landscapes in yeasts. Nucleic Acids Res 2015; 43:4558-68. [PMID: 25897111 PMCID: PMC4482089 DOI: 10.1093/nar/gkv363] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/20/2015] [Accepted: 04/02/2015] [Indexed: 01/13/2023]
Abstract
Variations in gene expression have been widely explored in order to obtain an accurate overview of the changes in regulatory networks that underlie phenotypic diversity. Numerous studies have characterized differences in genomic expression between large numbers of individuals of model organisms such as Saccharomyces cerevisiae. To more broadly survey the evolution of the transcriptomic landscape across species, we used RNAseq to measure whole-genome expression in a large collection of another yeast species, Lachancea kluyveri (formerly Saccharomyces kluyveri). Interestingly, this species diverged from the S. cerevisiae lineage prior to its ancestral whole genome duplication. Moreover, L. kluyveri harbors a chromosome-scale compositional heterogeneity due to a 1-Mb ancestral introgressed region, as well as a large set of unique unannotated genes. In this context, our comparative transcriptomic analysis clearly showed a link between gene evolutionary history and expression behavior. Indeed, genes that have been recently acquired or are under functional relaxation tend to be less transcribed, show higher intraspecific variation (plasticity), and are less involved in networks (connectivity). Moreover, utilizing this approach in L. kluyveri also highlighted specific regulatory network signatures in aerobic respiration, amino-acid biosynthesis and glycosylation, presumably due to its different lifestyle. Our data set sheds light on the evolution of intraspecific transcriptomic variation across distant species.
Affiliation(s)
- Christian Brion
- Department of Genetics, Genomics and Microbiology, University of Strasbourg, CNRS, UMR7156, Strasbourg, France
- David Pflieger
- Department of Genetics, Genomics and Microbiology, University of Strasbourg, CNRS, UMR7156, Strasbourg, France
- Anne Friedrich
- Department of Genetics, Genomics and Microbiology, University of Strasbourg, CNRS, UMR7156, Strasbourg, France
- Joseph Schacherer
- Department of Genetics, Genomics and Microbiology, University of Strasbourg, CNRS, UMR7156, Strasbourg, France
96
Merkin JJ, Chen P, Alexis MS, Hautaniemi SK, Burge CB. Origins and impacts of new mammalian exons. Cell Rep 2015; 10:1992-2005. [PMID: 25801031 DOI: 10.1016/j.celrep.2015.02.058] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 10/29/2014] [Revised: 01/09/2015] [Accepted: 02/23/2015] [Indexed: 02/08/2023]
Abstract
Mammalian genes are composed of exons, but the evolutionary origins and functions of new internal exons are poorly understood. Here, we analyzed patterns of exon gain using deep cDNA sequencing data from five mammals and one bird, identifying thousands of species- and lineage-specific exons. Most new exons derived from unique rather than repetitive intronic sequence. Unlike exons conserved across mammals, species-specific internal exons were mostly located in 5' UTRs and alternatively spliced. They were associated with upstream intronic deletions, increased nucleosome occupancy, and RNA polymerase II pausing. Genes containing new internal exons had increased gene expression, but only in tissues in which the exon was included. Increased expression correlated with the level of exon inclusion, promoter proximity, and signatures of cotranscriptional splicing. Altogether, these findings suggest that increased splicing at the 5' ends of genes enhances expression and that changes in 5' end splicing alter gene expression between tissues and between species.
Affiliation(s)
- Jason J Merkin
- Departments of Biology and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Ping Chen
- Research Programs Unit, Genome-Scale Biology and Institute of Biomedicine, University of Helsinki, 00014 Helsinki, Finland
- Maria S Alexis
- Departments of Biology and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Sampsa K Hautaniemi
- Research Programs Unit, Genome-Scale Biology and Institute of Biomedicine, University of Helsinki, 00014 Helsinki, Finland
- Christopher B Burge
- Departments of Biology and Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
97
Acharya D, Mukherjee D, Podder S, Ghosh TC. Investigating different duplication pattern of essential genes in mouse and human. PLoS One 2015; 10:e0120784. [PMID: 25751152 PMCID: PMC4353620 DOI: 10.1371/journal.pone.0120784] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/18/2014] [Accepted: 01/27/2015] [Indexed: 11/18/2022]
Abstract
Gene duplication is one of the major driving forces shaping genome and organism evolution and is thought to be itself regulated by some intrinsic properties of the gene. Comparing essential genes between mouse and human, we observed that essential genes avoid duplication in mouse but tend to remain duplicated in humans. In this study, we explored the reasons behind this difference through a cross-species comparison of human and mouse. Moreover, we found that essential genes that are duplicated in humans are functionally more redundant than those in mouse. The proportion of paralog pseudogenization of essential genes is higher in mouse than in humans. The duplicates of essential genes are under more stringent dosage regulation in human than in mouse. We also observed a slower evolutionary rate in the paralogs of human essential genes than in their mouse counterparts. Together, these results indicate that human essential genes are retained as duplicates that serve as backup copies, which may shield them from harmful mutations.
Affiliation(s)
- Debarun Acharya
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Dola Mukherjee
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Soumita Podder
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Tapash C. Ghosh
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
98
Badyaev AV. Epigenetic resolution of the 'curse of complexity' in adaptive evolution of complex traits. J Physiol 2015; 592:2251-60. [PMID: 24882810 DOI: 10.1113/jphysiol.2014.272625] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 12/26/2022]
Abstract
The age of most genes exceeds the longevity of their genomic and physiological associations by many orders of magnitude. Such transient contexts modulate the expression of ancient genes to produce currently appropriate and often highly distinct developmental and functional outcomes. The efficacy of such adaptive modulation is diminished by the high dimensionality of complex organisms and associated vast areas of neutrality in their genotypic and developmental networks (and, thus, weak natural selection). Here I explore whether epigenetic effects facilitate adaptive modulation of complex phenotypes by effectively reducing the dimensionality of their deterministic networks and thus delineating their developmental and evolutionary trajectories even under weak selection. Epigenetic effects that link unconnected or widely dispersed elements of genotype space in ecologically relevant time could account for the rapid appearance of functionally integrated adaptive modifications. On an organismal time scale, conceptually similar processes occur during recurrent epigenetic reprogramming of somatic stem cells to produce, recurrently and reversibly, a bewildering array of differentiated and persistent cell lineages, all sharing identical genomic sequences despite strongly distinct phenotypes. I discuss whether close dependency of onset, scope and duration of epigenetic effects on cellular and genomic context in stem cells could provide insights into contingent modulation of conserved genomic material on a much longer evolutionary time scale. I review potential empirical examples of epigenetic bridges that reduce phenotype dimensionality and accomplish rapid adaptive modulation in the evolution of novelties, expression of behavioural types, and stress-induced ossification schedules.
Affiliation(s)
- Alexander V Badyaev
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
99
Snir S. On the number of genomic pacemakers: a geometric approach. Algorithms Mol Biol 2014; 9:26. [PMID: 25648755 PMCID: PMC4301663 DOI: 10.1186/s13015-014-0026-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/21/2014] [Accepted: 11/11/2014] [Indexed: 11/13/2022]
Abstract
The universal pacemaker (UPM) model extends the classical molecular clock (MC) model by allowing each gene, in addition to its individual intrinsic rate as in the MC, to accelerate or decelerate according to the universal pacemaker. Under UPM, the relative evolutionary rates of all genes remain nearly constant, whereas the absolute rates can change arbitrarily. It was shown on several taxa groups spanning the entire tree of life that the UPM model describes the evolutionary process better than the MC model. In this work we provide a natural generalization of the UPM model that we denote multiple pacemakers (MPM). Under the MPM model every gene is still affected by a single pacemaker; however, the number of pacemakers is not confined to one. Such a model induces a partition over the gene set in which all the genes in one part are affected by the same pacemaker, and the task is to identify the pacemaker partition or, in other words, to find for each gene its associated pacemaker. We devise a novel heuristic procedure, relying on statistical and geometrical tools, to solve the problem and demonstrate by simulation that this approach can cope satisfactorily with considerable noise and realistic problem sizes. We applied this procedure to a set of over 2000 genes in 100 prokaryotes and demonstrated the significant existence of two pacemakers.
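One way to read the single-pacemaker model described above is multiplicative: the length of branch j in gene g is roughly the gene's intrinsic rate times a pacemaker factor shared by all genes on that branch. The sketch below assumes that reading, which the abstract does not state as a formula, and fits it by least squares in log space. The function name and the centering convention are hypothetical, and this is not the paper's MPM heuristic.

```python
import numpy as np


def fit_single_pacemaker(branch_lengths):
    """Fit log(length[g, j]) ~ log(rate[g]) + log(pace[j]) by least squares.

    `branch_lengths` is a genes x branches array of positive lengths.
    Gene rates are centered to geometric mean 1, so the shared scale
    is absorbed into the pacemaker factors (an arbitrary convention).
    """
    logs = np.log(np.asarray(branch_lengths, dtype=float))
    grand_mean = logs.mean()
    log_rates = logs.mean(axis=1) - grand_mean   # per-gene effect
    log_pace = logs.mean(axis=0)                 # per-branch pacemaker effect
    residuals = logs - log_rates[:, None] - log_pace[None, :]
    return np.exp(log_rates), np.exp(log_pace), residuals


if __name__ == "__main__":
    rates = np.array([1.0, 2.0, 4.0])            # hypothetical intrinsic gene rates
    pace = np.array([0.1, 0.5, 0.2, 0.3])        # hypothetical pacemaker factors
    lengths = np.outer(rates, pace)              # noise-free data from one pacemaker
    est_rates, est_pace, residuals = fit_single_pacemaker(lengths)
    print(est_rates, est_pace, np.abs(residuals).max())
```

Under this reading, large structured residuals for a subset of genes would hint that those genes follow a different pacemaker, which is the partitioning problem the MPM model poses.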
100
Popadin K, Gutierrez-Arcelus M, Lappalainen T, Buil A, Steinberg J, Nikolaev S, Lukowski S, Bazykin G, Seplyarskiy V, Ioannidis P, Zdobnov E, Dermitzakis E, Antonarakis S. Gene age predicts the strength of purifying selection acting on gene expression variation in humans. Am J Hum Genet 2014; 95:660-74. [PMID: 25480033 DOI: 10.1016/j.ajhg.2014.11.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 07/10/2014] [Accepted: 11/10/2014] [Indexed: 10/24/2022]
Abstract
Gene expression levels can be subject to selection. We hypothesized that the age of gene origin is associated with expression constraints, given that it affects the level of gene integration into the functional cellular environment. By studying the genetic variation affecting gene expression levels (cis expression quantitative trait loci [cis-eQTLs]) and protein levels (cis protein QTLs [cis-pQTLs]), we determined that young, primate-specific genes are enriched in cis-eQTLs and cis-pQTLs. Compared to cis-eQTLs of old genes originating before the zebrafish divergence, cis-eQTLs of young genes have a higher effect size, are located closer to the transcription start site, are more significant, and tend to influence genes in multiple tissues and populations. These results suggest that the expression constraint of each gene increases throughout its lifespan. We also detected a positive correlation between expression constraints (approximated by cis-eQTL properties) and coding constraints (approximated by Ka/Ks) and observed that this correlation might be driven by gene age. To uncover factors associated with the increase in gene-age-related expression constraints, we demonstrated that gene connectivity, gene involvement in complex regulatory networks, gene haploinsufficiency, and the strength of posttranscriptional regulation increase with gene age. We also observed an increase in heritability of gene expression levels with age, implying a reduction of the environmental component. In summary, we show that gene age shapes key gene properties during evolution and is therefore an important component of genome function.