1
Siddiqui M, Conant GC. POInT browse: orthology prediction and synteny exploration for paleopolyploid genomes. BMC Bioinformatics 2023; 24:174. [PMID: 37106333 PMCID: PMC10134530 DOI: 10.1186/s12859-023-05298-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 04/19/2023] [Indexed: 04/29/2023] Open
Abstract
We describe POInTbrowse, a web portal that gives access to the orthology inferences made for polyploid genomes with POInT, the Polyploidy Orthology Inference Tool. Ancient, or paleo-, polyploidy events are widely distributed across the eukaryotic phylogeny, and the combination of duplicated and lost duplicated genes that these polyploidies produce can confound the identification of orthologous genes between genomes. POInT uses conserved synteny and phylogenetic models to infer orthologous genes between genomes with a shared polyploidy. It also gives confidence estimates for those orthology inferences. POInTbrowse gives both graphical and query-based access to these inferences from 12 different polyploidy events, allowing users to visualize genomic regions produced by polyploidies and perform batch queries for each polyploidy event, downloading gene trees and coding sequences for orthologous genes meeting user-specified criteria. POInTbrowse and the associated data are online at https://wgd.statgen.ncsu.edu.
Affiliation(s)
- Mustafa Siddiqui
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
- Gavin C Conant
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Program in Genetics, North Carolina State University, Raleigh, NC, USA
2
Abstract
Ancient polyploidy events are widely distributed across the evolutionary history of eukaryotes. Here, we describe a likelihood-based tool, POInT (the Polyploidy Orthology Inference Tool), for modeling ancient whole genome duplications and triplications, assigning homoeologous genes to subgenomes and inferring gene losses across different parental subgenomes after polyploidy.
Affiliation(s)
- Yue Hao
  - Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Gavin C Conant
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Program in Genetics, North Carolina State University, Raleigh, NC, USA
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
3
Karn RC, Yazdanifar G, Pezer Ž, Boursot P, Laukaitis CM. Androgen-Binding Protein (Abp) Evolutionary History: Has Positive Selection Caused Fixation of Different Paralogs in Different Taxa of the Genus Mus? Genome Biol Evol 2021; 13:6377336. [PMID: 34581786 PMCID: PMC8525912 DOI: 10.1093/gbe/evab220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/20/2021] [Indexed: 11/14/2022] Open
Abstract
Comparison of the androgen-binding protein (Abp) gene regions of six Mus genomes provides insights into the evolutionary history of this large murid rodent gene family. We identified 206 unique Abp sequences and mapped their physical relationships. At least 48 are duplicated and thus present in more than two identical copies. All six taxa have substantially elevated LINE1 densities in Abp regions compared with flanking regions, similar to levels in mouse and rat genomes, although nonallelic homologous recombination seems to have only occurred in Mus musculus domesticus. Phylogenetic and structural relationships support the hypothesis that the extensive Abp expansion began in an ancestor of the genus Mus. We also found duplicated Abpa27's in two taxa, suggesting that previously reported selection on a27 alleles may have actually detected selection on haplotypes wherein different paralogs were lost in each. Other studies reported that a27 gene and species trees were incongruent, likely because of homoplasy. However, L1MC3 phylogenies, supposed to be homoplasy-free compared with coding regions, support our paralog hypothesis because the L1MC3 phylogeny was congruent with the a27 topology. This paralog hypothesis provides an alternative explanation for the origin of the a27 gene that is suggested to be fixed in the three different subspecies of Mus musculus and to mediate sexual selection and incipient reinforcement between at least two of them. Finally, we ask why there are so many Abp genes, especially given the high frequency of pseudogenes and suggest that relaxed selection operates over a large part of the gene clusters.
Affiliation(s)
- Robert C Karn
  - Gene Networks in Neural and Developmental Plasticity, Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- Željka Pezer
  - Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Pierre Boursot
  - Institut des Sciences de l'Evolution Montpellier, Université de Montpellier, CNRS, IRD, France
- Christina M Laukaitis
  - Carle Health and Carle Illinois College of Medicine, University of Illinois, Urbana-Champaign, USA
4
Mullis A, Lu Z, Zhan Y, Wang TY, Rodriguez J, Rajeh A, Chatrath A, Lin Z. Parallel Concerted Evolution of Ribosomal Protein Genes in Fungi and Its Adaptive Significance. Mol Biol Evol 2020; 37:455-468. [PMID: 31589316 PMCID: PMC6993855 DOI: 10.1093/molbev/msz229] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/20/2022] Open
Abstract
Ribosomal protein (RP) genes encode structural components of ribosomes, the cellular machinery for protein synthesis. A single functional copy has been maintained in most of 78–80 RP families in animals due to evolutionary constraints imposed by gene dosage balance. Some fungal species have maintained duplicate copies in most RP families. The mechanisms by which the RP genes were duplicated and maintained and their functional significance are poorly understood. To address these questions, we identified all RP genes from 295 fungi and inferred the timing and nature of gene duplication events for all RP families. We found that massive duplications of RP genes have independently occurred by different mechanisms in three distantly related lineages: budding yeasts, fission yeasts, and Mucoromycota. The RP gene duplicates in budding yeasts and Mucoromycota were mainly created by whole genome duplication events. However, duplicate RP genes in fission yeasts were likely generated by retroposition, which is unexpected considering their dosage sensitivity. The sequences of most RP paralogs have been homogenized by repeated gene conversion in each species, demonstrating parallel concerted evolution, which might have facilitated the retention of their duplicates. Transcriptomic data suggest that the duplication and retention of RP genes increased their transcript abundance. Physiological data indicate that increased ribosome biogenesis allowed these organisms to rapidly consume sugars through fermentation while maintaining high growth rates, providing selective advantages to these species in sugar-rich environments.
Affiliation(s)
- Alison Mullis
  - Department of Biology, Saint Louis University, St. Louis, MO
- Zhaolian Lu
  - Department of Biology, Saint Louis University, St. Louis, MO
- Yu Zhan
  - Department of Biology, Saint Louis University, St. Louis, MO
- Tzi-Yuan Wang
  - Biodiversity Research Center, Academia Sinica, Nankang, Taipei, Taiwan
- Judith Rodriguez
  - Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO
- Ahmad Rajeh
  - Department of Biology, Saint Louis University, St. Louis, MO
  - Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO
- Ajay Chatrath
  - Department of Biology, Saint Louis University, St. Louis, MO
- Zhenguo Lin
  - Department of Biology, Saint Louis University, St. Louis, MO
5
Gene tree species tree reconciliation with gene conversion. J Math Biol 2019; 78:1981-2014. [PMID: 30767052 DOI: 10.1007/s00285-019-01331-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/04/2017] [Revised: 10/03/2018] [Indexed: 01/19/2023]
Abstract
Gene tree/species tree reconciliation is a recent and decisive advance in phylogenetic methods, accounting for the possible differences between gene histories and species histories. Reconciliation consists of explaining these differences by gene-scale events such as duplication, loss, and transfer, which translates mathematically into a mapping between gene tree nodes and species tree nodes or branches. Gene conversion is a frequent and important evolutionary event, which results in the replacement of a gene by a copy of another from the same species and in the same gene tree. Including this event in reconciliation models has never been attempted because it introduces a dependency between lineages, and standard algorithms based on dynamic programming become ineffective. We propose here a novel mathematical framework including gene conversion as an evolutionary event in gene tree/species tree reconciliation. We describe a randomized algorithm that finds, in polynomial running time, a reconciliation minimizing the number of duplications, losses and conversions in the case when their weights are equal. We show that the space of optimal reconciliations includes an analog of the last common ancestor reconciliation, but is not limited to it. Our algorithm outputs any optimal reconciliation with a nonzero probability. We argue that this study opens a research avenue on including gene conversion in reconciliation, and discuss its possible importance in biology.
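The "mapping between gene tree nodes and species tree nodes" mentioned in this abstract can be illustrated with the classic last common ancestor (LCA) reconciliation. The sketch below is only that textbook mapping, counting duplications; it is not the randomized algorithm of the paper and does not model losses or gene conversion. The tree encodings, `SPECIES_PARENT`, and all function names are assumptions made for illustration.

```python
# Minimal LCA reconciliation sketch (duplications only; hypothetical encoding).
# Species tree: child -> parent dictionary. Gene tree: nested 2-tuples whose
# leaves are the names of the species in which each gene copy was sampled.

SPECIES_PARENT = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC"}


def ancestors(parent, node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, a, b):
    """Last common ancestor of species-tree nodes a and b."""
    seen = set(ancestors(parent, a))
    for node in ancestors(parent, b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(gene_tree, parent):
    """Return (species node mapped to the gene-tree root, duplication count)."""
    if isinstance(gene_tree, str):
        return gene_tree, 0  # a leaf maps to the species it was sampled from
    left, right = gene_tree
    map_left, dup_left = reconcile(left, parent)
    map_right, dup_right = reconcile(right, parent)
    mapped = lca(parent, map_left, map_right)
    # A node mapped to the same species node as one of its children
    # is interpreted as a duplication under the LCA mapping.
    is_dup = 1 if mapped in (map_left, map_right) else 0
    return mapped, dup_left + dup_right + is_dup


# Gene tree with two copies in species A: one duplication at the root.
print(reconcile((("A", "B"), ("A", "C")), SPECIES_PARENT))
# Gene tree congruent with the species tree: no duplication.
print(reconcile((("A", "B"), "C"), SPECIES_PARENT))
```

The parent-dictionary walk is quadratic in tree depth, which is adequate for a toy example; production reconciliation tools use faster LCA structures.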
6
Contrasting patterns of coding and flanking region evolution in mammalian keratin associated protein-1 genes. Mol Phylogenet Evol 2018; 133:352-361. [PMID: 30599197 DOI: 10.1016/j.ympev.2018.12.031] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/09/2018] [Revised: 12/15/2018] [Accepted: 12/26/2018] [Indexed: 12/17/2022]
Abstract
Mammalian genomes contain a number of duplicated genes, and sequence identity between these duplicates can be maintained by purifying selection. However, between-duplicate recombination can also maintain sequence identity between copies, resulting in a pattern known as concerted evolution where within-genome repeats are more similar to each other than to orthologous repeats in related species. Here we investigated the tandemly-repeated keratin-associated protein 1 (KAP1) gene family, KRTAP1, which encodes proteins that are important components of hair and wool in mammals. Comparison of eutherian mammal KRTAP1 gene repeats within and between species shows a strong pattern of concerted evolution. However, in striking contrast to the coding regions of these genes, we find that the flanking regions have a divergent pattern of evolution. This contrast in evolutionary pattern transitions abruptly near the start and stop codons of the KRTAP1 genes. We reveal that this difference in evolutionary patterns is not explained by conventional purifying selection, nor is it likely a consequence of codon adaptation or reverse transcription of KRTAP1-n mRNA. Instead, the evidence suggests that these contrasting patterns result from short-tract gene conversion events that are biased to the KRTAP1 coding region by selection and/or differential sequence divergence. This work demonstrates the power that gene conversion has to finely shape the evolution of repetitive genes, and provides another distinctive pattern of contrasting evolutionary outcomes that results from gene conversion. A greater emphasis on exploring the evolution of multi-gene eukaryotic families will reveal how common different contrasting evolutionary patterns are in gene duplicates.
7
Cossu RM, Casola C, Giacomello S, Vidalis A, Scofield DG, Zuccolo A. LTR Retrotransposons Show Low Levels of Unequal Recombination and High Rates of Intraelement Gene Conversion in Large Plant Genomes. Genome Biol Evol 2018; 9:3449-3462. [PMID: 29228262 PMCID: PMC5751070 DOI: 10.1093/gbe/evx260] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Accepted: 12/07/2017] [Indexed: 12/29/2022] Open
Abstract
The accumulation and removal of transposable elements (TEs) is a major driver of genome size evolution in eukaryotes. In plants, long terminal repeat (LTR) retrotransposons (LTR-RTs) represent the majority of TEs and form most of the nuclear DNA in large genomes. Unequal recombination (UR) between LTRs leads to removal of intervening sequence and formation of solo-LTRs. UR is a major mechanism of LTR-RT removal in many angiosperms, but our understanding of LTR-RT-associated recombination within the large, LTR-RT-rich genomes of conifers is quite limited. We employ a novel read-based methodology to estimate the relative rates of LTR-RT-associated UR within the genomes of four conifer and seven angiosperm species. We found the lowest rates of UR in the largest genomes studied, conifers and the angiosperm maize. Recombination may also resolve as gene conversion, which does not remove sequence, so we analyzed LTR-RT-associated gene conversion events (GCEs) in Norway spruce and six angiosperms. Opposite the trend for UR, we found the highest rates of GCEs in Norway spruce and maize. Unlike previous work in angiosperms, we found no evidence that rates of UR correlate with retroelement structural features in the conifers, suggesting that another process is suppressing UR in these species. Recent results from diverse eukaryotes indicate that heterochromatin affects the resolution of recombination, by favoring gene conversion over crossing-over, similar to our observation of opposed rates of UR and GCEs. Control of LTR-RT proliferation via formation of heterochromatin would be a likely step toward large genomes in eukaryotes carrying high LTR-RT content.
Affiliation(s)
- Rosa Maria Cossu
  - Institute of Life Sciences, Scuola Superiore Sant'Anna, Pisa, Italy
  - Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Claudio Casola
  - Department of Ecosystem Science and Management, Texas A&M University
- Stefania Giacomello
  - Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology, Solna, Sweden
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Amaryllis Vidalis
  - Department of Ecology and Environmental Science, Umeå University, Sweden
  - Section of Population Epigenetics and Epigenomics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, Freising, Germany
- Douglas G Scofield
  - Department of Ecology and Environmental Science, Umeå University, Sweden
  - Department of Ecology and Genetics: Evolutionary Biology, Uppsala University, Sweden
  - Uppsala Multidisciplinary Center for Advanced Computational Science, Uppsala University, Sweden
- Andrea Zuccolo
  - Institute of Life Sciences, Scuola Superiore Sant'Anna, Pisa, Italy
  - Istituto di Genomica Applicata, Udine, Italy
8
Casola C, Koralewski TE. Pinaceae show elevated rates of gene turnover that are robust to incomplete gene annotation. The Plant Journal: for Cell and Molecular Biology 2018; 95:862-876. [PMID: 29901849 DOI: 10.1111/tpj.13994] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/14/2017] [Revised: 05/22/2018] [Accepted: 05/29/2018] [Indexed: 06/08/2023]
Abstract
Gene duplications and gene losses are major determinants of genome evolution and phenotypic diversity. The frequency of gene turnover (gene gains and gene losses combined) is known to vary between organisms. Comparative genomic analyses of gene families can highlight such variation; however, estimates of gene turnover may be biased when using highly fragmented genome assemblies resulting in poor gene annotations. Here, we address potential biases introduced by gene annotation errors in estimates of gene turnover frequencies in a dataset including both well-annotated angiosperm genomes and the incomplete gene sets of four Pinaceae, including two pine species, Norway spruce and Douglas-fir. We show that Pinaceae experienced higher gene turnover rates than angiosperm lineages lacking recent whole-genome duplications. This finding is robust to both known major issues in Pinaceae gene sets: missing gene models and erroneous annotation of pseudogenes. A separate analysis limited to the four Pinaceae gene sets pointed to an accelerated gene turnover rate in pines compared with Norway spruce and Douglas-fir. Our results indicate that gene turnover significantly contributes to genome variation and possibly to speciation in Pinaceae, particularly in pines. Moreover, these findings indicate that reliable estimates of gene turnover frequencies can be discerned in incomplete and potentially inaccurate gene sets. Because gymnosperms are known to exhibit low overall substitution rates compared with angiosperms, our results suggest that the rate of single-base pair mutations is uncoupled from the rate of large DNA duplications and deletions associated with gene turnover in Pinaceae.
Affiliation(s)
- Claudio Casola
  - Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, 77843-2138, USA
- Tomasz E Koralewski
  - Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, 77843-2138, USA
9
Emery M, Willis MMS, Hao Y, Barry K, Oakgrove K, Peng Y, Schmutz J, Lyons E, Pires JC, Edger PP, Conant GC. Preferential retention of genes from one parental genome after polyploidy illustrates the nature and scope of the genomic conflicts induced by hybridization. PLoS Genet 2018; 14:e1007267. [PMID: 29590103 PMCID: PMC5891031 DOI: 10.1371/journal.pgen.1007267] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 10/20/2017] [Revised: 04/09/2018] [Accepted: 02/21/2018] [Indexed: 11/18/2022] Open
Abstract
Polyploidy is increasingly seen as a driver of both evolutionary innovation and ecological success. One source of polyploid organisms' successes may be their origins in the merging and mixing of genomes from two different species (e.g., allopolyploidy). Using POInT (the Polyploid Orthology Inference Tool), we model the resolution of three allopolyploidy events, one from the bakers' yeast (Saccharomyces cerevisiae), one from the thale cress (Arabidopsis thaliana) and one from grasses including Sorghum bicolor. Analyzing a total of 21 genomes, we assign to every gene a probability for having come from each parental subgenome (i.e., derived from the diploid progenitor species), yielding orthologous segments across all genomes. Our model detects statistically robust evidence for the existence of biased fractionation in all three lineages, whereby genes from one of the two subgenomes were more likely to be lost than those from the other subgenome. We further find that a driver of this pattern of biased losses is the co-retention of genes from the same parental genome that share functional interactions. The pattern of biased fractionation after the Arabidopsis and grass allopolyploid events was surprisingly constant in time, with the same parental genome favored throughout the lineages' history. In strong contrast, the yeast allopolyploid event shows evidence of biased fractionation only immediately after the event, with balanced gene losses more recently. The rapid loss of functionally associated genes from a single subgenome is difficult to reconcile with the action of genetic drift and suggests that selection may favor the removal of specific duplicates. Coupled to the evidence for continuing, functionally-associated biased fractionation after the A. thaliana At-α event, we suggest that, after allopolyploidy, there are functional conflicts between interacting genes encoded in different subgenomes that are ultimately resolved through preferential duplicate loss.
Affiliation(s)
- Marianne Emery
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
- M. Madeline S. Willis
  - Department of Biochemistry, University of Missouri-Columbia, Columbia, Missouri, United States of America
- Yue Hao
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Kerrie Barry
  - Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
- Khouanchy Oakgrove
  - Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
- Yi Peng
  - Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
- Jeremy Schmutz
  - Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
  - HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, United States of America
- Eric Lyons
  - School of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- J. Chris Pires
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
  - Informatics Institute, University of Missouri-Columbia, Columbia, Missouri, United States of America
  - Bond Life Sciences Center, University of Missouri-Columbia, Columbia, Missouri, United States of America
- Patrick P. Edger
  - Department of Horticulture, Michigan State University, East Lansing, Michigan, United States of America
  - Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, Michigan, United States of America
- Gavin C. Conant
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Division of Animal Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
  - Program in Genetics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
10
Phylogenomic analysis demonstrates a pattern of rare and long-lasting concerted evolution in prokaryotes. Commun Biol 2018; 1:12. [PMID: 30271899 PMCID: PMC6053082 DOI: 10.1038/s42003-018-0014-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/20/2017] [Accepted: 01/11/2018] [Indexed: 12/15/2022] Open
Abstract
Concerted evolution, where paralogs in the same species show higher sequence similarity to each other than to orthologs in other species, is widely found in many species. However, cases of concerted evolution that last for hundreds of millions of years are very rare. By genome-wide analysis of a broad selection of prokaryotes, we provide strong evidence of recurrent concerted evolution in 26 genes, most of which have lasted more than ~500 million years. We find that most concertedly evolving genes are key members of important pathways, and encode proteins from the same complexes and/or pathways, suggesting coevolution of genes via concerted evolution to maintain gene balance. We also present LRCE-DB, a comprehensive online repository of long-lasting concerted evolution. Collectively, our study reveals that although most duplicated genes may diverge in sequence over a long period, on rare occasions this constraint can be breached, leading to unexpected long-lasting concerted evolution in a recurrent manner. Sishuo Wang and Youhua Chen present an analysis of concerted evolution in prokaryotes using a new computational pipeline, iSeeCE. They find evidence in 26 genes for recurrent concerted evolution, most of which last more than ~500 million years, and provide a database, LRCE-DB, for data exploration.
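The definition of concerted evolution given in this abstract (paralogs within a species more similar to each other than to their orthologs in another species) can be expressed as a simple pairwise-identity comparison. The sketch below is a toy check on pre-aligned sequences; it is not the iSeeCE pipeline, which is not described here, and the function names, the `{"A": ..., "B": ...}` input layout, and the example sequences are all assumptions made for illustration.

```python
# Toy check for a concerted-evolution signal on pre-aligned sequences.
# Each species is a dict holding two aligned paralogs under keys "A" and "B"
# (hypothetical layout); gaps are written as "-".


def identity(x, y):
    """Fraction of identical sites between two aligned sequences, ignoring gaps."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(x, y) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(a == b for a, b in pairs) / len(pairs)


def shows_concerted_signal(sp1, sp2):
    """True if paralogs within each species are more similar to each other
    than either paralog is to its ortholog in the other species."""
    within = min(identity(sp1["A"], sp1["B"]), identity(sp2["A"], sp2["B"]))
    between = max(identity(sp1["A"], sp2["A"]), identity(sp1["B"], sp2["B"]))
    return within > between


# Paralogs homogenized within each species: signal present.
species1 = {"A": "ACGTACGT", "B": "ACGTACGA"}
species2 = {"A": "TCGTTCGT", "B": "TCGTTCGT"}
print(shows_concerted_signal(species1, species2))
```

A real analysis would compare tree topologies and test significance rather than rely on raw identity, since high within-species similarity can also arise from recent duplication.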
11
Frequent nonallelic gene conversion on the human lineage and its effect on the divergence of gene duplicates. Proc Natl Acad Sci U S A 2017; 114:12779-12784. [PMID: 29138319 DOI: 10.1073/pnas.1708151114] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 12/18/2022] Open
Abstract
Gene conversion is the copying of a genetic sequence from a "donor" region to an "acceptor." In nonallelic gene conversion (NAGC), the donor and the acceptor are at distinct genetic loci. Despite the role NAGC plays in various genetic diseases and the concerted evolution of gene families, the parameters that govern NAGC are not well characterized. Here, we survey duplicate gene families and identify converted tracts in 46% of them. These conversions reflect a large GC bias of NAGC. We develop a sequence evolution model that leverages substantially more information in duplicate sequences than used by previous methods and use it to estimate the parameters that govern NAGC in humans: a mean converted tract length of 250 bp and a probability of [Formula: see text] per generation for a nucleotide to be converted (an order of magnitude higher than the point mutation rate). Despite this high baseline rate, we show that NAGC slows down as duplicate sequences diverge, until an eventual "escape" of the sequences from its influence. As a result, NAGC has a small average effect on the sequence divergence of duplicates. This work improves our understanding of the NAGC mechanism and the role that it plays in the evolution of gene duplicates.
12
Ji X, Griffing A, Thorne JL. A Phylogenetic Approach Finds Abundant Interlocus Gene Conversion in Yeast. Mol Biol Evol 2016; 33:2469-76. [PMID: 27297467 DOI: 10.1093/molbev/msw114] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022] Open
Abstract
Interlocus gene conversion (IGC) homogenizes repeats. While genomes can be repeat-rich, the evolutionary importance of IGC is poorly understood. Additional statistical tools for characterizing it are needed. We propose a composite likelihood strategy for incorporating IGC into widely-used probabilistic models for sequence changes that originate with point mutation. We estimated the percentage of nucleotide substitutions that originate with an IGC event rather than a point mutation in 14 groups of yeast ribosomal protein-coding genes, and found values ranging from 20% to 38%. We designed and applied a procedure to determine whether these percentages are inflated due to artifacts arising from model misspecification. The results of this procedure are consistent with IGC having had an important role in the evolution of each of these 14 gene families. We further investigate the properties of our IGC approach via simulation. In contrast to usual practice, our findings suggest that the IGC should and can be considered when multigene family evolution is investigated.
Affiliation(s)
- Xiang Ji
  - Bioinformatics Research Center, North Carolina State University
  - Department of Statistics, North Carolina State University
- Alexander Griffing
  - Bioinformatics Research Center, North Carolina State University
  - Department of Biological Sciences, North Carolina State University
- Jeffrey L Thorne
  - Bioinformatics Research Center, North Carolina State University
  - Department of Statistics, North Carolina State University
  - Department of Biological Sciences, North Carolina State University
13
Moyers BA, Zhang J. Evaluating Phylostratigraphic Evidence for Widespread De Novo Gene Birth in Genome Evolution. Mol Biol Evol 2016; 33:1245-56. [PMID: 26758516 DOI: 10.1093/molbev/msw008] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Indexed: 12/19/2022] Open
Abstract
The source of genetic novelty is an area of wide interest and intense investigation. Although gene duplication is conventionally thought to dominate the production of new genes, this view was recently challenged by a proposal of widespread de novo gene origination in eukaryotic evolution. Specifically, distributions of various gene properties such as coding sequence length, expression level, codon usage, and probability of being subject to purifying selection among groups of genes with different estimated ages were reported to support a model in which new protein-coding proto-genes arise from noncoding DNA and gradually integrate into cellular networks. Here we show that the genomic patterns asserted to support widespread de novo gene origination are largely attributable to biases in gene age estimation by phylostratigraphy, because such patterns are also observed in phylostratigraphic analysis of simulated genes bearing identical ages. Furthermore, there is no evidence of purifying selection on very young de novo genes previously claimed to show such signals. Together, these findings are consistent with the prevailing view that de novo gene birth is a relatively minor contributor to new genes in genome evolution. They also illustrate the danger of using phylostratigraphy in the study of new gene origination without considering its inherent bias.
Affiliation(s)
- Bryan A Moyers
  - Department of Computational Medicine and Bioinformatics, University of Michigan
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan
14
Scienski K, Fay JC, Conant GC. Patterns of Gene Conversion in Duplicated Yeast Histones Suggest Strong Selection on a Coadapted Macromolecular Complex. Genome Biol Evol 2015; 7:3249-58. [PMID: 26560339 PMCID: PMC4700949 DOI: 10.1093/gbe/evv216] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 01/10/2023] Open
Abstract
We find evidence for interlocus gene conversion in five duplicated histone genes from six yeast species. The sequences of these duplicated genes, surviving from the ancient genome duplication, show phylogenetic patterns inconsistent with the well-resolved orthology relationships inferred from a likelihood model of gene loss after the genome duplication. Instead, these paralogous genes are more closely related to each other than any is to its nearest ortholog. In addition to simulations supporting gene conversion, we also present evidence for elevated rates of radical amino acid substitutions along the branches implicated in the conversion events. As these patterns are similar to those seen in ribosomal proteins that have undergone gene conversion, we speculate that in cases where duplicated genes code for proteins that are a part of tightly interacting complexes, selection may favor the fixation of gene conversion events in order to maintain high protein identities between duplicated copies.
Affiliation(s)
- Kathy Scienski
- Division of Animal Sciences, University of Missouri, Columbia; present address: Genetics Graduate Program, Texas A&M University, College Station, TX
- Justin C Fay
- Department of Genetics, Washington University; Center for Genome Sciences and Systems Biology, Washington University
- Gavin C Conant
- Division of Animal Sciences, University of Missouri, Columbia; Informatics Institute, University of Missouri, Columbia
|
15
|
Dhroso A, Korkin D, Conant GC. The yeast protein interaction network has a capacity for self-organization. FEBS J 2014; 281:3420-32. [PMID: 24924781 DOI: 10.1111/febs.12870] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/04/2014] [Revised: 05/02/2014] [Accepted: 06/06/2014] [Indexed: 12/20/2022]
Abstract
The organization of the cellular interior gives rise to properties including metabolic channeling and micro-compartmentalization of signaling. Here, we use a lattice model of molecular crowding, together with literature-derived protein interactions and abundances, to describe the molecular organization and stoichiometry of local cellular regions, showing that physical protein-protein interactions induce emergent structures not seen when random interaction networks are modeled. Specifically, we find that the lattices give rise to micro-groups of enzymes on the surfaces of protein clusters. These arrangements of proteins are also robust to protein overexpression, while still showing evidence for expression tuning. Our results indicate that some of the complex organization of the cell may derive from simple rules of molecular aggregation and interaction.
Affiliation(s)
- Andi Dhroso
- Department of Computer Science, University of Missouri, Columbia, MO, USA; Informatics Institute, University of Missouri, Columbia, MO, USA
|
16
|
|
17
|
Abstract
Gene duplications are a major source of evolutionary innovations. Understanding the functional divergence of duplicates and their role in genetic robustness is an important challenge in biology. Previously, analyses of genetic robustness were primarily focused on duplicates' essentiality and epistasis in several laboratory conditions. In this study, we use several quantitative data sets to understand compensatory interactions between Saccharomyces cerevisiae duplicates that are likely to be relevant in natural biological populations. We find that, owing to their high functional load, close duplicates are unlikely to provide substantial backup in the context of large natural populations. Interestingly, as duplicates diverge from each other, their overall functional load is reduced. At intermediate divergence distances the quantitative decrease in fitness due to removal of one duplicate becomes smaller. At these distances, yeast duplicates display more balanced functional loads and their transcriptional control becomes significantly more complex. As yeast duplicates diverge beyond 70% sequence identity, their ability to compensate for each other becomes similar to that of random pairs of singletons.
Affiliation(s)
- Germán Plata
- Department of Systems Biology, Center for Computational Biology and Bioinformatics, Columbia University, New York City, NY 10032, USA, Integrated Program in Cellular, Molecular, Structural, and Genetic Studies, Columbia University, New York City, NY 10032, USA and Department of Biomedical Informatics, Columbia University, New York City, NY 10032, USA
|
18
|
Hanikenne M, Kroymann J, Trampczynska A, Bernal M, Motte P, Clemens S, Krämer U. Hard selective sweep and ectopic gene conversion in a gene cluster affording environmental adaptation. PLoS Genet 2013; 9:e1003707. [PMID: 23990800 PMCID: PMC3749932 DOI: 10.1371/journal.pgen.1003707] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Received: 02/01/2013] [Accepted: 06/22/2013] [Indexed: 12/27/2022] Open
Abstract
Among the rare colonizers of heavy-metal rich toxic soils, Arabidopsis halleri is a compelling model extremophile, physiologically distinct from its sister species A. lyrata, and A. thaliana. Naturally selected metal hypertolerance and extraordinarily high leaf metal accumulation in A. halleri both require Heavy Metal ATPase4 (HMA4) encoding a PIB-type ATPase that pumps Zn²⁺ and Cd²⁺ out of specific cell types. Strongly enhanced HMA4 expression results from a combination of gene copy number expansion and cis-regulatory modifications, when compared to A. thaliana. These findings were based on a single accession of A. halleri. Few studies have addressed nucleotide sequence polymorphism at loci known to govern adaptations. We thus sequenced 13 DNA segments across the HMA4 genomic region of multiple A. halleri individuals from diverse habitats. Compared to control loci flanking the three tandem HMA4 gene copies, a gradual depletion of nucleotide sequence diversity and an excess of low-frequency polymorphisms are hallmarks of positive selection in HMA4 promoter regions, culminating at HMA4-3. The accompanying hard selective sweep is segmentally eclipsed as a consequence of recurrent ectopic gene conversion among HMA4 protein-coding sequences, resulting in their concerted evolution. Thus, HMA4 coding sequences exhibit a network-like genealogy and locally enhanced nucleotide sequence diversity within each copy, accompanied by lowered sequence divergence between paralogs in any given individual. Quantitative PCR corroborated that, across A. halleri, three genomic HMA4 copies generate overall 20- to 130-fold higher transcript levels than in A. thaliana. Together, our observations constitute an unexpectedly complex profile of polymorphism resulting from natural selection for increased gene product dosage. We propose that these findings are paradigmatic of a category of multi-copy genes from a broad range of organisms. Our results emphasize that enhanced gene product dosage, in addition to neo- and sub-functionalization, can account for the genomic maintenance of gene duplicates underlying environmental adaptation.
Affiliation(s)
- Marc Hanikenne
- Functional Genomics and Plant Molecular Imaging, Center for Protein Engineering (CIP), Department of Life Sciences, University of Liège, Liège, Belgium
- Juergen Kroymann
- Laboratoire d'Ecologie, Systématique et Evolution, Université Paris-Sud/CNRS, Orsay, France
- María Bernal
- Department of Plant Physiology, Ruhr University Bochum, Bochum, Germany
- Patrick Motte
- Functional Genomics and Plant Molecular Imaging, Center for Protein Engineering (CIP), Department of Life Sciences, University of Liège, Liège, Belgium
- Stephan Clemens
- Department of Plant Physiology, University of Bayreuth, Bayreuth, Germany
- Ute Krämer
- Department of Plant Physiology, Ruhr University Bochum, Bochum, Germany
|
19
|
Pegueroles C, Laurie S, Albà MM. Accelerated evolution after gene duplication: a time-dependent process affecting just one copy. Mol Biol Evol 2013; 30:1830-42. [PMID: 23625888 DOI: 10.1093/molbev/mst083] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Indexed: 12/19/2022] Open
Abstract
Gene duplication is widely regarded as a major mechanism modeling genome evolution and function. However, the mechanisms that drive the evolution of the two, initially redundant, gene copies are still ill defined. Many gene duplicates experience evolutionary rate acceleration, but the relative contribution of positive selection and random drift to the retention and subsequent evolution of gene duplicates, and for how long the molecular clock may be distorted by these processes, remains unclear. Focusing on rodent genes that duplicated before and after the mouse and rat split, we find significantly increased sequence divergence after duplication in only one of the copies, which in nearly all cases corresponds to the novel daughter copy, independent of the mechanism of duplication. We observe that the evolutionary rate of the accelerated copy, measured as the ratio of nonsynonymous to synonymous substitutions, is on average 5-fold higher in the period spanning 4-12 My after the duplication than it was before the duplication. This increase can be explained, at least in part, by the action of positive selection according to the results of the maximum likelihood-based branch-site test. Subsequently, the rate decelerates until purifying selection completely returns to preduplication levels. Reversion to the original rates has already been accomplished 40.5 My after the duplication event, corresponding to a genetic distance of about 0.28 synonymous substitutions per site. Differences in tissue gene expression patterns parallel those of substitution rates, reinforcing the role of neofunctionalization in explaining the evolution of young gene duplicates.
Affiliation(s)
- Cinta Pegueroles
- Evolutionary Genomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
|