1
Greenhalgh R, Klure DM, Orr TJ, Armstrong NM, Shapiro MD, Dearing MD. The desert woodrat (Neotoma lepida) induces a diversity of biotransformation genes in response to creosote bush resin. Comp Biochem Physiol C Toxicol Pharmacol 2024; 280:109870. [PMID: 38428625 PMCID: PMC11006593 DOI: 10.1016/j.cbpc.2024.109870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 01/26/2024] [Accepted: 02/24/2024] [Indexed: 03/03/2024]
Abstract
Liver biotransformation enzymes have long been thought to enable animals to feed on diets rich in xenobiotic compounds. However, despite decades of pharmacological research in humans and rodents, little is known about hepatic gene expression in specialized mammalian herbivores feeding on toxic diets. Leveraging a recently identified population of the desert woodrat (Neotoma lepida) found to be highly tolerant to toxic creosote bush (Larrea tridentata), we explored the expression changes of suites of biotransformation genes in response to diets enriched with varying amounts of creosote resin. Analysis of hepatic RNA-seq data indicated a dose-dependent response to these compounds, including the upregulation of several genes encoding transcription factors and numerous phase I, II, and III biotransformation families. Notably, elevated expression of five biotransformation families - carboxylesterases, cytochromes P450, aldo-keto reductases, epoxide hydrolases, and UDP-glucuronosyltransferases - corresponded to species-specific duplication events in the genome, suggesting that these genes play a prominent role in N. lepida's adaptation to creosote bush. Building on pharmaceutical studies in model rodents, we propose a hypothesis for how the differentially expressed genes are involved in the biotransformation of creosote xenobiotics. Our results provide some of the first details about how these processes likely operate in the liver of a specialized mammalian herbivore.
Affiliation(s)
- Robert Greenhalgh
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
- Dylan M Klure
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
- Teri J Orr
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
- Noah M Armstrong
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
- Michael D Shapiro
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
- M Denise Dearing
- School of Biological Sciences, University of Utah, 257 S 1400 E, Salt Lake City, UT 84112, USA.
2
Zhang H, Ding Y, Yang K, Wang X, Gao W, Xie Q, Liu Z, Gao C. An Insight of Betula platyphylla SWEET Gene Family through Genome-Wide Identification, Expression Profiling and Function Analysis of BpSWEET1c under Cold Stress. Int J Mol Sci 2023; 24:13626. [PMID: 37686432 PMCID: PMC10488219 DOI: 10.3390/ijms241713626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 08/05/2023] [Accepted: 08/11/2023] [Indexed: 09/10/2023] Open
Abstract
SWEET proteins play important roles in plant growth and development, sugar loading in phloem, and resistance to abiotic stress through sugar transport. In this study, 13 BpSWEET genes were identified from the birch genome. Collinearity analysis showed that there were one tandem repeat gene pair (BpSWEET1b/BpSWEET1c) and two duplicated gene pairs (BpSWEET17a/BpSWEET17b) in the BpSWEET gene family. The BpSWEET gene promoter regions contained several cis-acting elements related to stress resistance, for example, hormone-responsive and low-temperature-responsive cis-elements. Analysis of transcriptome data showed that BpSWEET genes were highly expressed in several sink organs, and most BpSWEET genes were rapidly up-regulated under cold stress. BpSWEET1c, which was highly expressed under cold stress, was selected for further analysis. BpSWEET1c was found to be located on the cell membrane. After 6 h of 4 °C stress, sucrose content in the leaves and roots of plants transiently overexpressing BpSWEET1c was significantly higher than that of the control, and MDA content in roots was significantly lower than that of the control. These results indicate that BpSWEET1c may play a positive role in the response to cold stress by promoting the metabolism and transport of sucrose. In conclusion, 13 BpSWEET genes were identified at the whole-genome level. Most of the SWEET genes of birch were expressed in the sink organs and could respond to cold stress. Transient overexpression of BpSWEET1c changed the soluble sugar content and improved the cold tolerance of birch.
Affiliation(s)
- Caiqiu Gao
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin 150040, China; (H.Z.); (Y.D.); (K.Y.); (X.W.); (W.G.); (Q.X.); (Z.L.)
3
Chen YH, Sharma S, Bewg WP, Xue LJ, Gizelbach CR, Tsai CJ. Multiplex Editing of the Nucleoredoxin1 Tandem Array in Poplar: From Small Indels to Translocations and Complex Inversions. CRISPR J 2023; 6:339-349. [PMID: 37307061 PMCID: PMC10460964 DOI: 10.1089/crispr.2022.0096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 04/21/2023] [Indexed: 06/13/2023] Open
Abstract
The CRISPR-Cas9 system has been deployed for precision mutagenesis in an ever-growing number of species, including agricultural crops and forest trees. Its application to closely linked genes with extremely high sequence similarities has been less explored. In this study, we used CRISPR-Cas9 to mutagenize a tandem array of seven Nucleoredoxin1 (NRX1) genes spanning ∼100 kb in Populus tremula × Populus alba. We demonstrated efficient multiplex editing with one single guide RNA in 42 transgenic lines. The mutation profiles ranged from small insertions and deletions and local deletions in individual genes to large genomic dropouts and rearrangements spanning tandem genes. We also detected complex rearrangements including translocations and inversions resulting from multiple cleavage and repair events. Target capture sequencing was instrumental for unbiased assessments of repair outcomes to reconstruct unusual mutant alleles. The work highlights the power of CRISPR-Cas9 for multiplex editing of tandemly duplicated genes to generate diverse mutants with structural and copy number variations to aid future functional characterization.
Affiliation(s)
- Yen-Ho Chen
- Department of Plant Biology, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Shakuntala Sharma
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- William P. Bewg
- Department of Plant Biology, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Department of Genetics, University of Georgia, Athens, Georgia, USA; and College of Forestry, Nanjing Forestry University, Nanjing, China
- Liang-Jiao Xue
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Department of Genetics, University of Georgia, Athens, Georgia, USA; and College of Forestry, Nanjing Forestry University, Nanjing, China
- State Key Laboratory of Tree Genetics and Breeding, College of Forestry, Nanjing Forestry University, Nanjing, China
- Cole R. Gizelbach
- Department of Genetics, University of Georgia, Athens, Georgia, USA; and College of Forestry, Nanjing Forestry University, Nanjing, China
- Chung-Jui Tsai
- Department of Plant Biology, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, USA; College of Forestry, Nanjing Forestry University, Nanjing, China
- Department of Genetics, University of Georgia, Athens, Georgia, USA; and College of Forestry, Nanjing Forestry University, Nanjing, China
4
Zhang X, Smith DR. An overview of online resources for intra-species detection of gene duplications. Front Genet 2022; 13:1012788. [PMCID: PMC9606816 DOI: 10.3389/fgene.2022.1012788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Gene duplication is an important evolutionary mechanism that can act as a new source of genetic material in genome evolution. However, detecting duplicate genes from genomic data can be challenging. Various bioinformatics resources have been developed to identify duplicate genes from single and/or multiple species. Here, we summarize the metrics used to measure sequence identity among gene duplicates within species, compare several computational approaches that have been used to predict gene duplicates, and review recent advancements of a Basic Local Alignment Search Tool (BLAST)-based web tool and database, allowing future researchers to easily identify intra-species gene duplications. This article is a quick reference guide for research tools used for detecting gene duplicates.
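To make the idea of a sequence-identity metric concrete, the sketch below computes percent identity over two pre-aligned sequences and flags pairs above a cutoff as candidate duplicates. It is only an illustration: the function names, the gap handling (gapped columns are skipped), and the 85% cutoff are assumptions chosen for the example, not the metrics or thresholds used by the tools reviewed in the article.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def candidate_duplicates(alignments, min_identity=90.0):
    """Return gene pairs whose identity meets a (hypothetical) cutoff.

    `alignments` maps (gene_a, gene_b) -> (aligned_seq_a, aligned_seq_b).
    """
    return [
        pair
        for pair, (aln_a, aln_b) in alignments.items()
        if percent_identity(aln_a, aln_b) >= min_identity
    ]


# Toy, made-up alignments used only to exercise the functions.
toy_alignments = {
    ("geneA", "geneB"): ("ATGCCGTA", "ATGCCGTT"),
    ("geneA", "geneC"): ("ATG-CGTA", "TTGACCAA"),
}
print(candidate_duplicates(toy_alignments, min_identity=85.0))
```

Different tools normalise identity differently (for example by alignment length or by the shorter sequence), so the denominator here is a design choice rather than a standard.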
Affiliation(s)
- Xi Zhang
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
- *Correspondence: Xi Zhang; David Roy Smith
- David Roy Smith
- Department of Biology, Western University, London, ON, Canada
5
Sánchez AL, Lafond M. Colorful orthology clustering in bounded-degree similarity graphs. J Bioinform Comput Biol 2021; 19:2140010. [PMID: 34775924 DOI: 10.1142/s0219720021400102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Clustering genes in similarity graphs is a popular approach for orthology prediction. Most algorithms group genes without considering their species, which results in clusters that contain several paralogous genes. Moreover, clustering is known to be problematic when in-paralogs arise from ancient duplications. Recently, we proposed a two-step process that avoids these problems. First, we infer clusters of only orthologs (i.e. with only genes from distinct species), and second, we infer the missing inter-cluster orthologs. In this paper, we focus on the first step, which leads to a problem we call Colorful Clustering. In general, this is as hard as classical clustering. However, in similarity graphs, the number of species is usually small, as well as the neighborhood size of genes in other species. We therefore study the problem of clustering in which the number of colors is bounded by [Formula: see text], and each gene has at most [Formula: see text] neighbors in another species. We show that the well-known cluster editing formulation remains NP-hard even when [Formula: see text] and [Formula: see text]. We then propose a fixed-parameter algorithm in [Formula: see text] to find the single best cluster in the graph. We implemented this algorithm and included it in the aforementioned two-step approach. Experiments on simulated data show that this approach performs favorably to applying only an unconstrained clustering step.
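The two ingredients named in this abstract, the "colorful" constraint (each cluster contains genes from distinct species only) and the cluster editing objective, can be illustrated with a small checker. The sketch below only evaluates a given clustering; it does not implement the authors' fixed-parameter algorithm. All names and the toy graph are assumptions made for the example, and the clustering is assumed to cover every gene that appears in an edge.

```python
from itertools import combinations


def is_colorful(cluster, species_of):
    """True if no two genes in the cluster come from the same species."""
    species = [species_of[gene] for gene in cluster]
    return len(species) == len(set(species))


def cluster_editing_cost(clusters, edges):
    """Number of edge insertions plus deletions needed to turn the
    similarity graph into disjoint cliques matching `clusters`."""
    edge_set = {frozenset(edge) for edge in edges}
    cluster_of = {gene: i for i, cluster in enumerate(clusters) for gene in cluster}
    # Insertions: pairs inside a cluster that are not yet connected.
    missing_inside = sum(
        1
        for cluster in clusters
        for u, v in combinations(sorted(cluster), 2)
        if frozenset((u, v)) not in edge_set
    )
    # Deletions: existing edges that join two different clusters.
    crossing = sum(
        1 for edge in edge_set if len({cluster_of[gene] for gene in edge}) == 2
    )
    return missing_inside + crossing


# Toy similarity graph: genes a1, a2 (species A), b1, b2 (species B), c1 (species C).
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
edges = [("a1", "b1"), ("b1", "c1"), ("a2", "b2"), ("a1", "a2")]
clusters = [{"a1", "b1", "c1"}, {"a2", "b2"}]

print(all(is_colorful(c, species_of) for c in clusters))
print(cluster_editing_cost(clusters, edges))
```

In the toy graph, one edge (a1, c1) must be added and one edge (a1, a2) removed, so the clustering has cost 2 while satisfying the colorful constraint.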
Affiliation(s)
- Alitzel López Sánchez
- Computer Science Department, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, Québec J1K 2R1, Canada
- Manuel Lafond
- Computer Science Department, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, Québec J1K 2R1, Canada
6
Karn RC, Yazdanifar G, Pezer Ž, Boursot P, Laukaitis CM. Androgen-Binding Protein (Abp) Evolutionary History: Has Positive Selection Caused Fixation of Different Paralogs in Different Taxa of the Genus Mus? Genome Biol Evol 2021; 13:6377336. [PMID: 34581786 PMCID: PMC8525912 DOI: 10.1093/gbe/evab220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/20/2021] [Indexed: 11/14/2022] Open
Abstract
Comparison of the androgen-binding protein (Abp) gene regions of six Mus genomes provides insights into the evolutionary history of this large murid rodent gene family. We identified 206 unique Abp sequences and mapped their physical relationships. At least 48 are duplicated and thus present in more than two identical copies. All six taxa have substantially elevated LINE1 densities in Abp regions compared with flanking regions, similar to levels in mouse and rat genomes, although nonallelic homologous recombination seems to have only occurred in Mus musculus domesticus. Phylogenetic and structural relationships support the hypothesis that the extensive Abp expansion began in an ancestor of the genus Mus. We also found duplicated Abpa27's in two taxa, suggesting that previously reported selection on a27 alleles may have actually detected selection on haplotypes wherein different paralogs were lost in each. Other studies reported that a27 gene and species trees were incongruent, likely because of homoplasy. However, L1MC3 phylogenies, supposed to be homoplasy-free compared with coding regions, support our paralog hypothesis because the L1MC3 phylogeny was congruent with the a27 topology. This paralog hypothesis provides an alternative explanation for the origin of the a27 gene that is suggested to be fixed in the three different subspecies of Mus musculus and to mediate sexual selection and incipient reinforcement between at least two of them. Finally, we ask why there are so many Abp genes, especially given the high frequency of pseudogenes and suggest that relaxed selection operates over a large part of the gene clusters.
Affiliation(s)
- Robert C Karn
- Gene Networks in Neural and Developmental Plasticity, Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- Željka Pezer
- Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Pierre Boursot
- Institut des Sciences de l'Evolution Montpellier, Université de Montpellier, CNRS, IRD, France
- Christina M Laukaitis
- Carle Health and Carle Illinois College of Medicine, University of Illinois, Urbana-Champaign, USA
7
Aviña-Padilla K, Ramírez-Rafael JA, Herrera-Oropeza GE, Muley VY, Valdivia DI, Díaz-Valenzuela E, García-García A, Varela-Echavarría A, Hernández-Rosales M. Evolutionary Perspective and Expression Analysis of Intronless Genes Highlight the Conservation of Their Regulatory Role. Front Genet 2021; 12:654256. [PMID: 34306008 PMCID: PMC8302217 DOI: 10.3389/fgene.2021.654256] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/15/2021] [Accepted: 06/01/2021] [Indexed: 11/13/2022] Open
Abstract
The structure of eukaryotic genes is generally a combination of exons interrupted by intragenic non-coding DNA regions (introns) removed by RNA splicing to generate the mature mRNA. A fraction of genes, however, comprise a single coding exon with introns in their untranslated regions or are intronless genes (IGs), lacking introns entirely. The latter code for essential proteins involved in development, growth, and cell proliferation, and their expression has been proposed to be highly specialized for neuro-specific functions and linked to cancer, neuropathies, and developmental disorders. The abundant presence of introns in eukaryotic genomes is pivotal for the precise control of gene expression. Nevertheless, because IGs bypass splicing events, they entail a higher transcriptional fidelity, making them even more valuable for regulatory roles. This work aimed to infer the functional role and evolutionary history of IGs centered on the mouse genome. IGs consist of a subgroup of genes with one exon, including coding genes, non-coding genes, and pseudogenes, which make up approximately 6% of a total of 21,527 genes. To understand their prevalence, biological relevance, and evolution, we identified and studied 1,116 functional IG proteins, validating their differential expression in transcriptomic data from embryonic mouse telencephalon. Our results showed that overall expression levels of IGs are lower than those of multiple-exon genes (MEGs). However, strongly up-regulated IGs include transcription factors (TFs) such as class 3 POU (HMG Box), Neurog1, Olig1, BHLHe22, and BHLHe23, among other essential genes including the β-cluster of protocadherins. Most striking was the finding that IG-encoded BHLH TFs fit the criteria to be classified as microproteins. Finally, predicted protein orthologs in six other genomes confirmed high conservation of IGs associated with regulating neural processes and with chromatin organization and epigenetic regulation in Vertebrata. Moreover, this study highlights that IGs are essential modulators of regulatory processes, such as the Wnt signaling pathway, and of biological processes as pivotal as sensory organ development, at the transcriptional and post-translational levels. Overall, our results suggest that IG proteins have specialized, prevalent, and unique biological roles and that functional divergence between IGs and MEGs is likely to be the result of specific evolutionary constraints.
Affiliation(s)
- Katia Aviña-Padilla
- Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
- Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Gabriel Emilio Herrera-Oropeza
- Instituto de Neurobiología, Universidad Nacional Autónoma de México, Querétaro, Mexico
- Centre for Developmental Neurobiology, Institute of Psychiatry, Psychology, and Neuroscience, King’s College London, London, United Kingdom
- Dulce I. Valdivia
- Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Erik Díaz-Valenzuela
- Centro de Investigación y de Estudios Avanzados del IPN, Unidad Irapuato, Guanajuato, Mexico
- Andrés García-García
- Centro de Física Aplicada y Tecnología Avanzada, Universidad Nacional Autónoma de México, Querétaro, Mexico
8
Gonçalves-Carneiro D, Takata MA, Ong H, Shilton A, Bieniasz PD. Origin and evolution of the zinc finger antiviral protein. PLoS Pathog 2021; 17:e1009545. [PMID: 33901262 PMCID: PMC8102003 DOI: 10.1371/journal.ppat.1009545] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 11/13/2020] [Revised: 05/06/2021] [Accepted: 04/08/2021] [Indexed: 01/24/2023] Open
Abstract
The human zinc finger antiviral protein (ZAP) recognizes RNA by binding to CpG dinucleotides. Mammalian transcriptomes are CpG-poor, and ZAP may have evolved to exploit this feature to specifically target non-self viral RNA. Phylogenetic analyses reveal that ZAP and its paralogue PARP12 share an ancestral gene that arose prior to extensive eukaryote divergence, and the ZAP lineage diverged from the PARP12 lineage in tetrapods. Notably, the CpG content of modern eukaryote genomes varies widely, and ZAP-like genes arose subsequent to the emergence of CpG-suppression in vertebrates. Human PARP12 exhibited no antiviral activity against wild type and CpG-enriched HIV-1, but ZAP proteins from several tetrapods had antiviral activity when expressed in human cells. In some cases, ZAP antiviral activity required a TRIM25 protein from the same or related species, suggesting functional co-evolution of these genes. Indeed, a hypervariable sequence in the N-terminal domain of ZAP contributed to species-specific TRIM25 dependence in antiviral activity assays. Crosslinking immunoprecipitation coupled with RNA sequencing revealed that ZAP proteins from human, mouse, bat and alligator exhibit a high degree of CpG-specificity, while some avian ZAP proteins appear more promiscuous. Together, these data suggest that the CpG-rich RNA-directed antiviral activity of ZAP-related proteins arose in tetrapods, subsequent to the onset of CpG suppression in certain eukaryote lineages, with subsequent species-specific adaptation of cofactor requirements and RNA target specificity.
To control viral infections, cells have evolved a variety of mechanisms that detect, modify and sometimes eliminate viral components. One such mechanism is the Zinc Finger Antiviral Protein (ZAP), which binds RNA sequences that are rich in elements composed of a cytosine followed by a guanine. Selection of viral RNA can only be achieved because such elements are sparse in RNAs encoded by human genes. Here, we traced the molecular evolution of ZAP. We found that ZAP and a closely related gene, PARP12, originated from the same ancestral gene that existed in a predecessor of vertebrates and invertebrates. We found that ZAP proteins from mammals, birds and reptiles have antiviral activity but only in the presence of a co-factor, TRIM25, from the same species. ZAP proteins from birds were particularly interesting since they demonstrated a broader antiviral activity, primarily driven by a relaxed requirement for cytosine-guanine. Our findings suggest that viruses that infect birds, which are important vectors for human diseases, are under differential selective pressures and that this property may influence the outcome of interspecies transmission.
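CpG suppression of the kind discussed here is commonly quantified with an observed/expected CpG ratio. The sketch below uses the widely cited formulation (number of CG dinucleotides times sequence length, divided by the product of the C and G counts). It is an illustration under that assumption; the abstract does not state which measure the authors used, and the function name and toy sequences are hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Accepts DNA or RNA; U is treated as T. Returns 0.0 when the
    sequence has no C or no G, since the ratio is undefined there.
    """
    s = seq.upper().replace("U", "T")
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return s.count("CG") * n / (c_count * g_count)


# Toy sequences: a ratio well below 1 indicates CpG depletion.
print(cpg_obs_exp("CCGG"))      # CG occurs once among 2 C and 2 G in 4 bases
print(cpg_obs_exp("CACAGTGT"))  # contains C and G but no CG dinucleotide
```

Ratios near 1 mean CG occurs about as often as base composition predicts; values far below 1 correspond to the CpG-poor state described for mammalian transcriptomes.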
Affiliation(s)
- Daniel Gonçalves-Carneiro
- Laboratory of Retrovirology, The Rockefeller University, New York City, New York, United States of America
- Matthew A. Takata
- Laboratory of Retrovirology, The Rockefeller University, New York City, New York, United States of America
- Heley Ong
- Laboratory of Retrovirology, The Rockefeller University, New York City, New York, United States of America
- Amanda Shilton
- Laboratory of Retrovirology, The Rockefeller University, New York City, New York, United States of America
- Paul D. Bieniasz
- Laboratory of Retrovirology, The Rockefeller University, New York City, New York, United States of America
- Howard Hughes Medical Institute, The Rockefeller University, New York City, New York, United States of America
9
Correa M, Lerat E, Birmelé E, Samson F, Bouillon B, Normand K, Rizzon C. The Transposable Element Environment of Human Genes Differs According to Their Duplication Status and Essentiality. Genome Biol Evol 2021; 13:6273345. [PMID: 33973013 PMCID: PMC8155550 DOI: 10.1093/gbe/evab062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 03/17/2021] [Indexed: 12/13/2022] Open
Abstract
Transposable elements (TEs) are major components of eukaryotic genomes and represent approximately 45% of the human genome. TEs can be important sources of novelty in genomes and there is increasing evidence that TEs contribute to the evolution of gene regulation in mammals. Gene duplication is an evolutionary mechanism that also provides new genetic material and opportunities to acquire new functions. To investigate how duplicated genes are maintained in genomes, here, we explored the TE environment of duplicated and singleton genes. We found that singleton genes have more short-interspersed nuclear elements and DNA transposons in their vicinity than duplicated genes, whereas long-interspersed nuclear elements and long-terminal repeat retrotransposons have accumulated more near duplicated genes. We also discovered that this result is highly associated with the degree of essentiality of the genes with an unexpected accumulation of short-interspersed nuclear elements and DNA transposons around the more-essential genes. Our results underline the importance of taking into account the TE environment of genes to better understand how duplicated genes are maintained in genomes.
Affiliation(s)
- Margot Correa
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Emmanuelle Lerat
- Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Etienne Birmelé
- Laboratoire MAP5 UMR 8145, Université de Paris, Paris, France
- Franck Samson
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Bérengère Bouillon
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Kévin Normand
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
10
Schaller D, Geiß M, Stadler PF, Hellmuth M. Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs. J Math Biol 2021; 82:20. [PMID: 33606106 PMCID: PMC7894253 DOI: 10.1007/s00285-021-01564-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/04/2020] [Revised: 09/23/2020] [Accepted: 12/21/2020] [Indexed: 02/06/2023]
Abstract
Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least [Formula: see text] of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees.
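The reciprocal best match criterion mentioned at the start of this abstract can be sketched in a few lines. The example below approximates best matches with pairwise similarity scores, which is how they are usually estimated in practice; the paper itself defines best matches in terms of evolutionary relatedness, so the score-based proxy, the function names, and the toy data are all assumptions made for illustration. It does not implement the authors' algorithm for detecting false-positive assignments.

```python
from collections import defaultdict


def best_matches(scores, species_of):
    """Map (gene, target_species) -> set of top-scoring genes in that species.

    `scores[(x, y)]` is the similarity of y as seen from x (directed);
    ties are kept, so a gene may have several best matches.
    """
    best_score = {}
    best = defaultdict(set)
    for (x, y), score in scores.items():
        if species_of[x] == species_of[y]:
            continue  # best matches are only defined between species
        key = (x, species_of[y])
        if key not in best_score or score > best_score[key]:
            best_score[key] = score
            best[key] = {y}
        elif score == best_score[key]:
            best[key].add(y)
    return best


def reciprocal_best_matches(scores, species_of):
    """Unordered pairs {x, y} where each gene is a best match of the other."""
    best = best_matches(scores, species_of)
    pairs = set()
    for (x, _target_species), candidates in best.items():
        for y in candidates:
            if x in best.get((y, species_of[x]), set()):
                pairs.add(frozenset((x, y)))
    return pairs


# Toy data: a1 and a2 are paralogs in species A; b1 is the only gene in species B.
species_of = {"a1": "A", "a2": "A", "b1": "B"}
scores = {
    ("a1", "b1"): 0.9, ("b1", "a1"): 0.9,
    ("a2", "b1"): 0.6, ("b1", "a2"): 0.6,
}
print(reciprocal_best_matches(scores, species_of))
```

In the toy data, b1 is the best match of both a1 and a2, but only a1 is the best match of b1, so the single reciprocal best match is {a1, b1}.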
Affiliation(s)
- David Schaller
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107, Leipzig, Germany
- Manuela Geiß
- Software Competence Center Hagenberg GmbH, Softwarepark 21, A-4232, Hagenberg, Austria
- Peter F Stadler
- Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, D-04103, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, D-04107, Leipzig, Germany; Inst. f. Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090, Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia; Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA
- Marc Hellmuth
- Department of Mathematics, Faculty of Science, Stockholm University, SE 106 91, Stockholm, Sweden.
11
Margres MJ, Rautsaw RM, Strickland JL, Mason AJ, Schramer TD, Hofmann EP, Stiers E, Ellsworth SA, Nystrom GS, Hogan MP, Bartlett DA, Colston TJ, Gilbert DM, Rokyta DR, Parkinson CL. The Tiger Rattlesnake genome reveals a complex genotype underlying a simple venom phenotype. Proc Natl Acad Sci U S A 2021; 118:e2014634118. [PMID: 33468678 PMCID: PMC7848695 DOI: 10.1073/pnas.2014634118] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 12/13/2022] Open
Abstract
Variation in gene regulation is ubiquitous, yet identifying the mechanisms producing such variation, especially for complex traits, is challenging. Snake venoms provide a model system for studying the phenotypic impacts of regulatory variation in complex traits because of their genetic tractability. Here, we sequence the genome of the Tiger Rattlesnake, which possesses the simplest and most toxic venom of any rattlesnake species, to determine whether the simple venom phenotype is the result of a simple genotype through gene loss or a complex genotype mediated through regulatory mechanisms. We generate the most contiguous snake-genome assembly to date and use this genome to show that gene loss, chromatin accessibility, and methylation levels all contribute to the production of the simplest, most toxic rattlesnake venom. We provide the most complete characterization of the venom gene-regulatory network to date and identify key mechanisms mediating phenotypic variation across a polygenic regulatory network.
Affiliation(s)
- Mark J Margres
- Department of Biological Sciences, Clemson University, Clemson, SC 29634;
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Department of Integrative Biology, University of South Florida, Tampa, FL 33620
| | - Rhett M Rautsaw
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
| | - Jason L Strickland
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
- Department of Biology, University of South Alabama, Mobile, AL 36688
| | - Andrew J Mason
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
| | - Tristan D Schramer
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
| | - Erich P Hofmann
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
| | - Erin Stiers
- Department of Biological Sciences, Clemson University, Clemson, SC 29634
| | - Schyler A Ellsworth
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Gunnar S Nystrom
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Michael P Hogan
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Daniel A Bartlett
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Timothy J Colston
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Darin R Rokyta
- Department of Biological Science, Florida State University, Tallahassee, FL 32306
| | - Christopher L Parkinson
- Department of Biological Sciences, Clemson University, Clemson, SC 29634;
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC 29634
|
12
|
Prawer YDJ, Stroehlein AJ, Young ND, Kapoor S, Hall RS, Ghazali R, Batterham P, Gasser RB, Perry T, Anstead CA. Major SCP/TAPS protein expansion in Lucilia cuprina is associated with novel tandem array organisation and domain architecture. Parasit Vectors 2020; 13:598. [PMID: 33246493 PMCID: PMC7694928 DOI: 10.1186/s13071-020-04476-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2020] [Accepted: 11/05/2020] [Indexed: 11/20/2022] Open
Abstract
Background: Larvae of the Australian sheep blowfly, Lucilia cuprina, parasitise sheep by feeding on skin excretions, dermal tissue and blood, causing severe damage known as flystrike or myiasis. Recent advances in -omic technologies and bioinformatic data analyses have led to a greater understanding of blowfly biology and should allow the identification of protein families involved in host-parasite interactions and disease. Current literature suggests that proteins of the SCP (Sperm-Coating Protein)/TAPS (Tpx-1/Ag5/PR-1/Sc7) (SCP/TAPS) superfamily play key roles in immune modulation, cross-talk between parasite and host as well as developmental and reproductive processes in parasites. Methods: Here, we employed a bioinformatics workflow to curate the SCP/TAPS protein gene family in L. cuprina. Protein sequence, the presence and number of conserved CAP-domains and phylogeny were used to group identified SCP/TAPS proteins; these were compared to those found in Drosophila melanogaster to make functional predictions. In addition, transcription levels of SCP/TAPS protein-encoding genes were explored in different developmental stages. Results: A total of 27 genes were identified as belonging to the SCP/TAPS gene family, encoding 26 single-domain proteins each with a single CAP domain and a solitary double-domain protein containing two conserved cysteine-rich secretory protein/antigen 5/pathogenesis related-1 (CAP) domains. Surprisingly, 16 SCP/TAPS predicted proteins formed an extended tandem array spanning a 53 kb region of one genomic region, which was confirmed by MinION long-read sequencing. RNA-seq data indicated that these 16 genes are highly transcribed in all developmental stages (excluding the embryo). Conclusions: Future work should assess the potential of selected SCP/TAPS proteins as novel targets for the control of L. cuprina and related parasitic flies of major socioeconomic importance.
Affiliation(s)
- Yair D J Prawer
- Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, 3010, Australia.
| | - Andreas J Stroehlein
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Shilpa Kapoor
- Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Ross S Hall
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Razi Ghazali
- Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Phillip Batterham
- Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Trent Perry
- Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC, 3010, Australia
| | - Clare A Anstead
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, VIC, 3010, Australia.
|
13
|
Genome-Wide Analysis of Chemosensory Protein Genes (CSPs) Family in Fig Wasps (Hymenoptera, Chalcidoidea). Genes (Basel) 2020; 11:genes11101149. [PMID: 33003564 PMCID: PMC7599541 DOI: 10.3390/genes11101149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2020] [Revised: 09/23/2020] [Accepted: 09/25/2020] [Indexed: 11/30/2022] Open
Abstract
Chemosensory proteins (CSPs) are a class of acidic soluble proteins with various functions in chemoreception, resistance and immunity, but very little is known about this gene family in fig wasps, a peculiar group of insects (Hymenoptera, Chalcidoidea) that shelter in the fig syconia of Ficus trees. Here, we made the first comprehensive analysis of the CSP gene family in 11 fig wasps at the whole-genome level. We manually annotated 104 CSP genes in the genomes of the 11 fig wasps and comprehensively analyzed their gene characteristics, conserved cysteine patterns, motif orders, phylogeny, genome distribution, gene tandem duplication, and expansion and contraction patterns of the gene family. We also approximately predicted gene expression by codon adaptation index analysis. Our study shows that the CSP gene family is conserved in the 11 fig wasps; that CSP gene numbers in pollinating fig wasps are lower than in non-pollinating fig wasps, which may be due to their longer history of adaptation to fig syconia; and that the expansion of CSP genes in two non-pollinating fig wasps, Philotrypesis tridentata and Sycophaga agraensis, may be a species-specific phenomenon. These results provide useful information for understanding the evolution of the CSP gene family of insects in diverse living environments.
|
14
|
Lallemand T, Leduc M, Landès C, Rizzon C, Lerat E. An Overview of Duplicated Gene Detection Methods: Why the Duplication Mechanism Has to Be Accounted for in Their Choice. Genes (Basel) 2020; 11:E1046. [PMID: 32899740 PMCID: PMC7565063 DOI: 10.3390/genes11091046] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 07/30/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022] Open
Abstract
Gene duplication is an important evolutionary mechanism that provides new genetic material and thus opportunities for an organism to acquire new gene functions, with major implications such as speciation events. Various processes are known to allow a gene to be duplicated, and different models explain how duplicated genes can be maintained in genomes. Because of their particular importance, identifying duplicated genes is essential when studying genome evolution, but it can still be a challenge given the various fates duplicated genes can encounter. In this review, we first describe the evolutionary processes that give rise to duplicated genes and then describe the various bioinformatic approaches that can be used to identify them in genome sequences. Indeed, these bioinformatic approaches differ according to the underlying duplication mechanism. Hence, understanding the specificity of the duplicated genes of interest is a great asset for tool selection and should be taken into account when exploring a biological question.
Affiliation(s)
- Tanguy Lallemand
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
| | - Martin Leduc
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
| | - Claudine Landès
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
| | - Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), Université d’Evry Val d’Essonne, Université Paris-Saclay, UMR CNRS 8071, ENSIIE, USC INRAE, 23 bvd de France, CEDEX, 91037 Evry Paris, France
| | - Emmanuelle Lerat
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France
|
15
|
Lu L, Loker ES, Zhang SM, Buddenborg SK, Bu L. Genome-wide discovery, and computational and transcriptional characterization of an AIG gene family in the freshwater snail Biomphalaria glabrata, a vector for Schistosoma mansoni. BMC Genomics 2020; 21:190. [PMID: 32122294 PMCID: PMC7053062 DOI: 10.1186/s12864-020-6534-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/14/2019] [Accepted: 01/23/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: The AIG (avrRpt2-induced gene) family of GTPases, characterized by the presence of a distinctive AIG1 domain, is mysterious in having a peculiar phylogenetic distribution, a predilection for undergoing expansion and loss, and an uncertain functional role, especially in invertebrates. AIGs are frequently represented as GIMAPs (GTPase of the immunity associated protein family), characterized by presence of the AIG1 domain along with coiled-coil domains. Here we provide an overview of the remarkably expanded AIG repertoire of the freshwater gastropod Biomphalaria glabrata, compare it with AIGs in other organisms, and detail patterns of expression in B. glabrata susceptible or resistant to infection with Schistosoma mansoni, responsible for the neglected tropical disease of intestinal schistosomiasis. RESULTS: We define the 7 conserved motifs that comprise the AIG1 domain in B. glabrata and detail its association with at least 7 other domains, indicative of functional versatility of B. glabrata AIGs. AIG genes were usually found in tandem arrays in the B. glabrata genome, suggestive of an origin by segmental gene duplication. We found 91 genes with complete AIG1 domains, including 64 GIMAPs and 27 AIG genes without coiled-coils, more than known for any other organism except Danio (with > 100). We defined expression patterns of AIG genes in 12 different B. glabrata organs and characterized whole-body AIG responses to microbial PAMPs, and of schistosome-resistant or -susceptible strains of B. glabrata to S. mansoni exposure. Biomphalaria glabrata AIG genes clustered with expansions of AIG genes from other heterobranch gastropods yet showed unique lineage-specific subclusters. Other gastropods and bivalves had separate but also diverse expansions of AIG genes, whereas cephalopods seem to lack AIG genes. CONCLUSIONS: The AIG genes of B. glabrata exhibit expansion in both numbers and potential functions, differ markedly in expression between strains varying in susceptibility to schistosomes, and are responsive to immune challenge. These features provide strong impetus to further explore the functional role of AIG genes in the defense responses of B. glabrata, including to suppress or support the development of medically relevant S. mansoni parasites.
Affiliation(s)
- Lijun Lu
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131 USA
| | - Eric S. Loker
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131 USA
| | - Si-Ming Zhang
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131 USA
| | - Sarah K. Buddenborg
- Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA UK
| | - Lijing Bu
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131 USA
|
16
|
Xing Y, Liu Y, Zhang Q, Nie X, Sun Y, Zhang Z, Li H, Fang K, Wang G, Huang H, Bisseling T, Cao Q, Qin L. Hybrid de novo genome assembly of Chinese chestnut (Castanea mollissima). Gigascience 2019; 8:giz112. [PMID: 31513707 PMCID: PMC6741814 DOI: 10.1093/gigascience/giz112] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 11/12/2018] [Revised: 04/01/2019] [Accepted: 08/19/2019] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND: The Chinese chestnut (Castanea mollissima) is widely cultivated in China for nut production. This plant also plays an important ecological role in afforestation and ecosystem services. To facilitate and expand the use of C. mollissima for breeding and its genetic improvement, we report here the whole-genome sequence of C. mollissima. FINDINGS: We produced a high-quality assembly of the C. mollissima genome using Pacific Biosciences single-molecule sequencing. The final draft genome is ∼785.53 Mb long, with a contig N50 size of 944 kb, and we further annotated 36,479 protein-coding genes in the genome. Phylogenetic analysis showed that C. mollissima diverged from Quercus robur, a member of the Fagaceae family, ∼13.62 million years ago. CONCLUSIONS: The high-quality whole-genome assembly of C. mollissima will be a valuable resource for further genetic improvement and breeding for disease resistance and nut quality.
Affiliation(s)
- Yu Xing
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Yang Liu
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Qing Zhang
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Xinghua Nie
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Yamin Sun
- Research Center for Functional Genomics and Biochip, 23 Hongda St., Tianjin 300457, China
| | - Zhiyong Zhang
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Huchen Li
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- Laboratory of Molecular Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, Wageningen 6708 PB, The Netherlands
| | - Kefeng Fang
- College of Landscape Architecture, Beijing Collaborative Innovation Center for Eco-Environmental Improvement with Forestry and Fruit Trees, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Guangpeng Wang
- Changli Institute of Pomology, Hebei Academy of Agriculture and Forestry Sciences, 39 E Jieyangdajie, Changli 066600, China
| | - Hongwen Huang
- South China Botanical Garden, Chinese Academy of Sciences, 723 Xingke Rd., Guangzhou 510650, China
| | - Ton Bisseling
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- Laboratory of Molecular Biology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, Wageningen 6708 PB, The Netherlands
| | - Qingqin Cao
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
| | - Ling Qin
- Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
- College of Plant Science and Technology, Beijing Key Laboratory for Agricultural Application and New Technique, Beijing University of Agriculture, 7 Beinong Rd., Beijing 102206, China
|
17
|
Lucas JMEX, Roest Crollius H. High precision detection of conserved segments from synteny blocks. PLoS One 2017; 12:e0180198. [PMID: 28671949 PMCID: PMC5495381 DOI: 10.1371/journal.pone.0180198] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/29/2016] [Accepted: 06/12/2017] [Indexed: 11/19/2022] Open
Abstract
A conserved segment, i.e. a segment of chromosome unbroken during evolution, is an important operational concept in comparative genomics. Until now, algorithms designed to identify conserved segments have often returned synteny blocks that overlap, that include micro-rearrangements, or that are erroneously short. Here we present definitions of conserved segments and synteny blocks independent of any heuristic method and we describe four new post-processing strategies to refine synteny blocks into accurate conserved segments. The first strategy identifies micro-rearrangements, the second strategy identifies mono-genic conserved segments, the third returns non-overlapping segments and the fourth repairs incorrect ruptures of synteny. All these refinements are implemented in a new version of PhylDiag that has been benchmarked against i-ADHoRe 3.0 and Cyntenator, based on a realistic simulated evolution and true simulated conserved segments.
Affiliation(s)
- Joseph MEX Lucas
- IBENS, Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, Paris, France
| | - Hugues Roest Crollius
- IBENS, Département de Biologie, Ecole Normale Supérieure, CNRS, Inserm, PSL Research University, Paris, France
|
18
|
Efficient and rapid generation of large genomic variants in rats and mice using CRISMERE. Sci Rep 2017; 7:43331. [PMID: 28266534 PMCID: PMC5339700 DOI: 10.1038/srep43331] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 07/26/2016] [Accepted: 01/24/2017] [Indexed: 01/05/2023] Open
Abstract
Modelling Down syndrome (DS) in mouse has been crucial for the understanding of the disease and the evaluation of therapeutic targets. Nevertheless, the modelling so far has been limited to the mouse and, even in this model, generating duplications of genomic regions has been labour intensive and time consuming. We developed the CRISpr MEdiated REarrangement (CRISMERE) strategy, which takes advantage of the CRISPR/Cas9 system, to generate most of the desired rearrangements from a single experiment at much lower expense and in less than 9 months. Deletions, duplications, and inversions of genomic regions as large as 24.4 Mb in rat and mouse founders were observed, and germ line transmission was confirmed for fragments as large as 3.6 Mb. Interestingly, we have been able to recover duplicated regions from founders in which we only detected deletions. CRISMERE is even more powerful than anticipated: it allows the scientific community to manipulate the rodent and probably other genomes in a fast and efficient manner that was not possible before.
|
19
|
Zou XD, Hu XJ, Ma J, Li T, Ye ZQ, Wu YD. Genome-wide Analysis of WD40 Protein Family in Human. Sci Rep 2016; 6:39262. [PMID: 27991561 PMCID: PMC5172248 DOI: 10.1038/srep39262] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/15/2016] [Accepted: 11/22/2016] [Indexed: 01/16/2023] Open
Abstract
The WD40 proteins, often acting as scaffolds to form functional complexes in fundamental cellular processes, are one of the largest families encoded by eukaryotic genomes. Systematic genome-scale studies of this family are needed to understand their detailed functions, but are currently lacking in the animal lineage. Here we present a comprehensive in silico study of the human WD40 family. We have identified 262 non-redundant WD40 proteins and grouped them into 21 classes according to their domain architectures. Among them, 11 animal-specific domain architectures have been recognized. Sequence alignment indicates complicated duplication and recombination events in the evolution of this family. Through further phylogenetic analysis, we have revealed that the WD40 family underwent more expansion than the overall average in the early stage of evolution, and that the early-emerged WD40 proteins are prone to domain architectures with fundamental cellular roles and more interactions. While most widely and highly expressed human WD40 genes originated early, the tissue-specific ones often have a late origin. These results provide a landscape of the human WD40 family concerning their classification, evolution, and expression, serving as a valuable complement to the previous studies in the plant lineage.
Affiliation(s)
- Xu-Dong Zou
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China
| | - Xue-Jia Hu
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China
| | - Jing Ma
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China
| | - Tuan Li
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China
| | - Zhi-Qiang Ye
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China
| | - Yun-Dong Wu
- Lab of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen 518055, P. R. China; College of Chemistry, Peking University, Beijing, 100871, P. R. China
|
20
|
Blanco-Melo D, Venkatesh S, Bieniasz PD. Origins and Evolution of tetherin, an Orphan Antiviral Gene. Cell Host Microbe 2016; 20:189-201. [PMID: 27427209 DOI: 10.1016/j.chom.2016.06.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 02/24/2016] [Revised: 05/01/2016] [Accepted: 06/06/2016] [Indexed: 01/08/2023]
Abstract
Tetherin encodes an interferon-inducible antiviral protein that traps a broad spectrum of enveloped viruses at infected cell surfaces. Despite the absence of any clearly related gene or activity, we describe possible scenarios by which tetherin arose that exemplify how protein modularity, evolvability, and robustness can create and preserve new functions. We find that tetherin genes in various organisms exhibit no sequence similarity and share only a common architecture and location in modern genomes. Moreover, tetherin is part of a cluster of three potential sister genes encoding proteins of similar architecture, some variants of which exhibit antiviral activity while others can be endowed with antiviral activity by a simple modification. Only in slowly evolving species (e.g., coelacanths) does tetherin exhibit sequence similarity to one potential sister gene. Neofunctionalization, drift, and genetic conflict appear to have driven a near complete loss of sequence similarity among modern tetherin genes and their sister genes.
Affiliation(s)
- Daniel Blanco-Melo
- Howard Hughes Medical Institute, Laboratory of Retrovirology, Aaron Diamond AIDS Research Center, The Rockefeller University, 455 First Avenue, New York, NY 10016, USA
| | - Siddarth Venkatesh
- Howard Hughes Medical Institute, Laboratory of Retrovirology, Aaron Diamond AIDS Research Center, The Rockefeller University, 455 First Avenue, New York, NY 10016, USA; Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO 63108, USA
| | - Paul D Bieniasz
- Howard Hughes Medical Institute, Laboratory of Retrovirology, Aaron Diamond AIDS Research Center, The Rockefeller University, 455 First Avenue, New York, NY 10016, USA.
|
21
|
Li J, Shou J, Guo Y, Tang Y, Wu Y, Jia Z, Zhai Y, Chen Z, Xu Q, Wu Q. Efficient inversions and duplications of mammalian regulatory DNA elements and gene clusters by CRISPR/Cas9. J Mol Cell Biol 2015; 7:284-98. [PMID: 25757625 PMCID: PMC4524425 DOI: 10.1093/jmcb/mjv016] [Citation(s) in RCA: 93] [Impact Index Per Article: 10.3] [Received: 02/12/2015] [Accepted: 03/02/2015] [Indexed: 12/26/2022] Open
Abstract
The human genome contains millions of DNA regulatory elements and a large number of gene clusters, most of which have not been tested experimentally. The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9) system, programmed with a synthetic single-guide RNA (sgRNA), has emerged as a method for genome editing in virtually any organism. Here we report that targeted DNA fragment inversions and duplications could easily be achieved in human and mouse genomes by CRISPR with two sgRNAs. Specifically, we found that, in cultured human cells and mice, efficient precise inversions of DNA fragments ranging in size from a few tens of bp to hundreds of kb could be generated. In addition, DNA fragment duplications and deletions could also be generated by CRISPR through trans-allelic recombination between the Cas9-induced double-strand breaks (DSBs) on two homologous chromosomes (chromatids). Moreover, junctions of combinatorial inversions and duplications of the protocadherin (Pcdh) gene clusters induced by Cas9 with four sgRNAs could be detected. In mice, we obtained founders with alleles of precise inversions, duplications, and deletions of DNA fragments of variable sizes by CRISPR. Interestingly, we found that very efficient inversions were mediated by microhomology-mediated end joining (MMEJ) through short inverted repeats. We showed for the first time that DNA fragment inversions could be transmitted through germlines in mice. Finally, we applied this CRISPR method to a regulatory element of the Pcdhα cluster and found a new role in the regulation of members of the Pcdhγ cluster. This simple and efficient method should be useful in manipulating mammalian genomes to study millions of regulatory DNA elements as well as vast numbers of gene clusters.
Affiliation(s)
- Jinhuan Li
- Key Laboratory of Systems Biomedicine (Ministry of Education), Center for Comparative Biomedicine, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Center, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China Collaborative Innovation Center of Systems Biomedicine, Shanghai Jiao Tong University School of Medicine, Shanghai 200240, China
| | - Jia Shou
- Key Laboratory of Systems Biomedicine (Ministry of Education), Center for Comparative Biomedicine, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Center, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China Collaborative Innovation Center of Systems Biomedicine, Shanghai Jiao Tong University School of Medicine, Shanghai 200240, China
| | - Ya Guo
- Key Laboratory of Systems Biomedicine (Ministry of Education), Center for Comparative Biomedicine, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Center, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China Collaborative Innovation Center of Systems Biomedicine, Shanghai Jiao Tong University School of Medicine, Shanghai 200240, China
- Yuanxiao Tang, Yonghu Wu, Zhilian Jia, Yanan Zhai, Zhifeng Chen, Quan Xu, Qiang Wu
- Key Laboratory of Systems Biomedicine (Ministry of Education), Center for Comparative Biomedicine, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China; Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Bio-X Center, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; Collaborative Innovation Center of Systems Biomedicine, Shanghai Jiao Tong University School of Medicine, Shanghai 200240, China
|
22
|
Expansion of stochastic expression repertoire by tandem duplication in mouse Protocadherin-α cluster. Sci Rep 2014; 4:6263. [PMID: 25179445 PMCID: PMC4151104 DOI: 10.1038/srep06263] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 06/05/2014] [Accepted: 08/13/2014] [Indexed: 11/08/2022] Open
Abstract
Tandem duplications are concentrated within the Pcdh cluster throughout vertebrate evolution and as copy number variations (CNVs) in human populations, but the effects of tandem duplication in the Pcdh cluster remain elusive. To investigate these effects, we generated and analyzed a new line of Pcdh cluster mutant mice. In the mutant allele, a 218-kb region containing the Pcdh-α2 to Pcdh-αc2 variable exons with their promoters was duplicated, and the individual duplicated Pcdh isoforms can be distinguished. The individual duplicated Pcdh-α isoforms showed diverse expression levels in a stochastic manner, even though the duplicated copies have identical promoter sequences. Interestingly, the 5'-located duplicated Pcdh-αc2, which is constitutively expressed in the wild-type brain, shifted to stochastic expression accompanied by increased DNA methylation. These results demonstrate that tandem duplication in the Pcdh cluster expands the stochastic expression repertoire irrespective of sequence divergence.
|
23
|
Sharma A, Wolfgruber TK, Presting GG. Tandem repeats derived from centromeric retrotransposons. BMC Genomics 2013; 14:142. [PMID: 23452340 PMCID: PMC3648361 DOI: 10.1186/1471-2164-14-142] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 11/22/2012] [Accepted: 02/23/2013] [Indexed: 12/26/2022] Open
Abstract
Background: Tandem repeats are ubiquitous and abundant in higher eukaryotic genomes and constitute, along with transposable elements, much of the DNA underlying centromeres and other heterochromatic domains. In maize, centromeric satellite repeat (CentC) and centromeric retrotransposons (CR), a class of Ty3/gypsy retrotransposons, are enriched at centromeres. Some satellite repeats have homology to retrotransposons and several mechanisms have been proposed to explain the expansion, contraction as well as homogenization of tandem repeats. However, the origin and evolution of tandem repeat loci remain largely unknown.
Results: CRM1TR and CRM4TR are novel tandem repeats that we show to be entirely derived from CR elements belonging to two different subfamilies, CRM1 and CRM4. Although these tandem repeats clearly originated in at least two separate events, they are derived from similar regions of their respective parent element, namely the long terminal repeat (LTR) and untranslated region (UTR). The 5′ ends of the monomer repeat units of CRM1TR and CRM4TR map to different locations within their respective LTRs, while their 3′ ends map to the same relative position within a conserved region of their UTRs. Based on the insertion times of heterologous retrotransposons that have inserted into these tandem repeats, amplification of the repeats is estimated to have begun at least ~4 (CRM1TR) and ~1 (CRM4TR) million years ago. Distinct CRM1TR sequence variants occupy the two CRM1TR loci, indicating that there is little or no movement of repeats between loci, even though they are separated by only ~1.4 Mb.
Conclusions: The discovery of two novel retrotransposon-derived tandem repeats supports the conclusions from earlier studies that retrotransposons can give rise to tandem repeats in eukaryotic genomes. Analysis of monomers from two different CRM1TR loci shows that gene conversion is the major cause of sequence variation. We propose that successive intrastrand deletions generated the initial repeat structure, and gene conversions increased the size of each tandem repeat locus.
|
24
|
Boldogköi Z. Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci. Front Genet 2012; 3:122. [PMID: 22783276 PMCID: PMC3389743 DOI: 10.3389/fgene.2012.00122] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 05/11/2012] [Accepted: 06/15/2012] [Indexed: 11/25/2022] Open
Abstract
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.
Affiliation(s)
- Zsolt Boldogköi
- Department of Medical Biology, Faculty of Medicine, University of Szeged, Szeged, Hungary
|
25
|
Lu J, Peatman E, Tang H, Lewis J, Liu Z. Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications. BMC Genomics 2012; 13:246. [PMID: 22702965 PMCID: PMC3464592 DOI: 10.1186/1471-2164-13-246] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 01/19/2012] [Accepted: 06/15/2012] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes.
RESULTS: Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish.
CONCLUSIONS: We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting their origin of independent and continuous duplication. This is particularly true for the zebrafish genome. Further analysis of the duplicated gene sets indicated that a significant portion of duplicated genes in the zebrafish genome were of recent, lineage-specific duplication events. Most strikingly, a subset of duplicated genes is enriched among the recently duplicated genes involved in immune or sensory response pathways. Such findings demonstrated the significance of continuous gene duplication as well as that of whole genome duplication in the course of genome evolution.
Affiliation(s)
- Jianguo Lu
- Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Auburn University, Auburn, AL 36849, USA
|
26
|
Walker MB, King BL, Paigen K. Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species. PLoS One 2012; 7:e35274. [PMID: 22563380 PMCID: PMC3338513 DOI: 10.1371/journal.pone.0035274] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/04/2011] [Accepted: 03/14/2012] [Indexed: 11/22/2022] Open
Abstract
Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent [1], [2], [3]. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Affiliation(s)
- Benjamin L. King
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Mount Desert Island Biological Laboratory, Salisbury Cove, Maine, United States of America
- Kenneth Paigen
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
|
27
|
Ezawa K, Ikeo K, Gojobori T, Saitou N. Evolutionary patterns of recently emerged animal duplogs. Genome Biol Evol 2011; 3:1119-35. [PMID: 21859807 PMCID: PMC3194840 DOI: 10.1093/gbe/evr074] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022] Open
Abstract
Duplogs, or intraspecies paralogs, constitute an important portion of eukaryote genomes and serve as a major source of functional innovation. We conducted detailed analyses of recently emerged animal duplogs. Genome data of three vertebrate species (Homo sapiens, Mus musculus, and Danio rerio), Caenorhabditis elegans, and two Drosophila species (Drosophila melanogaster and D. pseudoobscura) were used. Duplication events were divided into six age-groups according to the synonymous distance (dS) up to 0.6. Duplogs were classified into four equal-sized classes by physical distance and into three classes by relative orientation. We observed the following shared characteristics among intrachromosomal multiexon duplogs: 1) inverted duplogs account for 20-50% of duplogs, and about half of the physically most distant 25%; 2) except for C. elegans, the composition of physical distances, that of relative orientations, and the proportion of inverted duplogs in each physical distance category are more or less uniform; 3) except for C. elegans, the characteristics of the youngest (dS < 0.01) duplogs are similar to the overall characteristics of the entire set. These results suggest that intrachromosomal duplogs with fairly long physical distances were generated at once, rather than resulting from tandem duplications and subsequent genomic rearrangements. This is different from the three well-known modes of gene duplication: tandem duplication, retrotransposition, and genome duplication. We termed this new mode "drift" duplication. Drift duplication has been producing duplicate copies at a pace comparable with tandem duplication since the common ancestor of vertebrates, and it may have already operated in the common ancestor of bilateral animals.
Affiliation(s)
- Kiyoshi Ezawa
- Division of Population Genetics, National Institute of Genetics, Mishima, Japan
|
28
|
Despons L, Baret PV, Frangeul L, Louis VL, Durrens P, Souciet JL. Genome-wide computational prediction of tandem gene arrays: application in yeasts. BMC Genomics 2010; 11:56. [PMID: 20092627 PMCID: PMC2822764 DOI: 10.1186/1471-2164-11-56] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 07/29/2009] [Accepted: 01/21/2010] [Indexed: 11/10/2022] Open
Abstract
Background: This paper describes an efficient in silico method for detecting tandem gene arrays (TGAs) in fully sequenced and compact genomes such as those of prokaryotes or unicellular eukaryotes. The originality of this method lies in the search of protein sequence similarities in the vicinity of each coding sequence, which allows the prediction of tandem duplicated gene copies independently of their functionality.
Results: Applied to nine hemiascomycete yeast genomes, this method predicts that 2% of the genes are involved in TGAs and gene relics are present in 11% of TGAs. The frequency of TGAs with degenerated gene copies means that a significant fraction of tandem duplicated genes follows the birth-and-death model of evolution. A comparison of sequence identity distributions between sets of homologous gene pairs shows that the different copies of tandem arrayed paralogs are less divergent than copies of dispersed paralogs in yeast genomes. It suggests that paralogs included in tandem structures are more recent or more subject to the gene conversion mechanism than other paralogs.
Conclusion: The method reported here is a useful computational tool to provide a database of TGAs composed of functional or nonfunctional gene copies. Such a database has obvious applications in the fields of structural and comparative genomics. Notably, a detailed study of the TGA catalog will make it possible to tackle the fundamental questions of the origin and evolution of tandem gene clusters.
|
29
|
Shi G, Zhang L, Jiang T. MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement. BMC Bioinformatics 2010; 11:10. [PMID: 20053291 PMCID: PMC2821317 DOI: 10.1186/1471-2105-11-10] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 09/02/2009] [Accepted: 01/06/2010] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND: Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning one-to-one orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model).
RESULTS: In this paper, we develop MSOAR 2.0, an improved system for one-to-one ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene will be deleted from the concerned genome (because they cannot possibly appear in any one-to-one ortholog pairs), and MSOAR is invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 is able to achieve a better sensitivity and specificity than MSOAR. In comparison with the well-known genome-scale ortholog assignment tool InParanoid, Ensembl ortholog database, and the orthology information extracted from the well-known whole-genome multiple alignment program MultiZ, MSOAR 2.0 shows the highest sensitivity. Although the specificity of MSOAR 2.0 is slightly worse than that of InParanoid in the real data experiments, it is actually better than that of InParanoid in the simulation tests.
CONCLUSIONS: Our preliminary experimental results demonstrate that MSOAR 2.0 is a highly accurate tool for one-to-one ortholog assignment between closely related genomes. The software is available to the public for free and included as online supplementary material.
Affiliation(s)
- Guanqun Shi
- Department of Computer Science, University of California, Riverside, CA 92521, USA
- Liqing Zhang
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24060, USA
- Tao Jiang
- Department of Computer Science, University of California, Riverside, CA 92521, USA
|
30
|
Luo Z, van Vuuren HJJ. Functional analyses of PAU genes in Saccharomyces cerevisiae. Microbiology (Reading) 2009; 155:4036-4049. [DOI: 10.1099/mic.0.030726-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
PAU genes constitute the largest gene family in Saccharomyces cerevisiae, with 24 members mostly located in the subtelomeric regions of chromosomes. Little information is available about PAU genes, other than expression data for some members. In this study, we systematically compared the sequences of all 24 members, examined the expression of PAU3, PAU5, DAN2, PAU17 and PAU20 in response to stresses, and investigated the stability of all Pau proteins. The chromosomal localization, synteny and sequence analyses revealed that PAU genes could have been amplified by segmental and retroposition duplication through mechanisms of chromosomal end translocation and Ty-associated recombination. The coding sequences diverged through nucleotide substitution and insertion/deletion of one to four codons, thus causing changes in amino acids, truncation or extension of Pau proteins. Pairwise comparison of non-coding regions revealed little homology in flanking sequences of some members. All 24 PAU promoters contain a TATA box, and 22 PAU promoters contain at least one copy of the anaerobic response element and the aerobic repression motif. Differential expression was observed among PAU3, PAU5, PAU17, PAU20 and DAN2 in response to stress, with PAU5 having the highest capacity to be induced by anaerobic conditions, low temperature and wine fermentations. Furthermore, Pau proteins with 124 aa were less stable than those with 120 or 122 aa. Our results indicate that duplicated PAU genes have been evolving, and the individual Pau proteins might possess specific roles for the adaptation of S. cerevisiae to certain environmental stresses.
Affiliation(s)
- Zongli Luo
- Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Hennie J. J. van Vuuren
- Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
|
31
|
Darzacq X, Yao J, Larson DR, Causse SZ, Bosanac L, de Turris V, Ruda VM, Lionnet T, Zenklusen D, Guglielmi B, Tjian R, Singer RH. Imaging transcription in living cells. Annu Rev Biophys 2009; 38:173-96. [PMID: 19416065 DOI: 10.1146/annurev.biophys.050708.133728] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.8] [Indexed: 01/13/2023]
Abstract
The advent of new technologies for the imaging of living cells has made it possible to determine the properties of transcription, the kinetics of polymerase movement, the association of transcription factors, and the progression of the polymerase on the gene. We report here the current state of the field and the progress necessary to achieve a more complete understanding of the various steps in transcription. Our Consortium is dedicated to developing and implementing the technology to further this understanding.
Affiliation(s)
- Xavier Darzacq
- Janelia Farm Research Consortium on Imaging Transcription, Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia 20147, USA.
|