1
Fodor E, Okendo J, Szabó N, Szabó K, Czimer D, Tarján-Rácz A, Szeverényi I, Low BW, Liew JH, Koren S, Rhie A, Orbán L, Miklósi Á, Varga M, Burgess SM. The reference genome of Macropodus opercularis (the paradise fish). Sci Data 2024; 11:540. [PMID: 38796485 PMCID: PMC11127978 DOI: 10.1038/s41597-024-03277-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 04/18/2024] [Indexed: 05/28/2024] Open
Abstract
Amongst fishes, zebrafish (Danio rerio) has gained popularity as a model system over most other species, and while its value as a model is well documented, its usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as to study behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species, has a highly complex behavioral repertoire and has been the subject of many ethological investigations but lacks genomic resources. Here we report the reference genome assembly of M. opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of 483,077,705 base pairs (~483 Mb) on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein-coding genes and assigned ~90% of them to orthogroups.
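As a rough illustration of what the figures in this abstract imply, the sketch below multiplies the assembly length by the reported fold coverage to estimate the total read bases, and divides the length by the contig count to get a mean contig length. It assumes that the assembly length approximates the genome size used for the coverage figure, which the abstract does not state; the function names are hypothetical, and the mean contig length is not an N50.

```python
# Figures quoted in the abstract above.
ASSEMBLY_BP = 483_077_705   # total assembly length in base pairs
N_CONTIGS = 152             # number of contigs in the final assembly
COVERAGE = 150              # reported long-read depth (fold)


def implied_read_bases(assembly_bp: int, coverage: int) -> int:
    """Estimate sequenced bases as coverage x genome size.

    Assumption: the assembly length stands in for the true genome size.
    """
    return assembly_bp * coverage


def mean_contig_length(assembly_bp: int, n_contigs: int) -> float:
    """Arithmetic mean contig length (not an N50)."""
    return assembly_bp / n_contigs


read_bases = implied_read_bases(ASSEMBLY_BP, COVERAGE)
mean_len = mean_contig_length(ASSEMBLY_BP, N_CONTIGS)

print(f"Implied read bases: {read_bases / 1e9:.1f} Gb")
print(f"Mean contig length: {mean_len / 1e6:.2f} Mb")
```

Under these assumptions, 150-fold coverage corresponds to roughly 72 Gb of long-read data, and the mean contig is a little over 3 Mb.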
Affiliation(s)
- Erika Fodor
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Javan Okendo
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Nóra Szabó
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Kata Szabó
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Dávid Czimer
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Anita Tarján-Rácz
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Ildikó Szeverényi
  - Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
- Bi Wei Low
  - Science Unit, Lingnan University, Hong Kong, China
- Sergey Koren
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Arang Rhie
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- László Orbán
  - Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
- Ádám Miklósi
  - Department of Ethology, ELTE Eötvös Loránd University, Budapest, Hungary
- Máté Varga
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Shawn M Burgess
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
2
Fodor E, Okendo J, Szabó N, Szabó K, Czimer D, Tarján-Rácz A, Szeverényi I, Low BW, Liew JH, Koren S, Rhie A, Orbán L, Miklósi Á, Varga M, Burgess SM. The reference genome of the paradise fish (Macropodus opercularis). bioRxiv 2023:2023.08.10.552018. [PMID: 37609174 PMCID: PMC10441432 DOI: 10.1101/2023.08.10.552018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Over the decades, a small number of model species, each representative of a larger taxon, have dominated the field of biological research. Amongst fishes, zebrafish (Danio rerio) has gained popularity over most other species, and while its value as a model is well documented, its usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as to study behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species from Southeast Asia, has a highly complex behavioral repertoire and has been the subject of many ethological investigations, but lacks genomic resources. Here we report the reference genome assembly of Macropodus opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of ≈483 Mb on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein-coding genes and assigned ≈90% of them to orthogroups. Completeness analysis showed that 98.5% of the Actinopterygii core gene set (ODB10) was present as a complete ortholog in our reference genome, with a further 1.2% being present in a fragmented form. Additionally, we cloned multiple genes important during early development and, using newly developed in situ hybridization protocols, showed that they have conserved expression patterns.
Affiliation(s)
- Erika Fodor
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Javan Okendo
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Nóra Szabó
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Kata Szabó
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Dávid Czimer
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Anita Tarján-Rácz
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Ildikó Szeverényi
  - Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
- Bi Wei Low
  - Science Unit, Lingnan University, Hong Kong, China
- Sergey Koren
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Arang Rhie
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- László Orbán
  - Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
- Ádám Miklósi
  - Department of Ethology, ELTE Eötvös Loránd University, Budapest, Hungary
- Máté Varga
  - Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
- Shawn M. Burgess
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
3
Matoulek D, Ježek B, Vohnoutová M, Symonová R. Advances in Vertebrate (Cyto)Genomics Shed New Light on Fish Compositional Genome Evolution. Genes (Basel) 2023; 14:244. [PMID: 36833171 PMCID: PMC9956151 DOI: 10.3390/genes14020244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 01/05/2023] [Indexed: 01/19/2023] Open
Abstract
Cytogenetic and compositional studies considered fish genomes rather poor in guanine-cytosine content (GC%) because of a putative "sharp increase in genic GC% during the evolution of higher vertebrates". However, the available genomic data have not been exploited to confirm this viewpoint. In contrast, further misunderstandings in GC%, mostly of fish genomes, originated from a misapprehension of the current flood of data. Utilizing public databases, we calculated the GC% in animal genomes of three different, technically well-established fractions: DNA (entire genome), cDNA (complementary DNA), and cds (exons). Our results across chordates help set borders of GC% values that are still incorrect in literature and show: (i) fish in their immense diversity possess comparably GC-rich (or even GC-richer) genomes as higher vertebrates, and fish exons are GC-enriched among vertebrates; (ii) animal genomes generally show a GC-enrichment from the DNA, over cDNA, to the cds level (i.e., not only the higher vertebrates); (iii) fish and invertebrates show a broad(er) inter-quartile range in GC%, while avian and mammalian genomes are more constrained in their GC%. These results indicate no sharp increase in the GC% of genes during the transition to higher vertebrates, as stated and numerously repeated before. We present our results in 2D and 3D space to explore the compositional genome landscape and prepared an online platform to explore the AT/GC compositional genome evolution.
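The GC% measure that this abstract compares across the DNA, cDNA, and cds fractions can be sketched as the share of G and C among unambiguous bases. This is a minimal illustration, not the authors' pipeline: the function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption.

```python
def gc_percent(seq: str) -> float:
    """Return GC% of a nucleotide sequence.

    Assumption: ambiguous symbols (e.g. N) are excluded from the
    denominator, so only A, C, G, T and U are counted.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy sequences standing in for the three fractions (illustrative only).
fractions = {
    "DNA": "ATATGCGCATNNATAT",
    "cDNA": "ATGCGCGCAT",
    "cds": "ATGGCGCGCC",
}
for name, seq in fractions.items():
    print(f"{name}: {gc_percent(seq):.1f}% GC")
```

Applied to whole-genome, transcript, and coding sequences of the same species, a function like this would yield the three values whose ordering the abstract describes.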
Affiliation(s)
- Dominik Matoulek
  - Department of Physics, Faculty of Science, University of Hradec Králové, 500 03 Hradec Králové, Czech Republic
- Bruno Ježek
  - Faculty of Informatics and Management, University of Hradec Králové, Rokitanského 62, 500 02 Hradec Králové, Czech Republic
- Marta Vohnoutová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
- Radka Symonová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
  - Institute of Hydrobiology, Biology Centre of the Czech Academy of Sciences, 370 05 České Budějovice, Czech Republic
4
Jakt LM, Dubin A, Johansen SD. Intron size minimisation in teleosts. BMC Genomics 2022; 23:628. [PMID: 36050638 PMCID: PMC9438311 DOI: 10.1186/s12864-022-08760-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2021] [Accepted: 07/13/2022] [Indexed: 11/17/2022] Open
Abstract
Background: Spliceosomal introns are parts of primary transcripts that are removed by RNA splicing. Although introns apparently do not contribute to the function of the mature transcript, in vertebrates they comprise the majority of the transcribed region, increasing the metabolic cost of transcription. The persistence of long introns across evolutionary time suggests functional roles that can offset this metabolic cost. The teleosts comprise one of the largest vertebrate clades. They have unusually compact and variable genome sizes and provide a suitable system for analysing intron evolution. Results: We have analysed intron lengths in 172 vertebrate genomes and show that teleost intron lengths are relatively short, highly variable and bimodally distributed. Introns that were long in teleosts were also found to be long in mammals and were more likely to be found in regulatory genes and to contain conserved sequences. Our results argue that intron length has decreased in parallel in a non-random manner throughout teleost evolution and represents a deviation from the ancestral state. Conclusion: Our observations indicate an accelerated rate of intron size evolution in the teleosts and that teleost introns can be divided into two classes by their length. Teleost intron sizes have evolved primarily as a side-effect of genome size evolution, and small genomes are dominated by short introns (<256 base pairs). However, a non-random subset of introns has resisted this process across the teleosts, and these are more likely to have functional roles in all vertebrate clades. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08760-w.
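The two length classes described in this abstract can be illustrated with a small sketch that derives intron lengths from exon coordinates and reports the share below the 256 bp threshold quoted above. The coordinate convention (1-based, inclusive, one transcript, non-overlapping exons) and both function names are assumptions for illustration, not the authors' method.

```python
from typing import List, Tuple

SHORT_INTRON_BP = 256  # threshold quoted in the abstract (<256 base pairs)


def intron_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Lengths of the gaps between consecutive exons of one transcript.

    Assumption: exons are (start, end) pairs, 1-based and inclusive,
    and do not overlap.
    """
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def short_intron_fraction(lengths: List[int],
                          threshold: int = SHORT_INTRON_BP) -> float:
    """Fraction of introns strictly shorter than the threshold."""
    if not lengths:
        raise ValueError("no introns supplied")
    return sum(1 for n in lengths if n < threshold) / len(lengths)


# Toy transcript with three exons, hence two introns.
toy_exons = [(1, 100), (201, 300), (1000, 1100)]
toy_introns = intron_lengths(toy_exons)
print(toy_introns, short_intron_fraction(toy_introns))
```

Run per genome over all annotated transcripts, the resulting fractions would give a simple view of how strongly short introns dominate in compact genomes.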
Affiliation(s)
- Lars Martin Jakt
  - Faculty for bioscience and aquaculture, Nord University, Universitetsalléen 11, Bodoe, 8026, Norway
- Arseny Dubin
  - Faculty for bioscience and aquaculture, Nord University, Universitetsalléen 11, Bodoe, 8026, Norway
  - Currently at: Parental Investment and Immune Dynamics, GEOMAR Helmholtz Centre for Ocean Research, Düsternbrookerweg 20, Kiel, D-24105, Germany
- Steinar Daae Johansen
  - Faculty for bioscience and aquaculture, Nord University, Universitetsalléen 11, Bodoe, 8026, Norway
5
Haddad-Mashadrizeh A, Hemmat J, Aslamkhan M. Intronic regions of the human coagulation factor VIII gene harboring transcription factor binding sites with a strong bias towards the short-interspersed elements. Heliyon 2020; 6:e04727. [PMID: 32944665 PMCID: PMC7481535 DOI: 10.1016/j.heliyon.2020.e04727] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/16/2019] [Revised: 09/03/2019] [Accepted: 08/10/2020] [Indexed: 12/12/2022] Open
Abstract
Increasing data show that intron-derived regulatory elements, such as transcription factor binding sites (TFBs), play key roles in gene regulation and malfunction. Accordingly, characterizing the sequence context of the intronic regions of the human coagulation factor VIII (hFVIII) gene can be important. In this study, the intronic regions of the hFVIII gene were scrutinized using in silico methods. The results disclosed that these regions harbor a rich array of functional elements such as repetitive elements (REs), splicing sites, and TFBs. Among these elements, TFBs and REs showed a significant distribution and correlation with each other. This survey indicated that 31% of TFBs are localized in the intronic regions of the gene. Moreover, TFBs show a strong bias toward regions far from the splice sites of introns, mapping to different REs. Accordingly, TFBs showed a strong bias toward Short Interspersed Elements (SINEs), which in turn cover about 12% of the total REs. However, the distribution pattern of TFBs-REs showed different biases across the intronic regions, especially in introns 13 and 25. A rich array of SINE-TFBs and CR1-TFBs was situated within the 5′UTR of the gene, which may be an important driving force for regulatory innovation of the hFVIII gene. Taken together, these data may help reveal intronic regions with the capacity to renew the gene regulatory networks of the hFVIII gene. On the other hand, these correlations might provide a novel idea for a new hypothesis on the molecular evolution of the FVIII gene and the treatment of Hemophilia A, which should be considered in future studies.
Affiliation(s)
- Aliakbar Haddad-Mashadrizeh
  - Recombinant Proteins Research Group, Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
- Jafar Hemmat
  - Biotechnology Department, Iranian Research Organization for Science and Technology (IROST), Tehran, Iran
- Muhammad Aslamkhan
  - Human Genetics & Molecular Biology Dept., University of Health Sciences, Lahore, Pakistan
  - Honorary Senior Lecturer in the School of the Medicine University of Liverpool, Liverpool, UK
6
Kadobianskyi M, Schulze L, Schuelke M, Judkewitz B. Hybrid genome assembly and annotation of Danionella translucida. Sci Data 2019; 6:156. [PMID: 31451709 PMCID: PMC6710283 DOI: 10.1038/s41597-019-0161-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/23/2019] [Accepted: 07/26/2019] [Indexed: 11/09/2022] Open
Abstract
Studying neuronal circuits at cellular resolution is very challenging in vertebrates due to the size and optical turbidity of their brains. Danionella translucida, a close relative of zebrafish, was recently introduced as a model organism for investigating neural network interactions in adult individuals. Danionella remains transparent throughout its life, has the smallest known vertebrate brain and possesses a rich repertoire of complex behaviours. Here we sequenced, assembled and annotated the Danionella translucida genome employing a hybrid Illumina/Nanopore read library as well as RNA-seq of embryonic, larval and adult mRNA. We achieved high assembly continuity using low-coverage long-read data and annotated a large fraction of the transcriptome. This dataset will pave the way for molecular research and targeted genetic manipulation of this novel model organism.
Affiliation(s)
- Mykola Kadobianskyi
  - Einstein Center for Neurosciences, NeuroCure Cluster of Excellence, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Lisanne Schulze
  - Einstein Center for Neurosciences, NeuroCure Cluster of Excellence, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Markus Schuelke
  - Einstein Center for Neurosciences, NeuroCure Cluster of Excellence, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
- Benjamin Judkewitz
  - Einstein Center for Neurosciences, NeuroCure Cluster of Excellence, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
7
Malmstrøm M, Britz R, Matschiner M, Tørresen OK, Hadiaty RK, Yaakob N, Tan HH, Jakobsen KS, Salzburger W, Rüber L. The Most Developmentally Truncated Fishes Show Extensive Hox Gene Loss and Miniaturized Genomes. Genome Biol Evol 2018; 10:1088-1103. [PMID: 29684203 PMCID: PMC5906920 DOI: 10.1093/gbe/evy058] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Accepted: 03/13/2018] [Indexed: 12/20/2022] Open
Abstract
The world’s smallest fishes belong to the genus Paedocypris. These miniature fishes are endemic to an extreme habitat: the peat swamp forests in Southeast Asia, characterized by highly acidic blackwater. This threatened habitat is home to a large array of fishes, including a number of miniaturized but also developmentally truncated species. Especially the genus Paedocypris is characterized by profound, organism-wide developmental truncation, resulting in sexually mature individuals of <8 mm in length with a larval phenotype. Here, we report on evolutionary simplification in the genomes of two species of the dwarf minnow genus Paedocypris using whole-genome sequencing. The two species feature unprecedented Hox gene loss and genome reduction in association with their massive developmental truncation. We also show how other genes involved in the development of musculature, nervous system, and skeleton have been lost in Paedocypris, mirroring its highly progenetic phenotype. Further, our analyses suggest two mechanisms responsible for the genome streamlining in Paedocypris in relation to other Cypriniformes: severe intron shortening and reduced repeat content. As the first report on the genomic sequence of a vertebrate species with organism-wide developmental truncation, the results of our work enhance our understanding of genome evolution and how genotypes are translated to phenotypes. In addition, as a naturally simplified system closely related to zebrafish, Paedocypris provides novel insights into vertebrate development.
Affiliation(s)
- Martin Malmstrøm
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, Norway
  - Zoological Institute, University of Basel, Switzerland
- Ralf Britz
  - Department of Life Sciences, Natural History Museum, London, United Kingdom
- Michael Matschiner
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, Norway
  - Zoological Institute, University of Basel, Switzerland
- Ole K Tørresen
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, Norway
- Renny Kurnia Hadiaty
  - Ichthyology Laboratory, Division of Zoology, Research Center for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Norsham Yaakob
  - Forest Research Institute Malaysia (FRIM), Kepong, Selangor Darul Ehsan, Malaysia
- Heok Hui Tan
  - Lee Kong Chian Natural History Museum, National University of Singapore, Singapore
- Kjetill Sigurd Jakobsen
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, Norway
- Walter Salzburger
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis (CEES), University of Oslo, Norway
  - Zoological Institute, University of Basel, Switzerland
- Lukas Rüber
  - Naturhistorisches Museum Bern, Switzerland
  - Aquatic Ecology and Evolution, Institute of Ecology and Evolution, University of Bern, Switzerland
8
Xiong P, Hulsey CD, Meyer A, Franchini P. Evolutionary divergence of 3' UTRs in cichlid fishes. BMC Genomics 2018; 19:433. [PMID: 29866078 PMCID: PMC5987618 DOI: 10.1186/s12864-018-4821-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/28/2018] [Accepted: 05/23/2018] [Indexed: 01/18/2023] Open
Abstract
Background: Post-transcriptional regulation is crucial for the control of eukaryotic gene expression and might contribute to adaptive divergence. The three prime untranslated regions (3’ UTRs), which are located downstream of protein-coding sequences, play important roles in post-transcriptional regulation. These regions contain functional elements that influence the fate of mRNAs and could be exceptionally important in groups such as rapidly evolving cichlid fishes. Results: To examine cichlid 3’ UTR evolution, we 1) identified gene features in nine teleost genomes and 2) performed comparative analyses to assess evolutionary variation in length, functional motifs, and evolutionary rates of 3’ UTRs. In all nine teleost genomes, we found a smaller proportion of repetitive elements in 3’ UTRs than in the whole genome. We found that the 3’ UTRs in cichlids tend to be longer than those in non-cichlids, and this was associated, on average, with one more miRNA target per gene in cichlids. Moreover, we provided evidence that 3’ UTRs on average have evolved faster in cichlids than in non-cichlids. Finally, analyses of gene function suggested that both the top 5% longest and 5% most rapidly evolving 3’ UTRs in cichlids tended to be involved in ribosome-associated pathways and translation. Conclusions: Our results reveal novel patterns of evolution in the 3’ UTRs of teleosts in general and cichlids in particular. The data suggest that 3’ UTRs might serve as important meta-regulators, that is, regulators of other mechanisms governing post-transcriptional regulation, especially in groups like cichlids that have undergone extremely fast rates of phenotypic diversification and speciation. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4821-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Peiwen Xiong
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457, Konstanz, Germany
- C Darrin Hulsey
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457, Konstanz, Germany
- Axel Meyer
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457, Konstanz, Germany
  - Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA, 02138, USA
- Paolo Franchini
  - Chair in Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, 78457, Konstanz, Germany
9
Balik-Meisner M, Truong L, Scholl EH, Tanguay RL, Reif DM. Population genetic diversity in zebrafish lines. Mamm Genome 2018; 29:90-100. [PMID: 29368091 PMCID: PMC5851690 DOI: 10.1007/s00335-018-9735-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 08/30/2017] [Accepted: 01/17/2018] [Indexed: 12/27/2022]
Abstract
Toxicological and pharmacological researchers have seized upon the many benefits of zebrafish, including the short generation time, well-characterized development, and early maturation as clear embryos. A major difference from many model organisms is that standard husbandry practices in zebrafish are designed to maintain population diversity. While this diversity is attractive for translational applications in human and ecological health, it raises critical questions on how interindividual genetic variation might contribute to chemical exposure or disease susceptibility differences. Findings from pooled samples of zebrafish support this supposition of diversity yet cannot directly measure allele frequencies for reference versus alternate alleles. Using the Tanguay lab Tropical 5D zebrafish line (T5D), we performed whole genome sequencing on a large group (n = 276) of individual zebrafish embryos. Paired-end reads were collected on an Illumina 3000HT, then aligned to the most recent zebrafish reference genome (GRCz10). These data were used to compare observed population genetic variation across species (humans, mice, zebrafish), then across lines within zebrafish. We found more single nucleotide polymorphisms (SNPs) in T5D than have been reported in SNP databases for any of the WIK, TU, TL, or AB lines. We theorize that some subset of the novel SNPs may be shared with other zebrafish lines but have not been identified in other studies due to the limitations of capturing population diversity in pooled sequencing strategies. We establish T5D as a model that is representative of diversity levels within laboratory zebrafish lines and demonstrate that experimental design and analysis can exert major effects when characterizing genetic diversity in heterogeneous populations.
Affiliation(s)
- Michele Balik-Meisner
  - Bioinformatics Research Center, Center for Human Health and the Environment, Department of Biological Sciences, North Carolina State University, Ricks Hall 344, 1 Lampe Drive, Box 7566, Raleigh, NC, 27695, USA
- Lisa Truong
  - Sinnhuber Aquatic Research Laboratory, Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, OR, 97331, USA
- Elizabeth H Scholl
  - Bioinformatics Research Center, Center for Human Health and the Environment, Department of Biological Sciences, North Carolina State University, Ricks Hall 344, 1 Lampe Drive, Box 7566, Raleigh, NC, 27695, USA
- Robert L Tanguay
  - Sinnhuber Aquatic Research Laboratory, Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, OR, 97331, USA
- David M Reif
  - Bioinformatics Research Center, Center for Human Health and the Environment, Department of Biological Sciences, North Carolina State University, Ricks Hall 344, 1 Lampe Drive, Box 7566, Raleigh, NC, 27695, USA
10
Wilbrandt J, Misof B, Niehuis O. COGNATE: comparative gene annotation characterizer. BMC Genomics 2017; 18:535. [PMID: 28716078 PMCID: PMC5513398 DOI: 10.1186/s12864-017-3870-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/31/2017] [Accepted: 06/19/2017] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy-to-use standard tool. RESULTS: We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow downstream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage (https://www.zfmk.de/en/COGNATE) and on GitHub (https://github.com/ZFMK/COGNATE). CONCLUSION: The tool COGNATE allows comparing genome assemblies and structural elements on multiple levels (e.g., scaffold or contig sequence, gene). It clearly enhances comparability between analyses. Thus, COGNATE can provide the important standardization of both genome and gene structure parameter disclosure as well as data acquisition for future comparative analyses. With the establishment of comprehensive descriptive standards and the extensive availability of genomes, an encompassing database will become possible.
Affiliation(s)
- Jeanne Wilbrandt
  - Zoologisches Forschungsmuseum Alexander Koenig (ZFMK), Zentrum für Molekulare Biodiversitätsforschung (zmb), Bonn, Germany
- Bernhard Misof
  - Zoologisches Forschungsmuseum Alexander Koenig (ZFMK), Zentrum für Molekulare Biodiversitätsforschung (zmb), Bonn, Germany
- Oliver Niehuis
  - Abteilung Evolutionsbiologie und Ökologie, Albert-Ludwigs-Universität Freiburg, Institut für Biologie I (Zoologie), Freiburg, Germany
11
Tarifeño-Saldivia E, Lavergne A, Bernard A, Padamata K, Bergemann D, Voz ML, Manfroid I, Peers B. Transcriptome analysis of pancreatic cells across distant species highlights novel important regulator genes. BMC Biol 2017; 15:21. [PMID: 28327131 PMCID: PMC5360028 DOI: 10.1186/s12915-017-0362-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 11/03/2016] [Accepted: 03/01/2017] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Defining the transcriptome and the genetic pathways of pancreatic cells is of great interest for elucidating the molecular attributes of pancreas disorders such as diabetes and cancer. As the function of the different pancreatic cell types has been maintained during vertebrate evolution, the comparison of their transcriptomes across distant vertebrate species is a means to pinpoint genes under strong evolutionary constraints due to their crucial function, which have therefore preserved their selective expression in these pancreatic cell types. RESULTS In this study, RNA-sequencing was performed on pancreatic alpha, beta, and delta endocrine cells as well as the acinar and ductal exocrine cells isolated from adult zebrafish transgenic lines. Comparison of these transcriptomes identified many novel markers, including transcription factors and signaling pathway components, specific for each cell type. By performing interspecies comparisons, we identified hundreds of genes with conserved enriched expression in endocrine and exocrine cells among human, mouse, and zebrafish. This list includes many genes known as crucial for pancreatic cell formation or function, but also pinpoints many factors whose pancreatic function is still unknown. A large set of endocrine-enriched genes can already be detected at early developmental stages as revealed by the transcriptomic profiling of embryonic endocrine cells, indicating a potential role in cell differentiation. The actual involvement of conserved endocrine genes in pancreatic cell differentiation was demonstrated in zebrafish for myt1b, whose invalidation leads to a reduction of alpha cells, and for cdx4, selectively expressed in endocrine delta cells and crucial for their specification. 
Intriguingly, comparison of the endocrine alpha and beta cell subtypes from human, mouse, and zebrafish reveals a much lower conservation of the transcriptomic signatures for these two endocrine cell subtypes compared to the signatures of pan-endocrine and exocrine cells. These data suggest that the identity of the alpha and beta cells relies on a few key factors, corroborating numerous examples of inter-conversion between these two endocrine cell subtypes. CONCLUSION This study highlights both evolutionarily conserved and species-specific features that will help to unveil universal and fundamental regulatory pathways as well as pathways specific to human and laboratory animal models such as mouse and zebrafish.
Affiliation(s)
- Estefania Tarifeño-Saldivia
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Arnaud Lavergne
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Alice Bernard
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Keerthana Padamata
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - David Bergemann
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Marianne L Voz
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Isabelle Manfroid
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium
| | - Bernard Peers
- Laboratory of Zebrafish Development and Disease Models (ZDDM), GIGA, University of Liège, Avenue de l'Hôpital 1, B34, 4000 Sart Tilman, Liege, Belgium.
|
12
|
Martin KJ, Holland PWH. Diversification of Hox Gene Clusters in Osteoglossomorph Fish in Comparison to Other Teleosts and the Spotted Gar Outgroup. J Exp Zool B Mol Dev Evol 2017; 328:638-644. [DOI: 10.1002/jez.b.22726] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/29/2016] [Revised: 12/14/2016] [Accepted: 12/25/2016] [Indexed: 11/06/2022]
Affiliation(s)
- Kyle J Martin
- Department of Zoology, University of Oxford, Oxford, UK
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
|
13
|
Li Y, Xu Y, Ma Z. Comparative Analysis of the Exon-Intron Structure in Eukaryotic Genomes. Yangtze Medicine 2017. [DOI: 10.4236/ym.2017.11006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
|
14
|
Kuang YY, Zheng XH, Li CY, Li XM, Cao DC, Tong GX, Lv WH, Xu W, Zhou Y, Zhang XF, Sun ZP, Mahboob S, Al-Ghanim KA, Li JT, Sun XW. The genetic map of goldfish (Carassius auratus) provided insights to the divergent genome evolutions in the Cyprinidae family. Sci Rep 2016; 6:34849. [PMID: 27708388 PMCID: PMC5052598 DOI: 10.1038/srep34849] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/22/2016] [Accepted: 09/20/2016] [Indexed: 01/13/2023] Open
Abstract
A high-density linkage map of goldfish (Carassius auratus) was constructed using RNA-sequencing. This map consists of 50 linkage groups with 8,521 SNP markers and an average resolution of 0.62 cM. Approximately 84% of markers are in protein-coding genes orthologous to zebrafish proteins. We performed comparative genome analysis between zebrafish and medaka, common carp, grass carp, and goldfish to study genome evolution events in the Cyprinidae family. The comparison revealed large synteny blocks among Cyprinidae fish, and we hypothesized that the Cyprinidae ancestor underwent many inter-chromosome rearrangements after speciation from the teleost ancestor. The study also showed that the goldfish genome had one more round of whole genome duplication (WGD) than zebrafish. Our results illustrated that most goldfish markers were orthologous to genes in common carp, which had four rounds of WGD. Growth-related regions and genes were identified by QTL analysis and association study. Functional annotations of the associated genes suggested that they might regulate development and growth in goldfish. This first genetic map enables us to study goldfish genome evolution and provides an important resource for selective breeding of goldfish.
Affiliation(s)
- You-Yi Kuang
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Xian-Hu Zheng
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Chun-Yan Li
- Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing 10014, China; Tianjin Fisheries Research Institute, Tianjin 300221, China
| | - Xiao-Min Li
- Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing 10014, China
| | - Ding-Chen Cao
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Guang-Xiang Tong
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Wei-Hua Lv
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Wei Xu
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Yi Zhou
- Stem Cell Program of Boston Children's Hospital, Division of Hematology/Oncology, Boston Children's Hospital and Dana Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Xiao-Feng Zhang
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Zhi-Peng Sun
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
| | - Shahid Mahboob
- Department of Zoology, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
| | - Khalid A Al-Ghanim
- Department of Zoology, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
| | - Jiong-Tang Li
- Centre for Applied Aquatic Genomics, Chinese Academy of Fishery Sciences, Beijing 10014, China
| | - Xiao-Wen Sun
- Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China
|
15
|
Kuriya K, Higashiyama E, Avşar-Ban E, Tamaru Y, Ogata S, Takebayashi SI, Ogata M, Okumura K. Direct Visualization of DNA Replication Dynamics in Zebrafish Cells. Zebrafish 2015; 12:432-9. [PMID: 26540100 DOI: 10.1089/zeb.2015.1151] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023] Open
Abstract
Spatiotemporal regulation of DNA replication in the S-phase nucleus has been extensively studied in mammalian cells because it is tightly coupled with the regulation of other nuclear processes such as transcription. However, little is known about the replication dynamics in nonmammalian cells. Here, we analyzed the DNA replication processes of zebrafish (Danio rerio) cells through the direct visualization of replicating DNA in the nucleus and on DNA fiber molecules isolated from the nucleus. We found that zebrafish chromosomal DNA at the nuclear interior was replicated first, followed by replication of DNA at the nuclear periphery, which is reminiscent of the spatiotemporal regulation of mammalian DNA replication. However, the relative duration of interior DNA replication in zebrafish cells was longer compared to mammalian cells, possibly reflecting zebrafish-specific genomic organization. The rate of replication fork progression and ori-to-ori distance measured by the DNA combing technique were ~1.4 kb/min and 100 kb, respectively, which are comparable to those in mammalian cells. To our knowledge, this is the first report that measures replication dynamics in zebrafish cells.
Affiliation(s)
- Kenji Kuriya
- Laboratory of Molecular and Cellular Biology, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
| | - Eriko Higashiyama
- Laboratory of Molecular and Cellular Biology, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
| | - Eriko Avşar-Ban
- Laboratory for the Utilization of Aquatic Bioresources, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
| | - Yutaka Tamaru
- Laboratory for the Utilization of Aquatic Bioresources, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
| | - Shin Ogata
- Laboratory of Molecular and Cellular Biology, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
| | - Shin-ichiro Takebayashi
- Department of Biochemistry and Proteomics, Graduate School of Medicine, Mie University, Tsu, Japan
| | - Masato Ogata
- Department of Biochemistry and Proteomics, Graduate School of Medicine, Mie University, Tsu, Japan
| | - Katsuzumi Okumura
- Laboratory of Molecular and Cellular Biology, Department of Life Sciences, Graduate School of Bioresources, Mie University, Tsu, Japan
|
16
|
Rapid genomic DNA changes in allotetraploid fish hybrids. Heredity (Edinb) 2015; 114:601-9. [PMID: 25669608 PMCID: PMC4434252 DOI: 10.1038/hdy.2015.3] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 08/07/2014] [Revised: 12/17/2014] [Accepted: 12/19/2014] [Indexed: 12/22/2022] Open
Abstract
Rapid genomic change has been demonstrated in several allopolyploid plant systems; however, few studies have focused on animals. We addressed this issue using an allotetraploid lineage (4nAT) of freshwater fish originally derived from the interspecific hybridization of red crucian carp (Carassius auratus red var., ♀, 2n=100) × common carp (Cyprinus carpio L., ♂, 2n=100). We constructed a bacterial artificial chromosome (BAC) library from allotetraploid hybrids in the 20th generation (F20) and sequenced 14 BAC clones representing a total of 592.126 kb, identified 11 functional genes and estimated the guanine-cytosine content (37.10%) and the proportion of repetitive elements (17.46%). The analysis of intron evolution using nine orthologous genes across a number of selected fish species detected a gain of 39 introns and a loss of 30 introns in the 4nAT lineage. A comparative study based on seven functional genes among 4nAT, diploid F1 hybrids (2nF1) (first generation of hybrids) and their original parents revealed that both hybrid types (2nF1 and 4nAT) not only inherited genomic DNA from their parents, but also demonstrated rapid genomic DNA changes (homoeologous recombination, loss of parental DNA fragments and formation of novel genes). However, 4nAT presented more genomic variations compared with their parents than 2nF1. Interestingly, novel gene fragments were found for the iqca1 gene in both hybrid types. This study provided a preliminary genomic characterization of allotetraploid F20 hybrids and revealed the evolutionary and functional genomic significance of allopolyploid animals.
|
17
|
Horstick EJ, Jordan DC, Bergeron SA, Tabor KM, Serpe M, Feldman B, Burgess HA. Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish. Nucleic Acids Res 2015; 43:e48. [PMID: 25628360 PMCID: PMC4402511 DOI: 10.1093/nar/gkv035] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Received: 09/03/2014] [Accepted: 01/12/2015] [Indexed: 12/18/2022] Open
Abstract
Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression; however, it is difficult to assess the relative influence of these features. Zebrafish embryos are rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed for enrichment of sequence features correlated with levels of protein expression. We then tested enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, releasing factor recognition sequence and specific introns and 3′ untranslated regions each increased protein expression between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models.
Affiliation(s)
- Eric J Horstick
- Program in Genomics of Differentiation, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Diana C Jordan
- Program in Genomics of Differentiation, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Sadie A Bergeron
- Program in Genomics of Differentiation, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Kathryn M Tabor
- Program in Genomics of Differentiation, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Mihaela Serpe
- Program in Cellular Regulation and Metabolism, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Benjamin Feldman
- Zebrafish Core, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
| | - Harold A Burgess
- Program in Genomics of Differentiation, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD 20892, USA
|
18
|
Zhou K, Kuo A, Grigoriev IV. Reverse transcriptase and intron number evolution. Stem Cell Investig 2014; 1:17. [PMID: 27358863 DOI: 10.3978/j.issn.2306-9759.2014.08.01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/15/2014] [Accepted: 08/04/2014] [Indexed: 11/14/2022]
Abstract
BACKGROUND Introns are universal in eukaryotic genomes and play important roles in transcriptional regulation, mRNA export to the cytoplasm, nonsense-mediated decay as both a regulatory and a splicing quality control mechanism, R-loop avoidance, alternative splicing, chromatin structure, and evolution by exon-shuffling. METHODS Sixteen complete fungal genomes were used, 13 of which were sequenced and annotated by JGI. Ustilago maydis, Cryptococcus neoformans, and Coprinus cinereus (also named Coprinopsis cinerea) were from the Broad Institute. Gene models from JGI-annotated genomes were taken from the GeneCatalog track that contained the best representative gene models. Varying fractions of the GeneCatalog were manually curated by external users. For clarity, we used the JGI unique database identifier. RESULTS The last common ancestor of eukaryotes (LECA) has an estimated 6.4 coding exons per gene (EPG) and evolved into the diverse eukaryotic life forms, which is recapitulated by the development of a stem cell. We found a parallel between the simulated reverse transcriptase (RT)-mediated intron loss and the comparative analysis of 16 fungal genomes that spanned a wide range of intron density. Although footprints of RT (RTF) were dynamic, relative intron location (RIL) to the 5'-end of mRNA faithfully traced RT-mediated intron loss and revealed 7.7 EPG for LECA. The mode of exon length distribution was conserved in simulated intron loss, which was exemplified by the shared mode of 75 nt between fungal and Chlamydomonas genomes. The dominant ancient exon length was corroborated by the average exon length of the most intron-rich genes in fungal genomes and consistent with ancient protein modules being ~25 aa. Combined with the conservation of a protein length of 400 aa, the earliest ancestor of eukaryotes could have 16 EPG. During earlier evolution, Ascomycota's ancestor had significantly more 3'-biased RT-mediated intron loss that was followed by dramatic RTF loss.
There was a downward trend of EPG from more conserved to less conserved genes. Moreover, species-specific genes have higher exon-densities, shorter exons, and longer introns when compared to genes conserved at the phylum level. However, intron length in species-specific genes became shorter than that of genes conserved in all species after genomes experienced drastic intron loss. The estimated EPG from the most frequent exon length is more than double that from the RIL method. CONCLUSIONS This implies significant intron loss during the very early period of eukaryotic evolution. De novo gene-birth contributes to shorter exons, longer introns, and higher exon-density in species-specific genes relative to conserved genes.
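The exon-count arithmetic stated in this abstract can be checked in a few lines of Python. This is only a sketch of the back-of-envelope calculation as the abstract words it, not the authors' method; the function name `exons_per_gene` and the assumption of exactly 3 nt per amino acid are illustrative choices.

```python
# Illustrative check of the abstract's arithmetic; not the authors' pipeline.

NT_PER_CODON = 3  # assumption: three nucleotides encode one amino acid


def exons_per_gene(protein_length_aa: float, exon_length_nt: float) -> float:
    """Estimate coding exons per gene (EPG) from protein and exon lengths."""
    coding_length_nt = protein_length_aa * NT_PER_CODON
    return coding_length_nt / exon_length_nt


modal_exon_nt = 75    # shared mode of exon length (value from the abstract)
protein_aa = 400      # conserved protein length (value from the abstract)
epg_from_ril = 7.7    # LECA estimate from the RIL method (value from the abstract)

module_aa = modal_exon_nt / NT_PER_CODON                   # amino acids per modal exon
epg_from_mode = exons_per_gene(protein_aa, modal_exon_nt)  # EPG implied by the modal exon
ratio = epg_from_mode / epg_from_ril                       # compare the two estimates

print(f"module size: {module_aa:.0f} aa")
print(f"EPG from modal exon length: {epg_from_mode:.0f}")
print(f"ratio to RIL estimate: {ratio:.2f}")
```

Under these assumptions a 75 nt exon corresponds to 25 aa and a 400 aa protein to 16 exons, and the ratio to the RIL estimate comes out slightly above 2, which matches the abstract's "more than double" statement.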
Affiliation(s)
- Kemin Zhou
- Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Alan Kuo
- Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Igor V Grigoriev
- Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
|
19
|
Molecular cloning and functional characterization of zebrafish Slc4a3/Ae3 anion exchanger. Pflugers Arch 2014; 466:1605-18. [PMID: 24668450 DOI: 10.1007/s00424-014-1494-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/19/2014] [Revised: 02/24/2014] [Accepted: 03/04/2014] [Indexed: 12/15/2022]
Abstract
The zebrafish genome encodes two slc4a1 genes, one expressed in erythroid tissues and the other in the HR (H⁺-ATPase-rich) type of embryonic skin ionocytes, and two slc4a2 genes, one in proximal pronephric duct and the other in several extrarenal tissues of the embryo. We now report cDNA cloning and functional characterization of zebrafish slc4a3/ae3 gene products. The single ae3 gene on chromosome 9 generates at least two low-abundance ae3 transcripts differing only in their 5'-untranslated regions and encoding a single definitive Ae3 polypeptide of 1170 amino acids. The 7 kb upstream of the apparent initiator Met in ae3 exon 3 comprises multiple diverse, mobile repeat elements which disrupt and appear to truncate the Ae3 N-terminal amino acid sequence that would otherwise align with brain Ae3 of other species. Embryonic ae3 mRNA expression was detected by whole mount in situ hybridization only in fin buds at 24-72 hpf, but was detectable by RT-PCR across a range of embryonic and adult tissues. Epitope-tagged Ae3 polypeptide was expressed at or near the surface of Xenopus oocytes, and mediated low rates of DIDS-sensitive ³⁶Cl⁻/Cl⁻ exchange in influx and efflux assays. As previously reported for Ae2 polypeptides, ³⁶Cl⁻ transport by Ae3 was inhibited by both extracellular and intracellular acidic pH, and stimulated by alkaline pH. However, zebrafish Ae3 differed from Ae2 polypeptides in its insensitivity to NH₄Cl and to hypertonicity. We conclude that multiple repeat elements have disrupted the 5'-end of the zebrafish ae3 gene, associated with N-terminal truncation of the protein and reduced anion transport activity.
|
20
|
Chen X, Wang Q, Yang C, Rao Y, Li Q, Wan Q, Peng L, Wu S, Su J. Identification, expression profiling of a grass carp TLR8 and its inhibition leading to the resistance to reovirus in CIK cells. Dev Comp Immunol 2013; 41:82-93. [PMID: 23632252 DOI: 10.1016/j.dci.2013.04.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/31/2013] [Revised: 04/19/2013] [Accepted: 04/22/2013] [Indexed: 06/02/2023]
Abstract
TLR8 (toll-like receptor 8), a homolog of TLR3, TLR7 and TLR9 as prototypical intracellular members of TLR family, is generally associated with sensing single stranded RNA and plays a pivotal role in antiviral immune response. In this study, a TLR8 gene from grass carp Ctenopharyngodon idella (designated as CiTLR8) was obtained and characterized. The full-length cDNA of CiTLR8 was 3766 bp. The open reading frame was 3072 bp and encoded a polypeptide of 1023 amino acids, including seventeen LRR (leucine-rich repeat) motifs, one transmembrane domain and one TIR (toll/interleukin-1 receptor) domain. A single intron with the size of 839 bp was found on the neck of start codon (ATG). CiTLR8 mRNA was ubiquitously expressed in the 15 tested tissues and the expression level in gas bladder, spleen, brain, hindgut and trunk kidney tissues was high. Besides, the CiTLR8 expression in spleen and head kidney was significantly up-regulated and reached a peak at 24 h post-injection of grass carp reovirus (GCRV). CiTLR8 transcription reached a peak at 8 h and then declined below the normal level post-GCRV infection in the C. idella kidney (CIK) cell line; and it was rapidly and significantly down-regulated by the stimulation of the synthetic double-stranded RNA polyriboinosinic-polyribocytidylic acid sodium salt (poly I:C) in CIK cells in a dose- and time-dependent manner. The inhibitor expression vectors were constructed and transfected into CIK cell line to obtain stably expressing shRNA targeting TLR8. In CiTLR8-knockdown cells, CiTLR7 transcript weakly increased, CiIFN-I mRNA was significantly down-regulated, and the expression of CiMyD88, CiIRF7 and CiMx1 scarcely changed. Post poly I:C stimulation, CiTLR8, CiTLR7 and CiMyD88 transcripts significantly increased, CiIRF7 was down-regulated after an initial phase of increase, and CiIFN-I and CiMx1 transcripts were up-regulated.
After GCRV infection, the transcripts of CiTLR8, CiTLR7, CiMyD88 and CiIRF7 were up-regulated, but CiIFN-I and CiMx1 transcripts were inhibited. Nevertheless, cells transfected with pshTLR8 vectors had strong resistance against GCRV injection, suggesting CiTLR8 might play a negative role in antiviral immune response. These results collectively suggested that CiTLR8 was a novel member of TLR gene family, engaging in antiviral innate immune defense in C. idella, which laid a foundation for the further mechanism research of TLR8 in fishes.
Affiliation(s)
- Xiaohui Chen
- College of Animal Science and Technology, Northwest A&F University, Shaanxi Key Laboratory of Molecular Biology for Agriculture, Yangling 712100, China
|
21
|
Hadjipantelis PZ, Jones NS, Moriarty J, Springate DA, Knight CG. Function-valued traits in evolution. J R Soc Interface 2013; 10:20121032. [PMID: 23427095 PMCID: PMC3627078 DOI: 10.1098/rsif.2012.1032] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 12/15/2012] [Accepted: 01/28/2013] [Indexed: 01/28/2023] Open
Abstract
Many biological characteristics of evolutionary interest are not scalar variables but continuous functions. Given a dataset of function-valued traits generated by evolution, we develop a practical, statistical approach to infer ancestral function-valued traits, and estimate the generative evolutionary process. We do this by combining dimension reduction and phylogenetic Gaussian process regression, a non-parametric procedure that explicitly accounts for known phylogenetic relationships. We test the performance of methods on simulated, function-valued data generated from a stochastic evolutionary model. The methods are applied assuming that only the phylogeny, and the function-valued traits of taxa at its tips are known. Our method is robust and applicable to a wide range of function-valued data, and also offers a phylogenetically aware method for estimating the autocorrelation of function-valued traits.
|
22
|
Katju V. In with the old, in with the new: the promiscuity of the duplication process engenders diverse pathways for novel gene creation. Int J Evol Biol 2012; 2012:341932. [PMID: 23008799 PMCID: PMC3449122 DOI: 10.1155/2012/341932] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 05/18/2012] [Accepted: 06/03/2012] [Indexed: 01/26/2023]
Abstract
The gene duplication process has exhibited far greater promiscuity in the creation of paralogs with novel exon-intron structures than anticipated even by Ohno. In this paper I explore the history of the field, from the neo-Darwinian synthesis through Ohno's formulation of the canonical model for the evolution of gene duplicates and culminating in the present genomic era. I delineate the major tenets of Ohno's model and discuss its failure to encapsulate the full complexity of the duplication process as revealed in the era of genomics. I discuss the diverse classes of paralogs originating from both DNA- and RNA-mediated duplication events and their evolutionary potential for assuming radically altered functions, as well as the degree to which they can function unconstrained from the pressure of gene conversion. Lastly, I explore theoretical population-genetic considerations of how the effective population size (Nₑ) of a species may influence the probability of emergence of genes with radically altered functions.
Affiliation(s)
- Vaishali Katju
- Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
|
23
|
Wang D, Su Y, Wang X, Lei H, Yu J. Transposon-derived and satellite-derived repetitive sequences play distinct functional roles in Mammalian intron size expansion. Evol Bioinform Online 2012; 8:301-19. [PMID: 22807622 PMCID: PMC3396637 DOI: 10.4137/ebo.s9758] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023] Open
Abstract
Background Repetitive sequences (RSs) are redundant, complex at times, and often lineage-specific, representing significant “building” materials for genes and genomes. According to their origins, sequence characteristics, and ways of propagation, repetitive sequences are divided into transposable elements (TEs) and satellite sequences (SSs) as well as related subfamilies and subgroups hierarchically. The combined changes attributable to the repetitive sequences alter gene and genome architectures, such as the expansion of exonic, intronic, and intergenic sequences, and most of them propagate in a seemingly random fashion and contribute very significantly to the entire mutation spectrum of mammalian genomes. Principal findings Our analysis is focused on evolutional features of TEs and SSs in the intronic sequence of twelve selected mammalian genomes. We divided them into four groups—primates, large mammals, rodents, and primary mammals—and used four non-mammalian vertebrate species as the out-group. After classifying intron size variation in an intron-centric way based on RS-dominance (TE-dominant or SS-dominant intron expansions), we observed several distinct profiles in intron length and positioning in different vertebrate lineages, such as retrotransposon-dominance in mammals and DNA transposon-dominance in the lower vertebrates, amphibians and fishes. The RS patterns of mouse and rat genes are most striking, which are not only distinct from those of other mammals but also different from that of the third rodent species analyzed in this study—guinea pig. Looking into the biological functions of relevant genes, we observed a two-dimensional divergence; in particular, genes that possess SS-dominant and/or RS-free introns are enriched in tissue-specific development and transcription regulation in all mammalian lineages. 
In addition, we found that the tendency of transposons in increasing intron size is much stronger than that of satellites, and the combined effect of both RSs is greater than either one of them alone in a simple arithmetic sum among the mammals and the opposite is found among the four non-mammalian vertebrates. Conclusions TE- and SS-derived RSs represent major mutational forces shaping the size and composition of vertebrate genes and genomes, and through natural selection they either fine-tune or facilitate changes in size expansion, position variation, and duplication, and thus in functions and evolutionary paths for better survival and fitness. When analyzed globally, not only are such changes significantly diversified but also comprehensible in lineages and biological implications.
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, P.R. China
|