1
|
Smit SJ, Whitehead C, James SR, Jeffares DC, Godden G, Peng D, Sun H, Lichman BR. Pseudomolecule-scale genome assemblies of Drepanocaryum sewerzowii and Marmoritis complanata. G3 (BETHESDA, MD.) 2024; 14:jkae172. [PMID: 39047060 DOI: 10.1093/g3journal/jkae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Revised: 07/08/2024] [Accepted: 07/16/2024] [Indexed: 07/27/2024]
Abstract
The Nepetoideae, a subfamily of Lamiaceae (mint family), is rich in aromatic plants, many of which are sought after for their use as flavors and fragrances or for their medicinal properties. Here, we present genome assemblies for two species in Nepetoideae: Drepanocaryum sewerzowii and Marmoritis complanata. Both assemblies were generated using Oxford Nanopore Q20+ reads with contigs anchored to nine pseudomolecules that resulted in 335 Mb and 305 Mb assemblies, respectively, and BUSCO scores above 95% for both the assembly and annotation. We furthermore provide a species tree for the Lamiaceae using only genome-derived gene models, complementing existing transcriptome and marker-based phylogenies.
Affiliation(s)
- Samuel J Smit
- Centre for Novel Agricultural Products, Department of Biology, University of York, York YO10 5DD, UK
| | - Caragh Whitehead
- Centre for Novel Agricultural Products, Department of Biology, University of York, York YO10 5DD, UK
| | - Sally R James
- Bioscience Technology Facility, Department of Biology, University of York, York YO10 5DD, UK
| | - Daniel C Jeffares
- York Biomedical Research Institute, Department of Biology, University of York, York YO10 5DD, UK
| | - Grant Godden
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611, USA
| | - Deli Peng
- School of Life Science, Yunnan Normal University, Kunming 650092, Yunnan, China
- Key Laboratory of Yunnan for Biomass Energy and Biotechnology of Environment, Yunnan Normal University, Kunming 650500, China
| | - Hang Sun
- Key Laboratory for Plant Diversity and Biogeography of East Asia/Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
| | - Benjamin R Lichman
- Centre for Novel Agricultural Products, Department of Biology, University of York, York YO10 5DD, UK
| |
|
2
|
Abdel-Glil MY, Solle J, Wibberg D, Neubauer H, Sprague LD. Chromosome-level genome assembly of Tritrichomonas foetus, the causative agent of Bovine Trichomonosis. Sci Data 2024; 11:1030. [PMID: 39304666 PMCID: PMC11415386 DOI: 10.1038/s41597-024-03818-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Accepted: 08/21/2024] [Indexed: 09/22/2024] Open
Abstract
Tritrichomonas foetus is a parasitic protist responsible for bovine trichomonosis, a reproductive disease associated with a significant economic burden to the livestock industry throughout the world. Here, we present a chromosome-level reference genome of T. foetus KV-1 (ATCC 30924) using short-read (Illumina MiSeq), long-read (Oxford Nanopore) and chromatin-linked (Hi-C) sequencing. This is the first chromosome-level genome of a parasitic protist of the order Tritrichomonadida and the second within the Parabasalia lineage, after Trichomonas vaginalis, the causative agent of a sexually transmitted infection in humans. Our constructed genome is 148 Mb in size, with a scaffold N50 of 22.9 Mb. The contigs are anchored in five super-scaffolds, corresponding to the expected five chromosomes of the species and covering 78% of the genome assembly. We predict 41,341 protein-coding genes, of which 95.10% have been functionally annotated. This high-quality genome assembly serves as a valuable reference genome for T. foetus to support future studies in functional genomics, genetic conservation and taxonomy.
Affiliation(s)
- Mostafa Y Abdel-Glil
- Friedrich-Loeffler-Institut, Institut für Bakterielle Infektionen und Zoonosen (IBIZ), Naumburger Str. 96a, 07743, Jena, Germany.
- Jena University Hospital - Friedrich Schiller University, Institute for Infectious Diseases and Infection Control, Jena, Germany.
| | - Johannes Solle
- Friedrich-Loeffler-Institut, Institut für Bakterielle Infektionen und Zoonosen (IBIZ), Naumburger Str. 96a, 07743, Jena, Germany
| | - Daniel Wibberg
- Center for Biotechnology - CeBiTec, Bielefeld University, Universitätsstraße 27, D-33615, Bielefeld, Germany
- ELIXIR DE Administration Office, Institute of Bio- and Geosciences IBG-5, Forschungszentrum Jülich GmbH - Branch office Bielefeld, Universitätsstraße 27, D-33615, Bielefeld, Germany
| | - Heinrich Neubauer
- Friedrich-Loeffler-Institut, Institut für Bakterielle Infektionen und Zoonosen (IBIZ), Naumburger Str. 96a, 07743, Jena, Germany
| | - Lisa D Sprague
- Friedrich-Loeffler-Institut, Institut für Bakterielle Infektionen und Zoonosen (IBIZ), Naumburger Str. 96a, 07743, Jena, Germany.
| |
|
3
|
Zhang J, Aunins AW, King TL, Cong Q, Shen J, Song L, Schuurman GW, Knutson RL, Grundel R, Hellmann J, Grishin NV. Range-wide population genomic structure of the Karner blue butterfly, Plebejus (Lycaeides) samuelis. Ecol Evol 2024; 14:e70044. [PMID: 39279793 PMCID: PMC11392825 DOI: 10.1002/ece3.70044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 06/26/2024] [Accepted: 07/09/2024] [Indexed: 09/18/2024] Open
Abstract
The Karner blue butterfly, Plebejus (Lycaeides) samuelis, is an endangered North American climate change-vulnerable species that has undergone substantial historical habitat loss and population decline. To better understand the species' genetic status and support Karner blue conservation, we sampled 116 individuals from 22 localities across the species' geographical range in Wisconsin (WI), Michigan (MI), Indiana (IN), and New York (NY). Using genomic analysis, we found that these samples were divided into three major geographic groups, NY, WI, and MI-IN, with populations in WI and MI-IN each further divided into three subgroups. A high level of inbreeding was revealed by inbreeding coefficients above 10% in almost all populations in our study. However, strong correlation between F_ST and geographical distance suggested that genetic divergence between populations increases with distance, such that introducing individuals from more distant populations may be a useful strategy for increasing population-level diversity and preserving the species. We also found that Karner blue populations had lower genetic diversity than closely related species and had more alleles that were present only at low frequencies (<5%) in other species. Some of these alleles may negatively impact individual fitness and may have become prevalent in Karner blue populations due to inbreeding. Finally, analysis of these possibly deleterious alleles in the context of predicted three-dimensional structures of proteins revealed potential molecular mechanisms behind population declines, providing insights for conservation. This rich new range-wide understanding of the species' population genomic structure can contextualize past extirpations and help conserve and even enhance Karner blue genetic diversity.
Affiliation(s)
- Jing Zhang
- Eugene McDermott Center for Human Growth and Development University of Texas Southwestern Medical Center Dallas Texas USA
- Department of Biophysics University of Texas Southwestern Medical Center Dallas Texas USA
- Harold C. Simmons Comprehensive Cancer Center University of Texas Southwestern Medical Center Dallas Texas USA
| | - Aaron W Aunins
- U.S. Geological Survey, Eastern Ecological Science Center at the Leetown Research Laboratory Kearneysville West Virginia USA
| | - Timothy L King
- U.S. Geological Survey, Eastern Ecological Science Center at the Leetown Research Laboratory Kearneysville West Virginia USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development University of Texas Southwestern Medical Center Dallas Texas USA
- Department of Biophysics University of Texas Southwestern Medical Center Dallas Texas USA
- Harold C. Simmons Comprehensive Cancer Center University of Texas Southwestern Medical Center Dallas Texas USA
| | - Jinhui Shen
- Department of Biophysics University of Texas Southwestern Medical Center Dallas Texas USA
- Department of Biochemistry University of Texas Southwestern Medical Center Dallas Texas USA
| | - Leina Song
- Department of Biophysics University of Texas Southwestern Medical Center Dallas Texas USA
- Department of Biochemistry University of Texas Southwestern Medical Center Dallas Texas USA
| | - Gregor W Schuurman
- U.S. National Park Service, Climate Change Response Program Fort Collins Colorado USA
| | - Randy L Knutson
- U.S. National Park Service, Indiana Dunes National Park Porter Indiana USA
| | - Ralph Grundel
- U.S. Geological Survey, Great Lakes Science Center Chesterton Indiana USA
| | - Jessica Hellmann
- Department of Ecology, Evolution and Behavior, Institute on the Environment University of Minnesota Minneapolis Minnesota USA
| | - Nick V Grishin
- Department of Biophysics University of Texas Southwestern Medical Center Dallas Texas USA
- Department of Biochemistry University of Texas Southwestern Medical Center Dallas Texas USA
| |
|
4
|
Boland DJ, Cornejo-Corona I, Browne DR, Murphy RL, Mullet J, Okada S, Devarenne TP. Reclassification of Botryococcus braunii chemical races into separate species based on a comparative genomics analysis. PLoS One 2024; 19:e0304144. [PMID: 39074348 DOI: 10.1371/journal.pone.0304144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 05/07/2024] [Indexed: 07/31/2024] Open
Abstract
The colonial green microalga Botryococcus braunii is well known for producing liquid hydrocarbons that can be utilized as biofuel feedstocks. B. braunii is taxonomically classified as a single species made up of three chemical races, A, B, and L, that are mainly distinguished by the hydrocarbons produced. We previously reported a B race draft nuclear genome, and here we report the draft nuclear genomes for the A and L races. A comparative genomic study of the three B. braunii races and 14 other algal species within Chlorophyta revealed significant differences in the genomes of each race of B. braunii. Phylogenomically, there was a clear divergence of the three races with the A race diverging earlier than both the B and L races, and the B and L races diverging from a later common ancestor not shared by the A race. DNA repeat content analysis suggested the B race had more repeat content than the A or L races. Orthogroup analysis revealed the B. braunii races displayed more gene orthogroup diversity than three closely related Chlamydomonas species, with nearly 24-36% of all genes in each B. braunii race being specific to each race. This analysis suggests the three races are distinct species based on sufficient differences in their respective genomes. We propose reclassification of the three chemical races to the following species names: Botryococcus alkenealis (A race), Botryococcus braunii (B race), and Botryococcus lycopadienor (L race).
Affiliation(s)
- Devon J Boland
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
- Texas A&M Institute for Genome Sciences & Society (TIGSS), College Station, Texas, United States of America
| | - Ivette Cornejo-Corona
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
| | - Daniel R Browne
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
- AI & Computational Biology, LanzaTech Inc., Skokie, Illinois, United States of America
| | - Rebecca L Murphy
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
- Biology Department, Centenary College of Louisiana, Shreveport, Louisiana, United States of America
| | - John Mullet
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
| | - Shigeru Okada
- Laboratory of Aquatic Natural Products Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi, Bunkyo, Tokyo, Japan
| | - Timothy P Devarenne
- Department of Biochemistry and Biophysics, Texas A & M University, College Station, Texas, United States of America
| |
|
5
|
Jia H, Tan S, Cai Y, Guo Y, Shen J, Zhang Y, Ma H, Zhang Q, Chen J, Qiao G, Ruan J, Zhang YE. Low-input PacBio sequencing generates high-quality individual fly genomes and characterizes mutational processes. Nat Commun 2024; 15:5644. [PMID: 38969648 PMCID: PMC11226609 DOI: 10.1038/s41467-024-49992-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 06/20/2024] [Indexed: 07/07/2024] Open
Abstract
Long-read sequencing, exemplified by PacBio, revolutionizes genomics, overcoming challenges like repetitive sequences. However, the high DNA requirement (>1 µg) is prohibitive for small organisms. We develop a low-input (100 ng), low-cost, and amplification-free library-generation method for PacBio sequencing (LILAP) using Tn5-based tagmentation and DNA circularization within one tube. We test LILAP with two Drosophila melanogaster individuals, and generate near-complete genomes, surpassing preexisting single-fly genomes. By analyzing variations in these two genomes, we characterize mutational processes: complex transpositions (transposon insertions together with extra duplications and/or deletions) prefer regions characterized by non-B DNA structures, and gene conversion of transposons occurs on both DNA and RNA levels. Concurrently, we generate two complete assemblies for the endosymbiotic bacterium Wolbachia in these flies and similarly detect transposon conversion. Thus, LILAP promises a broad PacBio sequencing adoption for not only mutational studies of flies and their symbionts but also explorations of other small organisms or precious samples.
Affiliation(s)
- Hangxing Jia
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
| | - Shengjun Tan
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
| | - Yingao Cai
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yanyan Guo
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jieyu Shen
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yaqiong Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Huijing Ma
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Qingzhu Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jinfeng Chen
- University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Gexia Qiao
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
| | - Yong E Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
| |
|
6
|
Kim BY, Gellert HR, Church SH, Suvorov A, Anderson SS, Barmina O, Beskid SG, Comeault AA, Crown KN, Diamond SE, Dorus S, Fujichika T, Hemker JA, Hrcek J, Kankare M, Katoh T, Magnacca KN, Martin RA, Matsunaga T, Medeiros MJ, Miller DE, Pitnick S, Schiffer M, Simoni S, Steenwinkel TE, Syed ZA, Takahashi A, Wei KHC, Yokoyama T, Eisen MB, Kopp A, Matute D, Obbard DJ, O’Grady PM, Price DK, Toda MJ, Werner T, Petrov DA. Single-fly genome assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life. PLoS Biol 2024; 22:e3002697. [PMID: 39024225 PMCID: PMC11257246 DOI: 10.1371/journal.pbio.3002697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 06/03/2024] [Indexed: 07/20/2024] Open
Abstract
Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser studied drosophilid taxa in whole-genome data. Using Illumina Novaseq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group. 
Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
Affiliation(s)
- Bernard Y. Kim
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Hannah R. Gellert
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Samuel H. Church
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut United States of America
| | - Anton Suvorov
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Sean S. Anderson
- Department of Biology, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Olga Barmina
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
| | - Sofia G. Beskid
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Aaron A. Comeault
- School of Environmental and Natural Sciences, Bangor University, Bangor, United Kingdom
| | - K. Nicole Crown
- Department of Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Sarah E. Diamond
- Department of Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Steve Dorus
- Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, New York, United States of America
| | - Takako Fujichika
- Department of Biological Sciences, Tokyo Metropolitan University, Tokyo, Japan
| | - James A. Hemker
- Department of Developmental Biology, Stanford University, Stanford, California, United States of America
| | - Jan Hrcek
- Institute of Entomology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
| | - Maaria Kankare
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
| | - Toru Katoh
- Department of Biological Sciences, Hokkaido University, Sapporo, Japan
| | - Karl N. Magnacca
- Hawaii Invertebrate Program, Division of Forestry & Wildlife, Honolulu, Hawaii, United States of America
| | - Ryan A. Martin
- Department of Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Teruyuki Matsunaga
- Department of Complexity Science and Engineering, The University of Tokyo, Tokyo, Japan
| | - Matthew J. Medeiros
- Pacific Biosciences Research Center, University of Hawaiʻi, Mānoa, Hawaii, United States of America
| | - Danny E. Miller
- Division of Genetic Medicine, Department of Pediatrics; Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington, United States of America
| | - Scott Pitnick
- Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, New York, United States of America
| | - Michele Schiffer
- Daintree Rainforest Observatory, James Cook University, Townsville, Australia
| | - Sara Simoni
- Department of Biology, Stanford University, Stanford, California, United States of America
| | | | - Zeeshan A. Syed
- Center for Reproductive Evolution, Department of Biology, Syracuse University, Syracuse, New York, United States of America
| | - Aya Takahashi
- Department of Biological Sciences, Tokyo Metropolitan University, Tokyo, Japan
| | - Kevin H-C. Wei
- Department of Zoology, The University of British Columbia, Vancouver, Canada
| | - Tsuya Yokoyama
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Michael B. Eisen
- Department of Cell and Molecular Biology, University of California Berkeley, Berkeley, California, United States of America
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, California, United States of America
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
| | - Daniel Matute
- Department of Biology, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Darren J. Obbard
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, United Kingdom
| | - Patrick M. O’Grady
- Department of Entomology, Cornell University, Ithaca, New York, United States of America
| | - Donald K. Price
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, Nevada, United States of America
| | | | - Thomas Werner
- Department of Biological Sciences, Michigan Technological University, Houghton, Michigan, United States of America
| | - Dmitri A. Petrov
- Department of Biology, Stanford University, Stanford, California, United States of America
- CZ Biohub, Investigator, San Francisco, California, United States of America
| |
|
7
|
Srivastav SP, Feschotte C, Clark AG. Rapid evolution of piRNA clusters in the Drosophila melanogaster ovary. Genome Res 2024; 34:711-724. [PMID: 38749655 PMCID: PMC11216404 DOI: 10.1101/gr.278062.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 05/07/2024] [Indexed: 05/28/2024]
Abstract
The piRNA pathway is a highly conserved mechanism to repress transposable element (TE) activity in the animal germline via a specialized class of small RNAs called piwi-interacting RNAs (piRNAs). piRNAs are produced from discrete genomic regions called piRNA clusters (piCs). Although the molecular processes by which piCs function are relatively well understood in Drosophila melanogaster, much less is known about the origin and evolution of piCs in this or any other species. To investigate piC origin and evolution, we use a population genomic approach to compare piC activity and sequence composition across eight geographically distant strains of D. melanogaster with high-quality long-read genome assemblies. We perform annotations of ovary piCs and genome-wide TE content in each strain. Our analysis uncovers extensive variation in piC activity across strains and signatures of rapid birth and death of piCs. Most TEs inferred to be recently active show an enrichment of insertions into old and large piCs, consistent with the previously proposed "trap" model of piC evolution. In contrast, a small subset of active LTR families is enriched for the formation of new piCs, suggesting that these TEs have higher proclivity to form piCs. Thus, our findings uncover processes leading to the origin of piCs. We propose that piC evolution begins with the emergence of piRNAs from individual insertions of a few select TE families prone to seed new piCs that subsequently expand by accretion of insertions from most other TE families during evolution to form larger "trap" clusters. Our study shows that TEs themselves are the major force driving the rapid evolution of piCs.
Affiliation(s)
- Satyam P Srivastav
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Cédric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| |
|
8
|
Shukla HG, Chakraborty M, Emerson J. Genetic variation in recalcitrant repetitive regions of the Drosophila melanogaster genome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.11.598575. [PMID: 38915508 PMCID: PMC11195212 DOI: 10.1101/2024.06.11.598575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Many essential functions of organisms are encoded in highly repetitive genomic regions, including histones involved in DNA packaging, centromeres that are core components of chromosome segregation, ribosomal RNA comprising the protein translation machinery, telomeres that ensure chromosome integrity, piRNA clusters encoding host defenses against selfish elements, and virtually the entire Y chromosome. These regions, formed by highly similar tandem arrays, pose significant challenges for experimental and informatic study, impeding sequence-level descriptions essential for understanding genetic variation. Here, we report the assembly and variation analysis of such repetitive regions in Drosophila melanogaster, offering significant improvements to the existing community reference assembly. Our work successfully recovers previously elusive segments, including complete reconstructions of the histone locus and the pericentric heterochromatin of the X chromosome, spanning the Stellate locus to the distal flank of the rDNA cluster. To infer structural changes in these regions where alignments are often not practicable, we introduce landmark anchors based on unique variants that are putatively orthologous. These regions display considerable structural variation between different D. melanogaster strains, exhibiting differences in copy number and organization of homologous repeat units between haplotypes. In the histone cluster, although we observe minimal genetic exchange indicative of crossing over, the variation patterns suggest mechanisms such as unequal sister chromatid exchange. We also examine the prevalence and scale of concerted evolution in the histone and Stellate clusters and discuss the mechanisms underlying these observed patterns.
Affiliation(s)
- Harsh G. Shukla
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
- Graduate Program in Mathematical, Computational and Systems Biology, University of California Irvine, Irvine, California 92697, USA
| | - Mahul Chakraborty
- Department of Biology, Texas A&M University, College Station, Texas 77843, USA
| | - J.J. Emerson
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
- Center for Complex Biological Systems, University of California Irvine, Irvine, California 92697, USA
| |
|
9
|
Chen X, Li H, Dong Y, Xu Y, Xu K, Zhang Q, Yao Z, Yu Q, Zhang H, Zhang Z. A wild melon reference genome provides novel insights into the domestication of a key gene responsible for melon fruit acidity. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:144. [PMID: 38809285 DOI: 10.1007/s00122-024-04647-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 05/07/2024] [Indexed: 05/30/2024]
Abstract
KEY MESSAGE: A wild melon reference genome elucidates the genomic basis of fruit acidity domestication. Structural variants (SVs) have been reported to impose major effects on agronomic traits, representing a significant contributor to crop domestication. However, the landscape of SVs between wild and cultivated melons is elusive, and how SVs have contributed to melon domestication remains largely unexplored. Here, we report a 379-Mb chromosome-scale genome of a wild progenitor melon accession "P84", with a contig N50 of 14.9 Mb. Genome comparison identifies 10,589 SVs between P84 and four cultivated melons, with 6937 not characterized in previous analyses of 25 melon genome sequences. Furthermore, these SVs were genotyped at population scale in 1175 accessions, and 18 GWAS signals including fruit acidity, fruit length, fruit weight, fruit color and sex determination were detected. Based on these genotyped SVs, we identified 3317 highly diverged SVs between wild and cultivated melons, which could be the potential SVs associated with domestication-related traits. Furthermore, we identify novel SVs affecting fruit acidity and propose the diverged evolutionary trajectories of CmPH, a key regulator of melon fruit acidity, during domestication and selection of different populations. These results will offer valuable resources for genomic studies and genetic improvement in melon.
Affiliation(s)
- Xinxiu Chen
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China
| | - Hongbo Li
- Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Shenzhen Branch, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, Guangdong, China
- College of Horticulture Science and Engineering, Shandong Agricultural University, Tai'an, 271018, Shandong, China
| | - Yuanhua Dong
- College of Horticulture Science and Engineering, Shandong Agricultural University, Tai'an, 271018, Shandong, China
| | - Yuanchao Xu
- Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Shenzhen Branch, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, Guangdong, China
| | - Kuipeng Xu
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China
| | - Qiqi Zhang
- College of Horticulture Science and Engineering, Shandong Agricultural University, Tai'an, 271018, Shandong, China
| | - Zhiwang Yao
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China
| | - Qing Yu
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China
| | - Huimin Zhang
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China.
| | - Zhonghua Zhang
- Engineering Laboratory of Genetic Improvement of Horticultural Crops of Shandong Province, College of Horticulture, Qingdao Agricultural University, Qingdao, 266109, China.
|
10
|
Zhimulev I, Vatolina T, Levitsky V, Tsukanov A. Developmental and Housekeeping Genes: Two Types of Genetic Organization in the Drosophila Genome. Int J Mol Sci 2024; 25:4068. [PMID: 38612878 PMCID: PMC11012173 DOI: 10.3390/ijms25074068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 04/14/2024]
Abstract
We developed a procedure for locating genes on Drosophila melanogaster polytene chromosomes and described three types of chromosome structures (gray bands, black bands, and interbands), which differed markedly in morphological and genetic properties. This was achieved through the use of our original methods of molecular and genetic analysis, electron microscopy, and bioinformatics data processing. Analysis of the genome-wide distribution of these properties led us to a bioinformatics model of the Drosophila genome organization, in which the genome was divided into two groups of genes. One was constituted by 6562 genes that are expressed in most cell types during the life cycle and perform basic cellular functions (the so-called "housekeeping genes"). The other one was made up of 3162 genes that are expressed only at particular stages of development ("developmental genes"). These two groups of genes are so different that we may state that the genome has two types of genetic organization. The groups differ in the timing of their expression, chromatin packaging levels, the composition of activating and deactivating proteins, the sizes of the genes, the lengths of their introns, the organization of their promoter regions, the locations of origin recognition complexes (ORCs), and DNA replication timing.
Affiliation(s)
- Igor Zhimulev
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
| | - Tatyana Vatolina
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
| | - Victor Levitsky
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
| | - Anton Tsukanov
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
|
11
|
Nagy NA, Tóth GE, Kurucz K, Kemenesi G, Laczkó L. The updated genome of the Hungarian population of Aedes koreicus. Sci Rep 2024; 14:7545. [PMID: 38555322 PMCID: PMC10981705 DOI: 10.1038/s41598-024-58096-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Accepted: 03/25/2024] [Indexed: 04/02/2024]
Abstract
Vector-borne diseases pose a potential risk to human and animal welfare, and understanding their spread requires genomic resources. The mosquito Aedes koreicus is an emerging vector that was introduced into Europe more than 15 years ago, but only a low-quality, fragmented genome was available. In this study, we carried out additional sequencing and assembled and characterized the genome of the species to provide a background for understanding its evolution and biology. The updated genome was 1.1 Gbp long and consisted of 6099 contigs with an N50 value of 329,610 bp and a BUSCO score of 84%. We identified 22,580 genes that could be functionally annotated and paid particular attention to the identification of potential insecticide resistance genes. The assessment of the orthology of the genes indicates a high turnover at the terminal branches of the species tree of mosquitoes with complete genomes, which could contribute to the adaptation and evolutionary success of the species. These results could form the basis for numerous downstream analyses to develop targets for the control of mosquito populations.
Affiliation(s)
- Nikoletta Andrea Nagy
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary.
- HUN-REN-UD Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary.
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary.
| | - Gábor Endre Tóth
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Bernhard Nocht Institute for Tropical Medicine, WHO Collaborating Centre for Arbovirus and Hemorrhagic Fever Reference and Research, Hamburg, Germany
| | - Kornélia Kurucz
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Institute of Biology, Faculty of Sciences, University of Pécs, Pecs, Hungary
| | - Gábor Kemenesi
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Institute of Biology, Faculty of Sciences, University of Pécs, Pecs, Hungary
| | - Levente Laczkó
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- One Health Institute, University of Debrecen, Debrecen, Hungary
|
12
|
Laczkó L, Jordán S, Póliska S, Rácz HV, Nagy NA, Molnár V A, Sramkó G. The draft genome of Spiraea crenata L. (Rosaceae) - the first complete genome in tribe Spiraeeae. Sci Data 2024; 11:219. [PMID: 38368431 PMCID: PMC10874383 DOI: 10.1038/s41597-024-03046-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 02/05/2024] [Indexed: 02/19/2024]
Abstract
Spiraea crenata L. is a deciduous shrub distributed across the Eurasian steppe zone. The species is of cultural and horticultural importance and occurs in scattered populations throughout its westernmost range. Currently, there is no genomic information on the tribe Spiraeeae. Therefore, we sequenced and assembled the whole genome of S. crenata using second- and third-generation sequencing and a hybrid assembly approach to expand genomic resources for conservation and support research on this horticulturally important lineage. In addition to the organellar genomes (the plastome and the mitochondrion), we present the first draft genome of the species with an estimated size of 220 Mbp, an N50 value of 7.7 Mbp, and a BUSCO score of 96.0%. Being the first complete genome in tribe Spiraeeae, this may not only be the first step in the genomic study of a rare plant but also a contribution to genomic resources supporting the study of biodiversity and evolutionary history of Rosaceae.
Affiliation(s)
- Levente Laczkó
- Department of Metagenomics, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
| | - Sándor Jordán
- Department of Metagenomics, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- Juhász-Nagy Pál Doctoral School, University of Debrecen, Debrecen, Hungary
| | - Szilárd Póliska
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
| | - Hanna Viktória Rácz
- Department of Biotechnology and Microbiology, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
| | - Nikoletta Andrea Nagy
- Department of Evolutionary Zoology and Human Biology, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary
| | - Attila Molnár V
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- Evolutionary Genomics Research Group, Department of Botany, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
| | - Gábor Sramkó
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary.
- Evolutionary Genomics Research Group, Department of Botany, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary.
|
13
|
Kröger C, Lerminiaux NA, Ershova AS, MacKenzie KD, Kirzinger MW, Märtlbauer E, Perry BJ, Cameron ADS, Schauer K. Plasmid-encoded lactose metabolism and mobilized colistin resistance (mcr-9) genes in Salmonella enterica serovars isolated from dairy facilities in the 1980s. Microb Genom 2023; 9:001149. [PMID: 38031909 PMCID: PMC10711319 DOI: 10.1099/mgen.0.001149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 11/16/2023] [Indexed: 12/01/2023]
Abstract
Horizontal gene transfer by plasmids can confer metabolic capabilities that expand a host cell's niche. Yet, it is less understood whether the coalescence of specialized catabolic functions, antibiotic resistances and metal resistances on plasmids provides synergistic benefits. In this study, we report whole-genome assembly and phenotypic analysis of five Salmonella enterica strains isolated in the 1980s from milk powder in Munich, Germany. All strains exhibited the unusual phenotype of lactose-fermentation and encoded either of two variants of the lac operon. Surprisingly, all strains encoded the mobilized colistin resistance gene 9 (mcr-9), long before the first report of this gene in the literature. In two cases, the mcr-9 gene and the lac locus were linked within a large gene island that formed an IncHI2A-type plasmid in one strain but was chromosomally integrated in the other strain. In two other strains, the mcr-9 gene was found on a large IncHI1B/IncP-type plasmid, whereas the lac locus was encoded on a separate chromosomally integrated plasmidic island. The mcr-9 sequences were identical and genomic contexts could not explain the wide range of colistin resistances exhibited by the Salmonella strains. Nucleotide variants did explain phenotypic differences in motility and exopolysaccharide production. The observed linkage of mcr-9 to lactose metabolism, an array of heavy-metal detoxification systems, and other antibiotic resistance genes may reflect a coalescence of specialized phenotypes that improve the spread of colistin resistance in dairy facilities, much earlier than previously suspected.
Affiliation(s)
- Carsten Kröger
- Department of Microbiology, School of Genetics and Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin 2, Ireland
| | - Nicole A. Lerminiaux
- Department of Biology, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Institute for Microbial Systems and Society, Faculty of Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
| | - Anna S. Ershova
- Department of Microbiology, School of Genetics and Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin 2, Ireland
| | - Keith D. MacKenzie
- Department of Biology, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Institute for Microbial Systems and Society, Faculty of Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
| | - Morgan W. Kirzinger
- Department of Biology, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Institute for Microbial Systems and Society, Faculty of Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Present address: National Research Council Canada, Saskatoon, Saskatchewan, S7N 0W9, Canada
| | - Erwin Märtlbauer
- Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University Munich, Oberschleißheim, 85764, Germany
| | - Benjamin J. Perry
- Department of Biology, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Present address: AgResearch, 176 Puddle Alley, Mosgiel 9092, New Zealand
| | - Andrew D. S. Cameron
- Department of Biology, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
- Institute for Microbial Systems and Society, Faculty of Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
| | - Kristina Schauer
- Department of Microbiology, School of Genetics and Microbiology, Moyne Institute of Preventive Medicine, Trinity College Dublin, Dublin 2, Ireland
- Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University Munich, Oberschleißheim, 85764, Germany
|
14
|
Wierzbicki F, Kofler R. The composition of piRNA clusters in Drosophila melanogaster deviates from expectations under the trap model. BMC Biol 2023; 21:224. [PMID: 37858221 PMCID: PMC10588112 DOI: 10.1186/s12915-023-01727-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 10/06/2023] [Indexed: 10/21/2023]
Abstract
BACKGROUND It is widely assumed that the invasion of a transposable element (TE) in mammals and invertebrates is stopped when a copy of the TE jumps into a piRNA cluster (i.e., the trap model). However, recent works, which for example showed that deletion of three major piRNA clusters has no effect on TE activity, cast doubt on the trap model. RESULTS Here, we test the trap model from a population genetics perspective. Our simulations show that the composition of regions that act as transposon traps (i.e., potentially piRNA clusters) ought to deviate from regions that have no effect on TE activity. We investigated TEs in five Drosophila melanogaster strains using three complementary approaches to test whether the composition of piRNA clusters matches these expectations. We found that the abundance of TE families inside and outside of piRNA clusters is highly correlated, although this is not expected under the trap model. Furthermore, the distribution of the number of TE insertions in piRNA clusters is also much broader than expected. CONCLUSIONS We found that the observed composition of piRNA clusters is not in agreement with expectations under the simple trap model. Dispersed piRNA-producing TE insertions and temporal as well as spatial heterogeneity of piRNA clusters may account for these deviations.
Affiliation(s)
- Filip Wierzbicki
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
| | - Robert Kofler
- Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria.
|
15
|
Chaturvedi A, Li X, Dhandapani V, Marshall H, Kissane S, Cuenca-Cambronero M, Asole G, Calvet F, Ruiz-Romero M, Marangio P, Guigó R, Rago D, Mirbahai L, Eastwood N, Colbourne J, Zhou J, Mallon E, Orsini L. The hologenome of Daphnia magna reveals possible DNA methylation and microbiome-mediated evolution of the host genome. Nucleic Acids Res 2023; 51:9785-9803. [PMID: 37638757 PMCID: PMC10570034 DOI: 10.1093/nar/gkad685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 07/07/2023] [Accepted: 08/09/2023] [Indexed: 08/29/2023]
Abstract
Properties that make organisms ideal laboratory models in developmental and medical research are often the ones that also make them less representative of wild relatives. The waterflea Daphnia magna is an exception, by both sharing many properties with established laboratory models and being a keystone species, a sentinel species for assessing water quality, an indicator of environmental change and an established ecotoxicology model. Yet, Daphnia's potential has not been fully exploited because of the challenges associated with assembling and annotating its gene-rich genome. Here, we present the first hologenome of Daphnia magna, consisting of a chromosomal-level assembly of the D. magna genome and the draft assembly of its metagenome. By sequencing and mapping transcriptomes from exposures to environmental conditions and from developmental morphological landmarks, we expand the previously annotated gene set for this species. We also provide evidence for the potential role of gene-body DNA methylation as a mutagen mediating genome evolution. For the first time, our study shows that the gut microbes provide resistance to commonly used antibiotics and virulence factors, potentially mediating Daphnia's environmental-driven rapid evolution. Key findings in this study improve our understanding of the contribution of DNA methylation and gut microbiota to genome evolution in response to rapidly changing environments.
Affiliation(s)
- Anurag Chaturvedi
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Xiaojing Li
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Vignesh Dhandapani
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Hollie Marshall
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
- Department of Genetics and Genome Biology, the University of Leicester, Leicester LE1 7RH, UK
| | - Stephen Kissane
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Maria Cuenca-Cambronero
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
- Aquatic Ecology Group, University of Vic - Central University of Catalonia, 08500 Vic, Spain
| | - Giovanni Asole
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Catalonia, Spain
| | - Ferriol Calvet
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Catalonia, Spain
| | - Marina Ruiz-Romero
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Catalonia, Spain
| | - Paolo Marangio
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Catalonia, Spain
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Catalonia, Spain
| | - Daria Rago
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Leda Mirbahai
- Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK
| | - Niamh Eastwood
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - John K Colbourne
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Jiarui Zhou
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
| | - Eamonn Mallon
- Department of Genetics and Genome Biology, the University of Leicester, Leicester LE1 7RH, UK
| | - Luisa Orsini
- Environmental Genomics Group, School of Biosciences, and Institute for Interdisciplinary Data Science and AI, the University of Birmingham, Birmingham B15 2TT, UK
- The Alan Turing Institute, British Library, London NW1 2DB, UK
|
16
|
Kim BY, Gellert HR, Church SH, Suvorov A, Anderson SS, Barmina O, Beskid SG, Comeault AA, Crown KN, Diamond SE, Dorus S, Fujichika T, Hemker JA, Hrcek J, Kankare M, Katoh T, Magnacca KN, Martin RA, Matsunaga T, Medeiros MJ, Miller DE, Pitnick S, Simoni S, Steenwinkel TE, Schiffer M, Syed ZA, Takahashi A, Wei KHC, Yokoyama T, Eisen MB, Kopp A, Matute D, Obbard DJ, O'Grady PM, Price DK, Toda MJ, Werner T, Petrov DA. Single-fly assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.02.560517. [PMID: 37873137 PMCID: PMC10592941 DOI: 10.1101/2023.10.02.560517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser-studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group.
Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
Affiliation(s)
| | | | - Samuel H Church
- Department of Ecology and Evolutionary Biology, Yale University, USA
| | - Anton Suvorov
- Department of Biological Sciences, Virginia Tech, USA
| | - Sean S Anderson
- Department of Biology, University of North Carolina Chapel Hill, USA
| | - Olga Barmina
- Department of Evolution and Ecology, University of California Davis, USA
| | | | - Aaron A Comeault
- School of Environmental and Natural Sciences, Bangor University, UK
| | - K Nicole Crown
- Department of Biology, Case Western Reserve University, USA
| | | | - Steve Dorus
- Center for Reproductive Evolution, Department of Biology, Syracuse University, USA
| | - Takako Fujichika
- Department of Biological Sciences, Tokyo Metropolitan University, Japan
| | - James A Hemker
- Department of Developmental Biology, Stanford University, USA
| | - Jan Hrcek
- Institute of Entomology, Biology Centre, Czech Academy of Sciences, Czechia
| | - Maaria Kankare
- Department of Biological and Environmental Science, University of Jyväskylä, Finland
| | - Toru Katoh
- Department of Biological Sciences, Hokkaido University, Japan
| | - Karl N Magnacca
- Hawaii Invertebrate Program, Division of Forestry & Wildlife, State of Hawaii, USA
| | - Ryan A Martin
- Department of Biology, Case Western Reserve University, USA
| | - Teruyuki Matsunaga
- Department of Complexity Science and Engineering, The University of Tokyo, Japan
| | | | - Danny E Miller
- Division of Genetic Medicine, Department of Pediatrics; Department of Laboratory Medicine and Pathology, University of Washington, USA
| | - Scott Pitnick
- Center for Reproductive Evolution, Department of Biology, Syracuse University, USA
| | - Sara Simoni
- Department of Biology, Stanford University, USA
| | | | - Michele Schiffer
- Daintree Rainforest Observatory, James Cook University, Australia
| | - Zeeshan A Syed
- Center for Reproductive Evolution, Department of Biology, Syracuse University, USA
| | - Aya Takahashi
- Department of Biological Sciences, Tokyo Metropolitan University, Japan
| | - Kevin H-C Wei
- Department of Zoology, The University of British Columbia
| | | | - Michael B Eisen
- Department of Cell and Molecular Biology, University of California Berkeley, United States
- Howard Hughes Medical Institute, University of California Berkeley, United States
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, USA
| | - Daniel Matute
- Department of Biology, University of North Carolina Chapel Hill, USA
| | - Darren J Obbard
- Institute of Ecology and Evolution, University of Edinburgh, UK
| | | | - Donald K Price
- School of Life Sciences, University of Nevada Las Vegas, USA
| | | | - Thomas Werner
- Department of Biological Sciences, Michigan Technological University, USA
| | - Dmitri A Petrov
- Department of Biology, Stanford University, USA
- CZ Biohub, Investigator
|
17
|
Laine VN, Sepers B, Lindner M, Gawehns F, Ruuskanen S, van Oers K. An ecologist's guide for studying DNA methylation variation in wild vertebrates. Mol Ecol Resour 2023; 23:1488-1508. [PMID: 35466564 DOI: 10.1111/1755-0998.13624] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/03/2021] [Revised: 03/29/2022] [Accepted: 04/13/2022] [Indexed: 11/30/2022]
Abstract
The field of molecular biology is advancing fast with new powerful technologies, sequencing methods and analysis software being developed constantly. Commonly used tools originally developed for research on humans and model species are now regularly used in ecological and evolutionary research. There is also a growing interest in the causes and consequences of epigenetic variation in natural populations. Studying ecological epigenetics is currently challenging, especially for vertebrate systems, because of the required technical expertise, complications with analyses and interpretation, and limitations in acquiring sufficiently high sample sizes. Importantly, neglecting the limitations of the experimental setup, technology and analyses may affect the reliability and reproducibility, and the extent to which unbiased conclusions can be drawn from these studies. Here, we provide a practical guide for researchers aiming to study DNA methylation variation in wild vertebrates. We review the technical aspects of epigenetic research, concentrating on DNA methylation using bisulfite sequencing, discuss the limitations and possible pitfalls, and how to overcome them through rigid and reproducible data analysis. This review provides a solid foundation for the proper design of epigenetic studies, a clear roadmap on the best practices for correct data analysis and a realistic view on the limitations for studying ecological epigenetics in vertebrates. This review will help researchers studying the ecological and evolutionary implications of epigenetic variation in wild populations.
Affiliation(s)
- Veronika N Laine
- Finnish Museum of Natural History, University of Helsinki, Helsinki, Finland
| | - Bernice Sepers
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
- Behavioural Ecology Group, Wageningen University & Research (WUR), Wageningen, The Netherlands
| | - Melanie Lindner
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
- Chronobiology Unit, Groningen Institute for Evolutionary Life Sciences (GELIFES), University of Groningen, Groningen, The Netherlands
| | - Fleur Gawehns
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
| | - Suvi Ruuskanen
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Department of Biology, University of Turku, Finland
| | - Kees van Oers
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
- Behavioural Ecology Group, Wageningen University & Research (WUR), Wageningen, The Netherlands
|
18
|
Sproul JS, Hotaling S, Heckenhauer J, Powell A, Marshall D, Larracuente AM, Kelley JL, Pauls SU, Frandsen PB. Analyses of 600+ insect genomes reveal repetitive element dynamics and highlight biodiversity-scale repeat annotation challenges. Genome Res 2023; 33:1708-1717. [PMID: 37739812 PMCID: PMC10691545 DOI: 10.1101/gr.277387.122] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/06/2022] [Accepted: 09/20/2023] [Indexed: 09/24/2023]
Abstract
Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE-gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies, we detected ∼36% more REs than short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, whereas DNA transposons and LINEs showed less respective technology-related bias. In most insect lineages, 25%-85% of repetitive sequences were "unclassified" following automated annotation, compared with only ∼13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress toward this goal.
Affiliation(s)
- John S Sproul
- Department of Biology, Brigham Young University, Provo, Utah 84602, USA
- Department of Biology, University of Nebraska Omaha, Omaha, Nebraska 68182, USA
- Department of Biology, University of Rochester, Rochester, New York 14627, USA
| | - Scott Hotaling
- School of Biological Sciences, Washington State University, Pullman, Washington 99163, USA
- Department of Watershed Sciences, Utah State University, Logan, Utah 84322, USA
| | - Jacqueline Heckenhauer
- LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
- Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt, Germany
| | - Ashlyn Powell
- Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah 84602, USA
| | - Dez Marshall
- Department of Biology, University of Nebraska Omaha, Omaha, Nebraska 68182, USA
| | | | - Joanna L Kelley
- School of Biological Sciences, Washington State University, Pullman, Washington 99163, USA
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, California 95064, USA
| | - Steffen U Pauls
- LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
- Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt, Germany
- Department of Insect Biotechnology, Justus-Liebig-University Gießen, 35392 Gießen, Germany
| | - Paul B Frandsen
- LOEWE Center for Translational Biodiversity Genomics (LOEWE-TBG), 60325 Frankfurt, Germany
- Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah 84602, USA
- Data Science Lab, Smithsonian Institution, Washington, District of Columbia 20560, USA
|
19
|
Angst P, Pombert JF, Ebert D, Fields PD. Near chromosome-level genome assembly of the microsporidium Hamiltosporidium tvaerminnensis. G3 (BETHESDA, MD.) 2023; 13:jkad185. [PMID: 37565496 PMCID: PMC10542269 DOI: 10.1093/g3journal/jkad185] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/02/2023] [Accepted: 08/05/2023] [Indexed: 08/12/2023]
Abstract
Microsporidia are intracellular parasitic fungi whose genomes rank among the smallest of all known eukaryotes. A number of outstanding questions remain concerning the evolution of their large-scale variation in genome architecture, responsible for genome size variation of more than an order of magnitude. This genome report presents the first near-chromosomal assembly of a large-genome microsporidium, Hamiltosporidium tvaerminnensis. Combined Oxford Nanopore, Pacific Biosciences (PacBio), and Illumina sequencing led to a genome assembly of 17 contigs, 11 of which represent complete chromosomes. Our assembly is 21.64 Mb in length, has an N50 of 1.44 Mb, and consists of 39.56% interspersed repeats. We introduce a novel approach in microsporidia, PacBio Iso-Seq, as part of a larger annotation pipeline for obtaining high-quality annotations of 3,573 protein-coding genes. Based on direct evidence from the full-length Iso-Seq transcripts, we present evidence for alternative polyadenylation and variation in splicing efficiency, which are potential regulation mechanisms for gene expression in microsporidia. The generated high-quality genome assembly is a necessary resource for comparative genomics that will help elucidate the evolution of genome architecture in response to intracellular parasitism.
Affiliation(s)
- Pascal Angst
- Department of Environmental Sciences, Zoology, University of Basel, Basel 4051, Switzerland
| | | | - Dieter Ebert
- Department of Environmental Sciences, Zoology, University of Basel, Basel 4051, Switzerland
| | - Peter D Fields
- Department of Environmental Sciences, Zoology, University of Basel, Basel 4051, Switzerland
|
20
|
Huang P, Li C, Lin F, Liu Y, Zong Y, Li B, Zheng Y. Chromosome-level genome assembly and population genetic analysis of a near-threatened rosewood species (Dalbergia cultrata Pierre Graham ex Benth) provide insights into its evolutionary and cold stress responses. FRONTIERS IN PLANT SCIENCE 2023; 14:1212967. [PMID: 37810393 PMCID: PMC10552272 DOI: 10.3389/fpls.2023.1212967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 08/28/2023] [Indexed: 10/10/2023]
Abstract
Dalbergia cultrata Pierre Graham ex Benth (D. cultrata) is a precious rosewood tree species that grows in the tropical and subtropical regions of Asia. In this study, we used PacBio long-read sequencing technology and Hi-C assistance to sequence and assemble the reference genome of D. cultrata. We generated 171.47 Gb of PacBio long reads and 72.43 Gb of Hi-C data and obtained an assembly of 10 pseudochromosomes with a total size of 690.99 Mb and a scaffold N50 of 65.76 Mb. The analysis of specific genes revealed that the triterpenoids represented by lupeol may play an important role in D. cultrata's potential medicinal value. Using the new reference genome, we analyzed resequencing data from 19 Dalbergia accessions and found that D. cultrata and D. cochinchinensis have the closest genetic relationship. Transcriptome sequencing of D. cultrata leaves grown under cold stress revealed that MYB transcription factors and E3 ubiquitin ligases may play an important role in the cold response of D. cultrata. Genome resources and identified genetic variation, especially in genes related to the biosynthesis of phytochemicals and the cold stress response, will be helpful for the introduction, domestication, utilization, and further breeding of Dalbergia species.
Affiliation(s)
- Ping Huang
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Changhong Li
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Furong Lin
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Yu Liu
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
- Wenzhou Key Laboratory of Resource Plant Innovation and Utilization, Zhejiang Institute of Subtropical Crops, Zhejiang Academy of Agricultural Sciences, Wenzhou, Zhejiang, China
| | - Yichen Zong
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Bin Li
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
| | - Yongqi Zheng
- State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing, China
- Laboratory of Forest Silviculture and Tree Cultivation, National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China
|
21
|
Flack N, Drown M, Walls C, Pratte J, McLain A, Faulk C. Chromosome-level, nanopore-only genome and allele-specific DNA methylation of Pallas's cat, Otocolobus manul. NAR Genom Bioinform 2023; 5:lqad033. [PMID: 37025970 PMCID: PMC10071556 DOI: 10.1093/nargab/lqad033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2022] [Revised: 02/10/2023] [Accepted: 03/17/2023] [Indexed: 04/07/2023] Open
Abstract
Pallas's cat, or the manul cat (Otocolobus manul), is a small felid native to the grasslands and steppes of central Asia. Population strongholds in Mongolia and China face growing challenges from climate change, habitat fragmentation, poaching, and other sources. These threats, combined with O. manul's zoo collection popularity and value in evolutionary biology, necessitate improvement of species genomic resources. We used standalone nanopore sequencing to assemble a 2.5 Gb, 61-contig nuclear assembly and a 17,097 bp mitogenome for O. manul. The primary nuclear assembly had 56× sequencing coverage, a contig N50 of 118 Mb, and a 94.7% BUSCO completeness score for Carnivora-specific genes. High genome collinearity within Felidae permitted alignment-based scaffolding onto the fishing cat (Prionailurus viverrinus) reference genome. Manul contigs spanned all 19 felid chromosomes with an inferred total gap length of less than 400 kilobases. Modified basecalling and variant phasing produced an alternate pseudohaplotype assembly and allele-specific DNA methylation calls; 61 differentially methylated regions were identified between haplotypes. Nearest features included classical imprinted genes, non-coding RNAs, and putative novel imprinted loci. The assembled mitogenome successfully resolved existing discordance between Felinae nuclear and mtDNA phylogenies. All assembly drafts were generated from 158 Gb of sequence using seven MinION flow cells.
Affiliation(s)
- Nicole Flack
- Department of Veterinary and Biomedical Sciences, University of Minnesota, Saint Paul, MN 55108, USA
| | - Melissa Drown
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
| | - Carrie Walls
- Department of Animal Science, University of Minnesota, Saint Paul, MN 55108, USA
| | - Jay Pratte
- Bloomington Parks and Recreation, Miller Park Zoo, Bloomington, IL 61701, USA
| | - Adam McLain
- Department of Biology and Chemistry, SUNY Polytechnic Institute, Utica, NY 13502, USA
| | - Christopher Faulk
- Department of Animal Science, University of Minnesota, Saint Paul, MN 55108, USA
|
22
|
Srivastav S, Feschotte C, Clark AG. Rapid evolution of piRNA clusters in the Drosophila melanogaster ovary. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.08.539910. [PMID: 37214865 PMCID: PMC10197564 DOI: 10.1101/2023.05.08.539910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Animal genomes are parasitized by a horde of transposable elements (TEs) whose mutagenic activity can have catastrophic consequences. The piRNA pathway is a conserved mechanism to repress TE activity in the germline via a specialized class of small RNAs associated with effector Piwi proteins called piwi-associated RNAs (piRNAs). piRNAs are produced from discrete genomic regions called piRNA clusters (piCs). While piCs are generally enriched for TE sequences and the molecular processes by which they are transcribed and regulated are relatively well understood in Drosophila melanogaster, much less is known about the origin and evolution of piCs in this or any other species. To investigate piC evolution, we use a population genomics approach to compare piC activity and sequence composition across 8 geographically distant strains of D. melanogaster with high-quality long-read genome assemblies. We perform extensive annotations of ovary piCs and TE content in each strain and test predictions of two proposed models of piC evolution. The 'de novo' model posits that individual TE insertions can spontaneously attain the status of a small piC to generate piRNAs silencing the entire TE family. The 'trap' model envisions large and evolutionarily stable genomic clusters where TEs tend to accumulate and which serve as a long-term "memory" of ancient TE invasions and produce a great variety of piRNAs protecting against related TEs entering the genome. It remains unclear which model best describes the evolution of piCs. Our analysis uncovers extensive variation in piC activity across strains and signatures of rapid birth and death of piCs in natural populations. Most TE families inferred to be recently or currently active show an enrichment of strain-specific insertions into large piCs, consistent with the trap model.
By contrast, only a small subset of active LTR retrotransposon families is enriched for the formation of strain-specific piCs, suggesting that these families have an inherent proclivity to form de novo piCs. Thus, our findings support aspects of both 'de novo' and 'trap' models of piC evolution. We propose that these two models represent two extreme stages along an evolutionary continuum, which begins with the emergence of piCs de novo from a few specific LTR retrotransposon insertions that subsequently expand by accretion of other TE insertions during evolution to form larger 'trap' clusters. Our study shows that piCs are evolutionarily labile and that TEs themselves are the major force driving the formation and evolution of piCs.
Affiliation(s)
- Satyam Srivastav
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
| | - Cédric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
| | - Andrew G. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, USA
|
23
|
Mohamed M, Sabot F, Varoqui M, Mugat B, Audouin K, Pélisson A, Fiston-Lavier AS, Chambeyron S. TrEMOLO: accurate transposable element allele frequency estimation using long-read sequencing data combining assembly and mapping-based approaches. Genome Biol 2023; 24:63. [PMID: 37013657 PMCID: PMC10069131 DOI: 10.1186/s13059-023-02911-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/21/2022] [Accepted: 03/23/2023] [Indexed: 04/05/2023] Open
Abstract
Transposable Element MOnitoring with LOng-reads (TrEMOLO) is a new software tool that combines assembly- and mapping-based approaches to robustly detect genetic elements called transposable elements (TEs). Using high- or low-quality genome assemblies, TrEMOLO can detect most TE insertions and deletions and estimate their allele frequency in populations. Benchmarking with simulated data revealed that TrEMOLO outperforms other state-of-the-art computational tools. TE detection and frequency estimation by TrEMOLO were validated using simulated and experimental datasets. Therefore, TrEMOLO is a comprehensive and suitable tool to accurately study TE dynamics. TrEMOLO is available under GNU GPL3.0 at https://github.com/DrosophilaGenomeEvolution/TrEMOLO.
Affiliation(s)
- Mourdas Mohamed
- Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
| | - François Sabot
- DIADE, University of Montpellier, CIRAD, IRD, Montpellier, France
- IFB - Southgreen Bioversity, CIRAD, INRAE, IRD, Montpellier, France
| | - Marion Varoqui
- Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
| | - Bruno Mugat
- Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
| | | | - Alain Pélisson
- Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
| | - Anna-Sophie Fiston-Lavier
- ISEM, Université Montpellier, CNRS, IRD, CIRAD, EPHE, Montpellier, France
- Institut Universitaire de France (IUF), Paris, France
| | - Séverine Chambeyron
- Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
|
24
|
Payne ZL, Penny GM, Turner TN, Dutcher SK. A gap-free genome assembly of Chlamydomonas reinhardtii and detection of translocations induced by CRISPR-mediated mutagenesis. PLANT COMMUNICATIONS 2023; 4:100493. [PMID: 36397679 PMCID: PMC10030371 DOI: 10.1016/j.xplc.2022.100493] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/27/2022] [Revised: 10/26/2022] [Accepted: 11/15/2022] [Indexed: 05/04/2023]
Abstract
Genomic assemblies of the unicellular green alga Chlamydomonas reinhardtii have provided important resources for researchers. However, assembly errors, large gaps, and unplaced scaffolds as well as strain-specific variants currently impede many types of analysis. By combining PacBio HiFi and Oxford Nanopore long-read technologies, we generated a de novo genome assembly for strain CC-5816, derived from crosses of strains CC-125 and CC-124. Multiple methods of evaluating genome completeness and base-pair error rate suggest that the final telomere-to-telomere assembly is highly accurate. The CC-5816 assembly enabled previously difficult analyses that include characterization of the 17 centromeres, rDNA arrays on three chromosomes, and 56 insertions of organellar DNA into the nuclear genome. Using Nanopore sequencing, we identified sites of cytosine (CpG) methylation, which are enriched at centromeres. We analyzed CRISPR-Cas9 insertional mutants in the PF23 gene. Two of the three alleles produced progeny that displayed patterns of meiotic inviability that suggested the presence of a chromosomal aberration. Mapping Nanopore reads from pf23-2 and pf23-3 onto the CC-5816 genome showed that these two strains each carry a translocation that was initiated at the PF23 gene locus on chromosome 11 and joined with chromosomes 5 or 3, respectively. The translocations were verified by demonstrating linkage between loci on the two translocated chromosomes in meiotic progeny. The three pf23 alleles display the expected short-cilia phenotype, and immunoblotting showed that pf23-2 lacks the PF23 protein. Our CC-5816 genome assembly will undoubtedly provide an important tool for the Chlamydomonas research community.
Affiliation(s)
- Zachary L Payne
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
| | - Gervette M Penny
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
| | - Tychele N Turner
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
| | - Susan K Dutcher
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA
|
25
|
Willey C, Korstanje R. Sequencing and assembling bear genomes: the bare necessities. Front Zool 2022; 19:30. [PMID: 36451195 PMCID: PMC9710173 DOI: 10.1186/s12983-022-00475-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Accepted: 11/08/2022] [Indexed: 12/12/2022] Open
Abstract
Unique genetic adaptations are present in bears of every species across the world. From (nearly) shutting down important organs during hibernation to preventing harm from lifestyles that could easily cause metabolic diseases in humans, bears may hold the answer to various human ailments. However, only a few of these unique traits are currently being investigated at the molecular level, partly because of the lack of necessary tools. One of these tools is well-annotated genome assemblies from the different, extant bear species. These reference genomes are needed to allow us to identify differences in genetic variants, isoforms, gene expression, and genomic features such as transposons and identify those that are associated with biomedical-relevant traits. In this review we assess the current state of the genome assemblies of the eight different bear species, discuss current gaps, and the future benefits these reference genomes may have in informing human biomedical applications, while at the same time improving bear conservation efforts.
|
26
|
ONT-Based Alternative Assemblies Impact on the Annotations of Unique versus Repetitive Features in the Genome of a Romanian Strain of Drosophila melanogaster. Int J Mol Sci 2022; 23:ijms232314892. [PMID: 36499217 PMCID: PMC9741293 DOI: 10.3390/ijms232314892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/21/2022] [Accepted: 11/24/2022] [Indexed: 11/29/2022] Open
Abstract
To date, different strategies of whole-genome sequencing (WGS) have been developed in order to understand genome structure and function. However, the analysis of genomic sequences obtained from natural populations is challenging and the biological interpretation of sequencing data remains the main issue. The MinION device developed by Oxford Nanopore Technologies (ONT) is able to generate long reads with minimal costs and time requirements. These valuable assets qualify it as a suitable method for performing WGS, especially in small laboratories. The long reads resulting from this sequencing approach can cover large structural variants and repetitive sequences commonly present in the genomes of eukaryotes. Using MinION, we performed two WGS assessments of a Romanian local strain of Drosophila melanogaster, referred to as Horezu_LaPeri (Horezu). In total, 1,317,857 reads with a size of 8.9 gigabytes (Gb) were generated. Canu and Flye de novo assembly tools were employed to obtain four distinct assemblies with both unfiltered and filtered reads, achieving maximum reference genome coverages of 94.8% (Canu) and 91.4% (Flye). In order to test the quality of these assemblies, we performed a two-step evaluation. Firstly, we considered the BUSCO scores and searched for a supplemental set of genes using BLAST. Subsequently, we appraised the total content of natural transposons (NTs) relative to the reference genome (ISO1 strain) and mapped the mdg1 retroelement as a resolution assayer. Our results reveal that filtered data provide only slightly enhanced results when considering gene identification, but the use of unfiltered data had a consistent positive impact on the global evaluation of the NTs content. Our comparative studies also revealed differences between Flye and Canu assemblies regarding the annotation of unique versus repetitive genomic features.
In our hands, Flye proved to be moderately better for gene identification, while Canu clearly outperformed Flye for NTs analysis. Data concerning the NTs content were compared to those obtained with ONT for the D. melanogaster ISO1 strain, revealing that our strategy led to better results. Additionally, the parameters of our ONT reads and assemblies are similar to those reported for ONT experiments performed on various model organisms, revealing that our assembly data are appropriate for a proficient annotation of the Horezu genome.
|
27
|
Gao JJ, Barmina O, Thompson A, Kim BY, Suvorov A, Tanaka K, Watabe H, Toda MJ, Chen JM, Katoh TK, Kopp A. Secondary reversion to sexual monomorphism associated with tissue-specific loss of doublesex expression. Evolution 2022; 76:2089-2104. [PMID: 35841603 DOI: 10.1111/evo.14564] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/16/2022] [Revised: 06/23/2022] [Accepted: 06/27/2022] [Indexed: 01/22/2023]
Abstract
Animal evolution is characterized by frequent turnover of sexually dimorphic traits: new sex-specific characters are gained, and some ancestral sex-specific characters are lost, in many lineages. In insects, sexual differentiation is predominantly cell autonomous and depends on the expression of the doublesex (dsx) transcription factor. In most cases, cells that transcribe dsx have the potential to undergo sex-specific differentiation, while those that lack dsx expression do not. Consistent with this mode of development, comparative research has shown that the origin of new sex-specific traits can be associated with the origin of new spatial domains of dsx expression. In this report, we examine the opposite situation: a secondary loss of the sex comb, a male-specific grasping structure that develops on the front legs of some drosophilid species. We show that while the origin of the sex comb is linked to an evolutionary gain of dsx expression in the leg, sex comb loss in a newly identified species of Lordiphosa (Drosophilidae) is associated with a secondary loss of dsx expression. We discuss how the developmental control of sexual dimorphism affects the mechanisms by which sex-specific traits can evolve.
Affiliation(s)
- Jian-Jun Gao
- Yunnan Key Laboratory of Plant Reproductive Adaptation and Evolutionary Ecology, Yunnan University, China
- State Key Laboratory for Conservation and Utilization of Bioresources in Yunnan, Yunnan University, China
| | - Olga Barmina
- Department of Evolution and Ecology, University of California Davis, Davis, CA, 95616, USA
| | - Ammon Thompson
- Department of Evolution and Ecology, University of California Davis, Davis, CA, 95616, USA
| | - Bernard Y Kim
- Department of Biology, Stanford University, Stanford, CA, 94305, USA
| | - Anton Suvorov
- Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Kohtaro Tanaka
- Department of Evolution and Ecology, University of California Davis, Davis, CA, 95616, USA
| | - Hideaki Watabe
- The Hokkaido University Museum, Kita-10, Nishi-8, Kitaku, Sapporo, 060-0810, Japan
| | - Masanori J Toda
- The Hokkaido University Museum, Kita-10, Nishi-8, Kitaku, Sapporo, 060-0810, Japan
| | - Ji-Min Chen
- State Key Laboratory for Conservation and Utilization of Bioresources in Yunnan, Yunnan University, China
| | - Takehiro K Katoh
- Yunnan Key Laboratory of Plant Reproductive Adaptation and Evolutionary Ecology, Yunnan University, China
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, Davis, CA, 95616, USA
|
28
|
Lee Y, Ha U, Moon S. Ongoing endeavors to detect mobilization of transposable elements. BMB Rep 2022. [PMID: 35725016 PMCID: PMC9340088 DOI: 10.5483/bmbrep.2022.55.7.088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Transposable elements (TEs) are DNA sequences capable of mobilization from one location to another in the genome. Since the discovery of the 'Dissociation (Ds) locus' by Barbara McClintock in maize (1), mounting evidence in the era of genomics indicates that a significant fraction of most eukaryotic genomes is composed of TE sequences, which are involved in various aspects of biological processes such as development, physiology, disease and evolution. Although technical advances in genomics have revealed numerous functional impacts of TEs across species, our understanding of TEs is still incomplete owing to challenges resulting from the complexity and abundance of TEs in the genome. In this mini-review, we briefly summarize the biology of TEs and their impacts on the host genome, emphasizing the importance of understanding the TE landscape in the genome. Then, we introduce recent endeavors, especially in vivo retrotransposition assays and long-read sequencing technology, for identifying de novo insertions and TE polymorphisms, which will broaden our knowledge of the extraordinary relationship between genomic cohabitants and their host.
Affiliation(s)
- Yujeong Lee
- Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
| | - Una Ha
- Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
| | - Sungjin Moon
- Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
|
29
|
Corrochano-Fraile A, Davie A, Carboni S, Bekaert M. Evidence of multiple genome duplication events in Mytilus evolution. BMC Genomics 2022; 23:340. [PMID: 35501689 PMCID: PMC9063065 DOI: 10.1186/s12864-022-08575-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/25/2021] [Accepted: 04/20/2022] [Indexed: 12/12/2022] Open
Abstract
Background: Molluscs remain a significantly under-represented taxon among available genomic resources, despite being the second-largest animal phylum and despite recent advances in genome sequencing technologies and genome assembly techniques. With the present work, we contribute to the growing efforts to fill this gap, presenting a new high-quality reference genome for Mytilus edulis and investigating the evolutionary history within the Mytilidae family, in relation to other species in the class Bivalvia. Results: Here we present, for the first time, the discovery of multiple whole genome duplication events in the Mytilidae family and, more generally, in the class Bivalvia. In addition, the calculation of evolution rates for three species of the Mytilinae subfamily sheds new light on the evolution of these taxa and highlights key orthologs of interest for the study of Mytilus species divergences. Conclusions: The reference genome presented here will enable the correct identification of molecular markers for evolutionary, population genetics, and conservation studies. Mytilidae have the capability to become a model shellfish for climate change adaptation using genome-enabled systems biology and multi-disciplinary studies of interactions between abiotic stressors, pathogen attacks, and aquaculture practises. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08575-9.
Affiliation(s)
- Ana Corrochano-Fraile
- Faculty of Natural Sciences, Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK
| | - Andrew Davie
- Faculty of Natural Sciences, Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK
| | - Stefano Carboni
- Faculty of Natural Sciences, Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK.,International Marine Centre, Loc. Sa Mardini snc, 09170, Torre Grande, OR, Italy
| | - Michaël Bekaert
- Faculty of Natural Sciences, Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, UK
| |
|
30
|
Rech GE, Radío S, Guirao-Rico S, Aguilera L, Horvath V, Green L, Lindstadt H, Jamilloux V, Quesneville H, González J. Population-scale long-read sequencing uncovers transposable elements associated with gene expression variation and adaptive signatures in Drosophila. Nat Commun 2022; 13:1948. [PMID: 35413957 PMCID: PMC9005704 DOI: 10.1038/s41467-022-29518-8] [Citation(s) in RCA: 43] [Impact Index Per Article: 21.5] [Received: 03/22/2021] [Accepted: 03/15/2022] [Indexed: 12/16/2022] Open
Abstract
High-quality reference genomes are crucial to understanding genome function, structure and evolution. The availability of reference genomes has allowed us to start inferring the role of genetic variation in biology, disease, and biodiversity conservation. However, analyses across organisms demonstrate that a single reference genome is not enough to capture the global genetic diversity present in populations. In this work, we generate 32 high-quality reference genomes for the well-known model species D. melanogaster and focus on the identification and analysis of transposable element variation, as transposable elements are the most common type of structural variant. We show that integrating the genetic variation across natural populations from five climatic regions increases the number of detected insertions by 58%. Moreover, 26% to 57% of the insertions identified using long reads were missed by short-read methods. We also identify hundreds of transposable elements associated with gene expression variation and new TE variants likely to contribute to adaptive evolution in this species. Our results highlight the importance of incorporating the genetic variation present in natural populations into genomic studies, which is essential if we are to understand how genomes function and evolve.
Affiliation(s)
- Gabriel E Rech
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Santiago Radío
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Sara Guirao-Rico
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Laura Aguilera
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Vivien Horvath
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Llewellyn Green
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | - Hannah Lindstadt
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
| | | | | | - Josefa González
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain.
| |
|
31
|
Widanagama SD, Freeland JR, Xu X, Shafer ABA. Genome assembly, annotation, and comparative analysis of the cattail Typha latifolia. G3 GENES|GENOMES|GENETICS 2022; 12:6433155. [PMID: 34871392 PMCID: PMC9210280 DOI: 10.1093/g3journal/jkab401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 11/13/2021] [Indexed: 11/19/2022]
Abstract
Cattails (Typha species) comprise a genus of emergent wetland plants with a global distribution. Typha latifolia and Typha angustifolia are two of the most widespread species, and in areas of sympatry can interbreed to produce the hybrid Typha × glauca. In some regions, the relatively high fitness of Typha × glauca allows it to outcompete and displace both parent species, while simultaneously reducing plant and invertebrate biodiversity, and modifying nutrient and water cycling. We generated a high-quality whole-genome assembly of T. latifolia using PacBio long-read and high-coverage Illumina sequences that will facilitate evolutionary and ecological studies in this hybrid zone. Genome size was 287 Mb and consisted of 1158 scaffolds, with an N50 of 8.71 Mb; 43.84% of the genome was identified as repetitive elements. The assembly has a BUSCO score of 96.03%, and 27,432 genes and 2700 RNA sequences were putatively identified. Comparative analysis detected over 9000 shared orthologs with related taxa, and phylogenomic analysis supported T. latifolia as a divergent lineage within Poales. This high-quality scaffold-level reference genome will provide a useful resource for future population genomic analyses and improve our understanding of Typha hybrid dynamics.
Affiliation(s)
- Shane D Widanagama
- Department of Computer Science, Trent University, Peterborough, ON K9L 0G2, Canada
| | - Joanna R Freeland
- Department of Biology, Trent University, Peterborough, ON K9L 0G2, Canada
| | - Xinwei Xu
- Department of Ecology, College of Life Sciences, Wuhan University, Wuhan 430072, China
| | - Aaron B A Shafer
- Department of Forensic Sciences, Trent University, Peterborough, ON K9L 0G2, Canada
- Corresponding author: Department of Forensic Sciences, Trent University, DNA Building, 2140 East Bank Drive, Peterborough, ON, K9L 0G2, Canada.
| |
|
32
|
Wierzbicki F, Schwarz F, Cannalonga O, Kofler R. Novel quality metrics allow identifying and generating high-quality assemblies of piRNA clusters. Mol Ecol Resour 2022; 22:102-121. [PMID: 34181811 DOI: 10.1111/1755-0998.13455] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/15/2020] [Revised: 04/30/2021] [Accepted: 06/14/2021] [Indexed: 12/30/2022]
Abstract
In most animals, it is thought that the proliferation of a transposable element (TE) is stopped when the TE jumps into a piRNA cluster. Despite this central importance, little is known about the composition and the evolutionary dynamics of piRNA clusters. This is largely because piRNA clusters are notoriously difficult to assemble as they are frequently composed of highly repetitive DNA. With long reads, we may finally be able to obtain reliable assemblies of piRNA clusters. Unfortunately, it is unclear how to generate and identify the best assemblies, as many assembly strategies exist and standard quality metrics are ignorant of TEs. To address these problems, we introduce several novel quality metrics that assess: (a) the fraction of completely assembled piRNA clusters, (b) the quality of the assembled clusters and (c) whether an assembly captures the overall TE landscape of an organism (i.e. the abundance, the number of SNPs and internal deletions of all TE families). The requirements for computing these metrics vary, ranging from annotations of piRNA clusters to consensus sequences of TEs and genomic sequencing data. Using these novel metrics, we evaluate the effect of assembly algorithm, polishing, read length, coverage and residual polymorphisms, and finally identify strategies that yield reliable assemblies of piRNA clusters. Based on an optimized approach, we provide assemblies for the two Drosophila melanogaster strains Canton-S and Pi2. About 80% of known piRNA clusters were assembled in both strains. Finally, we demonstrate the generality of our approach by extending our metrics to humans and Arabidopsis thaliana.
Affiliation(s)
- Filip Wierzbicki
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria.,Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
| | - Florian Schwarz
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria.,Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
| | | | - Robert Kofler
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
| |
|
33
|
Xie S, Leung AWS, Zheng Z, Zhang D, Xiao C, Luo R, Luo M, Zhang S. Applications and potentials of nanopore sequencing in the (epi)genome and (epi)transcriptome era. Innovation (N Y) 2021; 2:100153. [PMID: 34901902 PMCID: PMC8640597 DOI: 10.1016/j.xinn.2021.100153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/09/2021] [Accepted: 08/09/2021] [Indexed: 02/08/2023] Open
Abstract
The Human Genome Project opened an era of (epi)genomic research and also provided a platform for the development of new sequencing technologies. During and after the project, several sequencing technologies have continued to dominate nucleic acid sequencing markets. Currently, Illumina (short-read), PacBio (long-read), and Oxford Nanopore (long-read) are the most popular sequencing technologies. Unlike PacBio or the popular short-read sequencers before it, which, as examples of second-generation or so-called Next-Generation Sequencing platforms, need to synthesize DNA while sequencing, nanopore technology directly sequences native DNA and RNA molecules. Nanopore sequencing therefore avoids converting mRNA into cDNA molecules, which not only allows for the sequencing of extremely long native DNA and full-length RNA molecules but also documents modifications that have been made to those native DNA or RNA bases. In this review on direct DNA sequencing and direct RNA sequencing using Oxford Nanopore technology, we focus on their development and application achievements, discussing their challenges and future perspectives. We also address the problems researchers may encounter when applying these approaches to their research topics, and how to resolve them.
Affiliation(s)
- Shangqian Xie
- Key Laboratory of Ministry of Education for Genetics and Germplasm Innovation of Tropical Special Trees and Ornamental Plants, College of Forestry, Hainan University, Haikou 570228, China
| | - Amy Wing-Sze Leung
- Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
| | - Zhenxian Zheng
- Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
| | - Dake Zhang
- Beijing Advanced Innovation Centre for Biomedical Engineering, Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
| | - Chuanle Xiao
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou 510060, China
| | - Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
| | - Ming Luo
- Agriculture and Biotechnology Research Center, Guangdong Provincial Key Laboratory of Applied Botany, Center of Economic Botany, Core Botanical Gardens, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
| | - Shoudong Zhang
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
- Center for Soybean Research of the State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
| |
|
34
|
Walsh AT, Triant DA, Le Tourneau JJ, Shamimuzzaman M, Elsik CG. Hymenoptera Genome Database: new genomes and annotation datasets for improved GO enrichment and orthologue analyses. Nucleic Acids Res 2021; 50:D1032-D1039. [PMID: 34747465 PMCID: PMC8728238 DOI: 10.1093/nar/gkab1018] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/02/2021] [Revised: 10/06/2021] [Accepted: 10/12/2021] [Indexed: 01/02/2023] Open
Abstract
We report an update of the Hymenoptera Genome Database (HGD; http://HymenopteraGenome.org), a genomic database of hymenopteran insect species. The number of species represented in HGD has nearly tripled, with fifty-eight hymenopteran species, including twenty bees, twenty-three ants, eleven wasps and four sawflies. With a reorganized website, HGD continues to provide the HymenopteraMine genomic data mining warehouse and JBrowse/Apollo genome browsers integrated with BLAST. We have computed Gene Ontology (GO) annotations for all species, greatly enhancing the GO annotation data gathered from UniProt with more than a ten-fold increase in the number of GO-annotated genes. We have also generated orthology datasets that encompass all HGD species and provide orthologue clusters for fourteen taxonomic groups. The new GO annotation and orthology data are available for searching in HymenopteraMine, and as bulk file downloads.
Affiliation(s)
- Amy T Walsh
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Deborah A Triant
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
| | | | - Md Shamimuzzaman
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Christine G Elsik
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA.,Division of Plant Science & Technology, University of Missouri, Columbia, MO 65211, USA.,MU Institute for Data Science & Informatics, University of Missouri, Columbia, MO 65211, USA
| |
|
35
|
Comparative genomic analysis of different sexes and diet-specific amino acid mutation identification in Ancherythroculter nigrocauda. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY D-GENOMICS & PROTEOMICS 2021; 40:100910. [PMID: 34509952 DOI: 10.1016/j.cbd.2021.100910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/15/2021] [Revised: 08/29/2021] [Accepted: 08/30/2021] [Indexed: 01/27/2023]
Abstract
Determining the sex and controlling the sex ratio are essential aspects of fish genetics that can assist in developing successful fish breeding programs. High-quality genome assembly and annotations are prerequisites to determine sex-specific genes and their expression. In addition, analysis of resequencing data can identify genomic differences between male and female fishes. In this study, we performed chromosome-level genome assembly in female Ancherythroculter nigrocauda fish having low heterozygosity using PacBio reads. High-throughput chromatin conformation capture (HiC) yielded a genome of size 1054.05 Mb, with a contig N50 length of 3.40 Mb and a scaffold N50 length of 42.68 Mb. In addition, we sequenced 5 female and 5 male A. nigrocauda samples and identified sex-specific regions on LG20. Furthermore, diet-specific amino acid mutations were found in 582 genes between herbivorous and carnivorous fishes, with 26 of them exhibiting significantly different expression patterns in the liver tissue of these two types of fishes. The chromosome-level genome assembly of A. nigrocauda provides valuable resources for conducting in-depth comparative genomic studies with immense applications in fish genetic breeding and farming. Similarly, the diet-specific amino acid mutations are useful in the breeding of new strains of carnivorous fishes with an herbivorous diet.
|
36
|
Kim BY, Wang JR, Miller DE, Barmina O, Delaney E, Thompson A, Comeault AA, Peede D, D'Agostino ERR, Pelaez J, Aguilar JM, Haji D, Matsunaga T, Armstrong EE, Zych M, Ogawa Y, Stamenković-Radak M, Jelić M, Veselinović MS, Tanasković M, Erić P, Gao JJ, Katoh TK, Toda MJ, Watabe H, Watada M, Davis JS, Moyle LC, Manoli G, Bertolini E, Košťál V, Hawley RS, Takahashi A, Jones CD, Price DK, Whiteman N, Kopp A, Matute DR, Petrov DA. Highly contiguous assemblies of 101 drosophilid genomes. eLife 2021; 10:e66405. [PMID: 34279216 PMCID: PMC8337076 DOI: 10.7554/elife.66405] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 01/11/2021] [Accepted: 07/16/2021] [Indexed: 12/13/2022] Open
Abstract
Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.
Affiliation(s)
- Bernard Y Kim
- Department of Biology, Stanford University, Stanford, United States
| | - Jeremy R Wang
- Department of Genetics, University of North Carolina, Chapel Hill, United States
| | - Danny E Miller
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, United States
| | - Olga Barmina
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Emily Delaney
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Ammon Thompson
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Aaron A Comeault
- School of Natural Sciences, Bangor University, Bangor, United Kingdom
| | - David Peede
- Biology Department, University of North Carolina, Chapel Hill, United States
| | | | - Julianne Pelaez
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Jessica M Aguilar
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Diler Haji
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Teruyuki Matsunaga
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | | | - Molly Zych
- Molecular and Cellular Biology Program, University of Washington, Seattle, United States
| | - Yoshitaka Ogawa
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
| | | | - Mihailo Jelić
- Faculty of Biology, University of Belgrade, Belgrade, Serbia
| | | | - Marija Tanasković
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
| | - Pavle Erić
- University of Belgrade, Institute for Biological Research "Siniša Stanković", National Institute of Republic of Serbia, Belgrade, Serbia
| | - Jian-Jun Gao
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
| | - Takehiro K Katoh
- School of Ecology and Environmental Science, Yunnan University, Kunming, China
| | | | - Hideaki Watabe
- Biological Laboratory, Sapporo College, Hokkaido University of Education, Sapporo, Japan
| | - Masayoshi Watada
- Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan
| | - Jeremy S Davis
- Department of Biology, University of Kentucky, Lexington, United States
| | - Leonie C Moyle
- Department of Biology, Indiana University, Bloomington, United States
| | - Giulia Manoli
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
| | - Enrico Bertolini
- Neurobiology and Genetics, Theodor Boveri Institute, Biocentre, University of Würzburg, Würzburg, Germany
| | - Vladimír Košťál
- Institute of Entomology, Biology Centre, Academy of Sciences of the Czech Republic, Prague, Czech Republic
| | - R Scott Hawley
- Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Stowers Institute for Medical Research, Kansas City, United States
| | - Aya Takahashi
- Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Japan
| | - Corbin D Jones
- Biology Department, University of North Carolina, Chapel Hill, United States
| | - Donald K Price
- School of Life Science, University of Nevada, Las Vegas, United States
| | - Noah Whiteman
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California Davis, Davis, United States
| | - Daniel R Matute
- Biology Department, University of North Carolina, Chapel Hill, United States
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, United States
| |
|
37
|
Gatter T, von Löhneysen S, Fallmann J, Drozdova P, Hartmann T, Stadler PF. LazyB: fast and cheap genome assembly. Algorithms Mol Biol 2021; 16:8. [PMID: 34074310 PMCID: PMC8168326 DOI: 10.1186/s13015-021-00186-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/31/2021] [Accepted: 05/06/2021] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Advances in genome sequencing over the last years have led to a fundamental paradigm shift in the field. With steadily decreasing sequencing costs, genome projects are no longer limited by the cost of raw sequencing data, but rather by computational problems associated with genome assembly. There is an urgent demand for more efficient and more accurate methods, in particular with regard to the highly complex and often very large genomes of animals and plants. Most recently, "hybrid" methods that integrate short and long read data have been devised to address this need. RESULTS: LazyB is such a hybrid genome assembler. It has been designed specifically with an emphasis on utilizing low-coverage short and long reads. LazyB starts from a bipartite overlap graph between long reads and restrictively filtered short-read unitigs. This graph is translated into a long-read overlap graph G. Instead of the more conventional approach of removing tips, bubbles, and other local features, LazyB extracts, step by step, subgraphs whose global properties approach a disjoint union of paths. First, a consistently oriented subgraph is extracted, which in a second step is reduced to a directed acyclic graph. In the next step, properties of proper interval graphs are used to extract contigs as maximum weight paths. These paths are translated into genomic sequences only in the final step. A prototype implementation of LazyB, entirely written in Python, not only yields significantly more accurate assemblies of the yeast and fruit fly genomes compared to state-of-the-art pipelines but also requires much less computational effort. CONCLUSIONS: LazyB is a new low-cost genome assembler that copes well with large genomes and low coverage. It is based on a novel approach for reducing the overlap graph to a collection of paths, thus opening new avenues for future improvements. AVAILABILITY: The LazyB prototype is available at https://github.com/TGatter/LazyB.
Affiliation(s)
- Thomas Gatter
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany.
| | - Sarah von Löhneysen
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
| | - Jörg Fallmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
| | - Polina Drozdova
- Institute of Biology, Irkutsk State University, RU-664003, Irkutsk, Russia
| | - Tom Hartmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany
| | - Peter F Stadler
- Biology Department, Universidad Nacional de Colombia, Carrera 45 # 26-85, Edif. Uriel Gutiérrez, Bogotá, D.C, Colombia.
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107, Leipzig, Germany.
- Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany.
- Department of Theoretical Chemistry, University of Vienna, Währinger Straße 17, 1090, Vienna, Austria.
- Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA.
| |
|
38
|
Ser HL, Letchumanan V, Goh BH, Wong SH, Lee LH. The Use of Fecal Microbiome Transplant in Treating Human Diseases: Too Early for Poop? Front Microbiol 2021; 12:519836. [PMID: 34054740 PMCID: PMC8155486 DOI: 10.3389/fmicb.2021.519836] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/13/2019] [Accepted: 04/07/2021] [Indexed: 12/15/2022] Open
Abstract
Fecal microbiome transplant (FMT) has gained popularity over the past few years, given its success in treating several gastrointestinal diseases. At the same time, microbial populations in the gut have been shown to have more physiological effects than we expected as "habitants" of the gut. The imbalance in the gut microbiome or dysbiosis, particularly when there are excessive harmful pathogens, can trigger not just infections but can also result in the development of common diseases, such as cancer and cardiometabolic diseases. By using FMT technology, the dysbiosis of the gut microbiome in patients can be resolved by administering fecal materials from a healthy donor. The current review summarizes the history and current uses of FMT before suggesting potential ideas for its high-quality application in clinical settings.
Affiliation(s)
- Hooi-Leng Ser
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Bandar Sunway, Malaysia
| | - Vengadesh Letchumanan
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Bandar Sunway, Malaysia
| | - Bey-Hing Goh
- Biofunctional Molecule Exploratory Research Group, School of Pharmacy, Monash University Malaysia, Bandar Sunway, Malaysia
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Sunny Hei Wong
- Department of Medicine and Therapeutics, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Sha Tin, Hong Kong
| | - Learn-Han Lee
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Bandar Sunway, Malaysia
| |
|
39
|
Clifton BD, Jimenez J, Kimura A, Chahine Z, Librado P, Sánchez-Gracia A, Abbassi M, Carranza F, Chan C, Marchetti M, Zhang W, Shi M, Vu C, Yeh S, Fanti L, Xia XQ, Rozas J, Ranz JM. Understanding the Early Evolutionary Stages of a Tandem Drosophila melanogaster-Specific Gene Family: A Structural and Functional Population Study. Mol Biol Evol 2021; 37:2584-2600. [PMID: 32359138 PMCID: PMC7475035 DOI: 10.1093/molbev/msaa109] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/22/2022] Open
Abstract
Gene families underlie genetic innovation and phenotypic diversification. However, our understanding of the early genomic and functional evolution of tandemly arranged gene families remains incomplete as paralog sequence similarity hinders their accurate characterization. The Drosophila melanogaster-specific gene family Sdic is tandemly repeated and impacts sperm competition. We scrutinized Sdic in 20 geographically diverse populations using reference-quality genome assemblies, read-depth methodologies, and qPCR, finding that ∼90% of the individuals harbor 3-7 copies as well as evidence of population differentiation. In strains with reliable gene annotations, copy number variation (CNV) and differential transposable element insertions distinguish one structurally distinct version of the Sdic region per strain. All 31 annotated copies featured protein-coding potential and, based on the protein variant encoded, were categorized into 13 paratypes differing in their 3' ends, with 3-5 paratypes coexisting in any strain examined. Despite widespread gene conversion, the only copy present in all strains has functionally diverged at both coding and regulatory levels under positive selection. Contrary to artificial tandem duplications of the Sdic region that resulted in increased male expression, CNV in cosmopolitan strains did not correlate with expression levels, likely as a result of differential genome modifier composition. Duplicating the region did not enhance sperm competitiveness, suggesting a fitness cost at high expression levels or a plateau effect. Beyond facilitating a minimally optimal expression level, Sdic CNV acts as a catalyst of protein and regulatory diversity, showcasing a possible evolutionary path recently formed tandem multigene families can follow toward long-term consolidation in eukaryotic genomes.
Affiliation(s)
- Bryan D Clifton
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Jamie Jimenez
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Ashlyn Kimura
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Zeinab Chahine
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Pablo Librado
- Laboratoire AMIS CNRS UMR 5288, Faculté de Médicine de Purpan, Université Paul Sabatier, Toulouse, France
| | - Alejandro Sánchez-Gracia
- Departament de Genètica, Microbiologia i Estadistica, Universitat de Barcelona, Barcelona, Spain.,Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Barcelona, Spain
| | - Mashya Abbassi
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Francisco Carranza
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Carolus Chan
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Marcella Marchetti
- Istituto Pasteur Italia, Fondazione Cenci-Bolognetti, Rome, Italy.,Department of Biology and Biotechnology "C. Darwin", Sapienza University of Rome, Rome, Italy
| | - Wanting Zhang
- Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Mijuan Shi
- Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Christine Vu
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| | - Shudan Yeh
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA.,Department of Life Sciences, National Central University, Taoyuan City, Zhongli District, Taiwan
| | - Laura Fanti
- Istituto Pasteur Italia, Fondazione Cenci-Bolognetti, Rome, Italy.,Department of Biology and Biotechnology "C. Darwin", Sapienza University of Rome, Rome, Italy
| | - Xiao-Qin Xia
- Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Julio Rozas
- Departament de Genètica, Microbiologia i Estadistica, Universitat de Barcelona, Barcelona, Spain.,Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Barcelona, Spain
| | - José M Ranz
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA
| |
|
40
|
Tran Van P, Anselmetti Y, Bast J, Dumas Z, Galtier N, Jaron KS, Martens K, Parker DJ, Robinson-Rechavi M, Schwander T, Simion P, Schön I. First annotated draft genomes of nonmarine ostracods (Ostracoda, Crustacea) with different reproductive modes. G3 (BETHESDA, MD.) 2021; 11:jkab043. [PMID: 33591306 PMCID: PMC8049415 DOI: 10.1093/g3journal/jkab043] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/01/2020] [Accepted: 02/05/2021] [Indexed: 11/14/2022]
Abstract
Ostracods are one of the oldest crustacean groups with an excellent fossil record and high importance for phylogenetic analyses but genome resources for this class are still lacking. We have successfully assembled and annotated the first reference genomes for three species of nonmarine ostracods; two with obligate sexual reproduction (Cyprideis torosa and Notodromas monacha) and the putative ancient asexual Darwinula stevensoni. This kind of genomic research has so far been impeded by the small size of most ostracods and the absence of genetic resources such as linkage maps or BAC libraries that were available for other crustaceans. For genome assembly, we used an Illumina-based sequencing technology, resulting in assemblies of similar sizes for the three species (335-382 Mb) and with scaffold numbers and their N50 (19-56 kb) in the same orders of magnitude. Gene annotations were guided by transcriptome data from each species. The three assemblies are relatively complete with BUSCO scores of 92-96. The number of predicted genes (13,771-17,776) is in the same range as Branchiopoda genomes but lower than in most malacostracan genomes. These three reference genomes from nonmarine ostracods provide the urgently needed basis to further develop ostracods as models for evolutionary and ecological research.
Affiliation(s)
- Patrick Tran Van
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Yoann Anselmetti
- ISEM—Institut des Sciences de l’Evolution, Montpellier 34090, France
| | - Jens Bast
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
| | - Zoé Dumas
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
| | - Nicolas Galtier
- ISEM—Institut des Sciences de l’Evolution, Montpellier 34090, France
| | - Kamil S Jaron
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
| | - Koen Martens
- Royal Belgian Institute of Natural Sciences, OD Nature, Freshwater Biology, Brussels 1000, Belgium
- Department of Biology, University of Ghent, Ghent 9000, Belgium
| | - Darren J Parker
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Tanja Schwander
- Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
| | - Paul Simion
- ISEM—Institut des Sciences de l’Evolution, Montpellier 34090, France
- Université de Namur, LEGE, URBE, Namur 5000, Belgium
| | - Isa Schön
- Royal Belgian Institute of Natural Sciences, OD Nature, Freshwater Biology, Brussels 1000, Belgium
- University of Hasselt, Research Group Zoology, Diepenbeek 3590, Belgium
| |
|
41
|
Vogt G. Epigenetic variation in animal populations: Sources, extent, phenotypic implications, and ecological and evolutionary relevance. J Biosci 2021. [DOI: 10.1007/s12038-021-00138-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
|
42
|
Chakraborty M, Chang CH, Khost DE, Vedanayagam J, Adrion JR, Liao Y, Montooth KL, Meiklejohn CD, Larracuente AM, Emerson JJ. Evolution of genome structure in the Drosophila simulans species complex. Genome Res 2021; 31:380-396. [PMID: 33563718 PMCID: PMC7919458 DOI: 10.1101/gr.263442.120] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 03/12/2020] [Accepted: 12/28/2020] [Indexed: 12/25/2022]
Abstract
The rapid evolution of repetitive DNA sequences, including satellite DNA, tandem duplications, and transposable elements, underlies phenotypic evolution and contributes to hybrid incompatibilities between species. However, repetitive genomic regions are fragmented and misassembled in most contemporary genome assemblies. We generated highly contiguous de novo reference genomes for the Drosophila simulans species complex (D. simulans, D. mauritiana, and D. sechellia), which speciated ∼250,000 yr ago. Our assemblies are comparable in contiguity and accuracy to the current D. melanogaster genome, allowing us to directly compare repetitive sequences between these four species. We find that at least 15% of the D. simulans complex species genomes fail to align uniquely to D. melanogaster owing to structural divergence, twice the number of single-nucleotide substitutions. We also find rapid turnover of satellite DNA and extensive structural divergence in heterochromatic regions, whereas the euchromatic gene content is mostly conserved. Despite the overall preservation of gene synteny, euchromatin in each species has been shaped by clade- and species-specific inversions, transposable elements, expansions and contractions of satellite and tRNA tandem arrays, and gene duplications. We also find rapid divergence among Y-linked genes, including copy number variation and recent gene duplications from autosomes. Our assemblies provide a valuable resource for studying genome evolution and its consequences for phenotypic evolution in these genetic model species.
Affiliation(s)
- Mahul Chakraborty
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
| | - Ching-Ho Chang
- Department of Biology, University of Rochester, Rochester, New York 14627, USA
| | - Danielle E Khost
- Department of Biology, University of Rochester, Rochester, New York 14627, USA
- FAS Informatics and Scientific Applications, Harvard University, Cambridge, Massachusetts 02138, USA
| | - Jeffrey Vedanayagam
- Department of Developmental Biology, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, USA
| | - Jeffrey R Adrion
- Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403, USA
| | - Yi Liao
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
| | - Kristi L Montooth
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, Nebraska 68502, USA
| | - Colin D Meiklejohn
- School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, Nebraska 68502, USA
| | | | - J J Emerson
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California 92697, USA
| |
|
43
|
Chakraborty M, Ramaiah A, Adolfi A, Halas P, Kaduskar B, Ngo LT, Jayaprasad S, Paul K, Whadgar S, Srinivasan S, Subramani S, Bier E, James AA, Emerson JJ. Hidden genomic features of an invasive malaria vector, Anopheles stephensi, revealed by a chromosome-level genome assembly. BMC Biol 2021; 19:28. [PMID: 33568145 PMCID: PMC7876825 DOI: 10.1186/s12915-021-00963-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/17/2020] [Accepted: 01/19/2021] [Indexed: 12/27/2022]
Abstract
BACKGROUND: The mosquito Anopheles stephensi is a vector of urban malaria in Asia that recently invaded Africa. Studying the genetic basis of vectorial capacity and engineering genetic interventions are both impeded by limitations of a vector's genome assembly. The existing assemblies of An. stephensi are draft-quality and contain thousands of sequence gaps, potentially missing genetic elements important for its biology and evolution. RESULTS: To access previously intractable genomic regions, we generated a reference-grade genome assembly and full transcript annotations that achieve a new standard for reference genomes of disease vectors. Here, we report novel species-specific transposable element (TE) families and insertions in functional genetic elements, demonstrating the widespread role of TEs in genome evolution and phenotypic variation. We discovered 29 previously hidden members of insecticide resistance genes, uncovering new candidate genetic elements for the widespread insecticide resistance observed in An. stephensi. We identified 2.4 Mb of the Y chromosome and seven new male-linked gene candidates, representing the most extensive coverage of the Y chromosome in any mosquito. By tracking full-length mRNA for >15 days following blood feeding, we discover distinct roles of previously uncharacterized genes in blood metabolism and female reproduction. The Y-linked heterochromatin landscape reveals extensive accumulation of long-terminal repeat retrotransposons throughout the evolution and degeneration of this chromosome. Finally, we identify a novel Y-linked putative transcription factor that is expressed constitutively throughout male development and adulthood, suggesting an important role. CONCLUSION: Collectively, these results and resources underscore the significance of previously hidden genomic elements in the biology of malaria mosquitoes and will accelerate the development of genetic control strategies of malaria transmission.
Affiliation(s)
- Mahul Chakraborty
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
| | - Arunachalam Ramaiah
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
- Section of Cell and Developmental Biology, University of California, San Diego, La Jolla, CA, 92093-0335, USA
- Tata Institute for Genetics and Society, Center at inStem, Bangalore, Karnataka, 560065, India
| | - Adriana Adolfi
- Department of Microbiology & Molecular Genetics, University of California, Irvine, CA, 92697, USA
| | - Paige Halas
- Department of Microbiology & Molecular Genetics, University of California, Irvine, CA, 92697, USA
| | - Bhagyashree Kaduskar
- Section of Cell and Developmental Biology, University of California, San Diego, La Jolla, CA, 92093-0335, USA
- Tata Institute for Genetics and Society, Center at inStem, Bangalore, Karnataka, 560065, India
| | - Luna Thanh Ngo
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
| | - Suvratha Jayaprasad
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, KA, 560100, India
| | - Kiran Paul
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, KA, 560100, India
| | - Saurabh Whadgar
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, KA, 560100, India
| | - Subhashini Srinivasan
- Tata Institute for Genetics and Society, Center at inStem, Bangalore, Karnataka, 560065, India
- Institute of Bioinformatics and Applied Biotechnology, Bangalore, KA, 560100, India
| | - Suresh Subramani
- Tata Institute for Genetics and Society, Center at inStem, Bangalore, Karnataka, 560065, India
- Section of Molecular Biology, University of California, San Diego, La Jolla, CA, 92093-0322, USA
- Tata Institute for Genetics and Society, University of California, San Diego, La Jolla, CA, 92093-0335, USA
| | - Ethan Bier
- Section of Cell and Developmental Biology, University of California, San Diego, La Jolla, CA, 92093-0335, USA
- Tata Institute for Genetics and Society, University of California, San Diego, La Jolla, CA, 92093-0335, USA
| | - Anthony A James
- Department of Microbiology & Molecular Genetics, University of California, Irvine, CA, 92697, USA
- Tata Institute for Genetics and Society, University of California, San Diego, La Jolla, CA, 92093-0335, USA
- Department of Molecular Biology & Biochemistry, University of California, Irvine, CA, 92697, USA
| | - J J Emerson
- Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
- Center for Complex Biological Systems, University of California, Irvine, CA, 92697, USA
| |
|
44
|
Eisfeldt J, Pettersson M, Petri A, Nilsson D, Feuk L, Lindstrand A. Hybrid sequencing resolves two germline ultra-complex chromosomal rearrangements consisting of 137 breakpoint junctions in a single carrier. Hum Genet 2020; 140:775-790. [PMID: 33315133 PMCID: PMC8052244 DOI: 10.1007/s00439-020-02242-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/17/2020] [Accepted: 11/18/2020] [Indexed: 12/16/2022]
Abstract
Chromoanagenesis is a genomic event responsible for the formation of complex structural chromosomal rearrangements (CCRs). Germline chromoanagenesis is rare and the majority of reported cases are associated with an affected phenotype. Here, we report a healthy female carrying two de novo CCRs involving chromosomes 4, 19, 21 and X and chromosomes 7 and 11, respectively, with a total of 137 breakpoint junctions (BPJs). We characterized the CCRs using a hybrid-sequencing approach, combining short-read sequencing, nanopore sequencing, and optical mapping. The results were validated using multiple cytogenetic methods, including fluorescence in situ hybridization, spectral karyotyping, and Sanger sequencing. We identified 137 BPJs, which to our knowledge is the highest number of reported breakpoint junctions in germline chromoanagenesis. We also performed a statistical assessment of the positioning of the breakpoints, revealing a significant enrichment of BPJ-affecting genes (96 intragenic BPJs, 26 genes, p < 0.0001), indicating that the CCRs formed during active transcription of these genes. In addition, we find that the DNA fragments are unevenly and non-randomly distributed across the derivative chromosomes, indicating a multistep process of scattering and re-joining of DNA fragments. In summary, we report a new maximum number of BPJs (137) in germline chromoanagenesis. We also show that a hybrid sequencing approach is necessary for the correct characterization of complex CCRs. Through in-depth statistical assessment, we found that the CCRs were most likely formed through an event resembling chromoplexy, a catastrophic event caused by erroneous transcription factor binding.
Affiliation(s)
- Jesper Eisfeldt
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Karolinska University Hospital Solna, 171 76, Stockholm, Sweden.,Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden.,Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
| | - Maria Pettersson
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Karolinska University Hospital Solna, 171 76, Stockholm, Sweden.,Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| | - Anna Petri
- Science for Life Laboratory Uppsala, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Daniel Nilsson
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Karolinska University Hospital Solna, 171 76, Stockholm, Sweden.,Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden.,Science for Life Laboratory, Karolinska Institutet Science Park, Solna, Sweden
| | - Lars Feuk
- Science for Life Laboratory Uppsala, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Anna Lindstrand
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Karolinska University Hospital Solna, 171 76, Stockholm, Sweden.,Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
| |
|
45
|
Shaikh KM, Kumar P, Nesamma AA, Abdin MZ, Jutur PP. Hybrid genome assembly and functional annotation reveals insights on lipid biosynthesis of oleaginous native isolate Parachlorella kessleri, a potential industrial strain for production of biofuel precursors. ALGAL RES 2020. [DOI: 10.1016/j.algal.2020.102118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
|
46
|
The Beginning of the End: A Chromosomal Assembly of the New World Malaria Mosquito Ends with a Novel Telomere. G3-GENES GENOMES GENETICS 2020; 10:3811-3819. [PMID: 32883756 PMCID: PMC7534423 DOI: 10.1534/g3.120.401654] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 01/14/2023]
Abstract
Chromosome level assemblies are accumulating in various taxonomic groups including mosquitoes. However, even in the few reference-quality mosquito assemblies, a significant portion of the heterochromatic regions including telomeres remain unresolved. Here we produce a de novo assembly of the New World malaria mosquito, Anopheles albimanus by integrating Oxford Nanopore sequencing, Illumina, Hi-C and optical mapping. This 172.6 Mbps female assembly, which we call AalbS3, is obtained by scaffolding polished large contigs (contig N50 = 13.7 Mbps) into three chromosomes. All chromosome arms end with telomeric repeats, which is the first in mosquito assemblies and represents a significant step toward the completion of a genome assembly. These telomeres consist of tandem repeats of a novel 30-32 bp Telomeric Repeat Unit (TRU) and are confirmed by analyzing the termini of long reads and through both chromosomal in situ hybridization and a Bal31 sensitivity assay. The AalbS3 assembly included previously uncharacterized centromeric and rDNA clusters and more than doubled the content of transposable elements and other repetitive sequences. This telomere-to-telomere assembly, although still containing gaps, represents a significant step toward resolving biologically important but previously hidden genomic components. The comparison of different scaffolding methods will also inform future efforts to obtain reference-quality genomes for other mosquito species.
|
47
|
Sielemann K, Hafner A, Pucker B. The reuse of public datasets in the life sciences: potential risks and rewards. PeerJ 2020; 8:e9954. [PMID: 33024631 PMCID: PMC7518187 DOI: 10.7717/peerj.9954] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022]
Abstract
The 'big data' revolution has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the associated challenges, limitations and risks. Possible solutions to issues and research integrity considerations are also discussed. Due to the prominence, abundance and wide distribution of sequencing data, we focus on the reuse of publicly available sequence datasets. We define 'successful reuse' as the use of previously published data to enable novel scientific findings. By using selected examples of successful reuse from different disciplines, we illustrate the enormous potential of the practice, while acknowledging the respective limitations and risks. A checklist to determine the reuse value and potential of a particular dataset is also provided. The open discussion of data reuse and the establishment of this practice as a norm has the potential to benefit all stakeholders in the life sciences.
Affiliation(s)
- Katharina Sielemann
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, Germany
| | - Alenka Hafner
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Current Affiliation: Intercollege Graduate Degree Program in Plant Biology, Penn State University, University Park, State College, PA, United States of America
| | - Boas Pucker
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
| |
|
48
|
Bogaerts-Márquez M, Barrón MG, Fiston-Lavier AS, Vendrell-Mir P, Castanera R, Casacuberta JM, González J. T-lex3: an accurate tool to genotype and estimate population frequencies of transposable elements using the latest short-read whole genome sequencing data. Bioinformatics 2020; 36:1191-1197. [PMID: 31580402 PMCID: PMC7703783 DOI: 10.1093/bioinformatics/btz727] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/28/2019] [Revised: 09/16/2019] [Accepted: 09/25/2019] [Indexed: 12/22/2022]
Abstract
Motivation: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among individuals, populations and species. Results: In this work, we present a new version of T-lex, a computational pipeline that accurately genotypes and estimates the population frequencies of reference TE insertions using short-read high-throughput sequencing data. In this new version, we have re-designed the T-lex algorithm to integrate the BWA-MEM short-read aligner, which is one of the most accurate short-read mappers and can be launched on longer short-reads (e.g. reads >150 bp). We have added new filtering steps to increase the accuracy of the genotyping, and new parameters that allow the user to control both the minimum and maximum number of reads, and the minimum number of strains to genotype a TE insertion. We also showed for the first time that T-lex3 provides accurate TE calls in a plant genome. Availability and implementation: To test the accuracy of T-lex3, we called 1630 individual TE insertions in Drosophila melanogaster, 1600 individual TE insertions in humans, and 3067 individual TE insertions in the rice genome. We showed that this new version of T-lex is a broadly applicable and accurate tool for genotyping and estimating TE frequencies in organisms with different genome sizes and different TE contents. T-lex3 is available at GitHub: https://github.com/GonzalezLab/T-lex3. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- María Bogaerts-Márquez
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
| | - Maite G Barrón
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
| | - Anna-Sophie Fiston-Lavier
- Institut des Sciences de l'Evolution de Montpellier (UMR 5554, CNRS-UM-IRD-EPHE), 11 Université de Montpellier, Place Eugène Bataillon, Montpellier, France
| | - Pol Vendrell-Mir
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Raúl Castanera
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Josep M Casacuberta
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Josefa González
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
| |
|
49
|
Mohamed M, Dang NTM, Ogyama Y, Burlet N, Mugat B, Boulesteix M, Mérel V, Veber P, Salces-Ortiz J, Severac D, Pélisson A, Vieira C, Sabot F, Fablet M, Chambeyron S. A Transposon Story: From TE Content to TE Dynamic Invasion of Drosophila Genomes Using the Single-Molecule Sequencing Technology from Oxford Nanopore. Cells 2020; 9:E1776. [PMID: 32722451 PMCID: PMC7465170 DOI: 10.3390/cells9081776] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/27/2020] [Revised: 07/17/2020] [Accepted: 07/23/2020] [Indexed: 11/17/2022]
Abstract
Transposable elements (TEs) are the main components of genomes. However, due to their repetitive nature, they are very difficult to study using data obtained with short-read sequencing technologies. Here, we describe an efficient pipeline to accurately recover TE insertion (TEI) sites and sequences from long reads obtained by Oxford Nanopore Technology (ONT) sequencing. With this pipeline, we could precisely describe the landscapes of the most recent TEIs in wild-type strains of Drosophila melanogaster and Drosophila simulans. Their comparison suggests that this subset of TE sequences is more similar than previously thought in these two species. The chromosome assemblies obtained using this pipeline also allowed recovering piRNA cluster sequences, which was impossible using short-read sequencing. Finally, we used our pipeline to analyze ONT sequencing data from a D. melanogaster unstable line in which LTR transposition was derepressed for 73 successive generations. We could rely on single reads to identify new insertions with intact target site duplications. Moreover, the detailed analysis of TEIs in the wild-type strains and the unstable line did not support the trap model claiming that piRNA clusters are hotspots of TE insertions.
Affiliation(s)
- Mourdas Mohamed
- Institute of Human Genetics, UMR9002, CNRS and Montpellier University, 34396 Montpellier, France
| | - Nguyet Thi-Minh Dang
- IRD/UM UMR DIADE, 911 avenue Agropolis BP64501, 34394 Montpellier, France
| | - Yuki Ogyama
- Institute of Human Genetics, UMR9002, CNRS and Montpellier University, 34396 Montpellier, France
| | - Nelly Burlet
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - Bruno Mugat
- Institute of Human Genetics, UMR9002, CNRS and Montpellier University, 34396 Montpellier, France
| | - Matthieu Boulesteix
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - Vincent Mérel
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - Philippe Veber
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - Judit Salces-Ortiz
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
- Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Dany Severac
- MGX-Montpellier GenomiX, c/o Institut de Génomique Fonctionnelle, CNRS, INSERM, Université de Montpellier, 34094 Montpellier, France
| | - Alain Pélisson
- Institute of Human Genetics, UMR9002, CNRS and Montpellier University, 34396 Montpellier, France
| | - Cristina Vieira
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - François Sabot
- IRD/UM UMR DIADE, 911 avenue Agropolis BP64501, 34394 Montpellier, France
| | - Marie Fablet
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, 69622 Villeurbanne, France
| | - Séverine Chambeyron
- Institute of Human Genetics, UMR9002, CNRS and Montpellier University, 34396 Montpellier, France
| |
|
50
|
Linder RA, Majumder A, Chakraborty M, Long A. Two Synthetic 18-Way Outcrossed Populations of Diploid Budding Yeast with Utility for Complex Trait Dissection. Genetics 2020; 215:323-342. [PMID: 32241804 PMCID: PMC7268983 DOI: 10.1534/genetics.120.303202] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/06/2020] [Accepted: 03/31/2020] [Indexed: 02/07/2023]
Abstract
Advanced-generation multiparent populations (MPPs) are a valuable tool for dissecting complex traits, having more power than genome-wide association studies to detect rare variants and higher resolution than F2 linkage mapping. To extend the advantages of MPPs in budding yeast, we describe the creation and characterization of two outbred MPPs derived from 18 genetically diverse founding strains. We carried out de novo assemblies of the genomes of the 18 founder strains, such that virtually all variation segregating between these strains is known, and represented those assemblies as Santa Cruz Genome Browser tracks. We discovered complex patterns of structural variation segregating among the founders, including a large deletion within the vacuolar ATPase VMA1, several different deletions within the osmosensor MSB2, a series of deletions and insertions at PRM7 and the adjacent BSC1, as well as copy number variation at the dehydrogenase ALD2. Resequenced haploid recombinant clones from the two MPPs have a median unrecombined block size of 66 kb, demonstrating that the population is highly recombined. We pool-sequenced the two MPPs to 3270× and 2226× coverage and demonstrated that we can accurately estimate local haplotype frequencies using pooled data. We further downsampled the pool-sequenced data to ∼20-40× and showed that local haplotype frequency estimates remained accurate, with median error rates of 0.8% and 0.6% at 20× and 40×, respectively. Haplotype frequencies are estimated much more accurately than SNP frequencies obtained directly from the same data. Deep sequencing of the two populations revealed that 10 or more founders are present at a detectable frequency for >98% of the genome, validating the utility of this resource for the exploration of the role of standing variation in the architecture of complex traits.
Affiliation(s)
- Robert A Linder
- Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
| | - Arundhati Majumder
- Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
| | - Mahul Chakraborty
- Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
| | - Anthony Long
- Department of Ecology and Evolutionary Biology, School of Biological Sciences, University of California, Irvine, California 92697-2525
| |
|