1
Nie Y, Liu X, Zhao L, Huang Y. Repetitive element expansions contribute to genome size gigantism in Pamphagidae: A comparative study (Orthoptera, Acridoidea). Genomics 2024; 116:110896. [PMID: 39025318 DOI: 10.1016/j.ygeno.2024.110896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 07/10/2024] [Accepted: 07/15/2024] [Indexed: 07/20/2024]
Abstract
Pamphagidae is a family of Acridoidea that inhabits the desert steppes of Eurasia and Africa. This study employed flow cytometry to estimate the genome size of eight species in the Pamphagidae. The results indicate that the genome size of the eight species ranged from 13.88 pg to 14.66 pg, with an average of 14.26 pg. This is the largest average genome size recorded for the Orthoptera families, as well as for the entire Insecta. Furthermore, the study explored the role of repetitive sequences in the genome, including their evolutionary dynamics and activity, using low-coverage next-generation sequencing data. The genome is composed of 14 different types of repetitive sequences, which collectively make up between 59.9% and 68.17% of the total genome. The Pamphagidae family displays high levels of transposable element (TE) activity, with the number of TEs increasing and accumulating since the family's emergence. The study found that the types of repetitive sequences contributing to the TE outburst events are similar across species. Additionally, the study identified unique repetitive elements for each species. The differences in repetitive sequences among the eight Pamphagidae species correspond to their phylogenetic relationships. The study sheds new light on genome gigantism in the Pamphagidae and provides insight into the correlation between genome size and repetitive sequences within the family.
Affiliation(s)
- Yimeng Nie
  - College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Xuanzeng Liu
  - College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Lina Zhao
  - College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Yuan Huang
  - College of Life Sciences, Shaanxi Normal University, Xi'an, China.
2
Takvorian N, Zangui H, Naino Jika AK, Alouane A, Siljak-Yakovlev S. Genome Size Variation in Sesamum indicum L. Germplasm from Niger. Genes (Basel) 2024; 15:711. [PMID: 38927647 PMCID: PMC11203198 DOI: 10.3390/genes15060711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 05/17/2024] [Accepted: 05/24/2024] [Indexed: 06/28/2024] Open
Abstract
Sesamum indicum L. (Pedaliaceae) is one of the most economically important oil crops in the world, thanks to the high oil content of its seeds and its nutritional value. It is cultivated all over the world, mainly in Asia and Africa. Well adapted to arid environments, sesame offers a good opportunity as an alternative subsistence crop for farmers in Africa, particularly Niger, to cope with climate change. For the first time, the variation in genome size among 75 accessions of the Nigerien germplasm was studied. The sample was collected throughout Niger, revealing various morphological, biochemical and phenological traits. For comparison, an additional accession from Thailand was evaluated as an available Asian representative. In the Niger sample, the 2C DNA value ranged from 0.77 to 1 pg (753 to 978 Mbp), with an average of 0.85 ± 0.037 pg (831 Mbp). Statistical analysis showed a significant difference in 2C DNA values among 58 pairs of Niger accessions (p-value < 0.05). This significant variation indicates the likely genetic diversity of sesame germplasm, offering valuable insights into its possible potential for climate-resilient agriculture. Our results therefore raise a fundamental question: is intraspecific variability in the genome size of Nigerien sesame correlated with specific morphological and physiological traits?
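The paired pg and Mbp values in this abstract are consistent with the widely used conversion of 1 pg of DNA to about 978 Mbp. A minimal sketch of that arithmetic follows; the constant and the function name `pg_to_mbp` are illustrative assumptions, not taken from the paper.

```python
# Assumed conversion factor: 1 pg of DNA is about 978 Mbp.
PG_TO_MBP = 978.0


def pg_to_mbp(pg: float) -> float:
    """Convert a DNA mass in picograms to megabase pairs."""
    return pg * PG_TO_MBP


# 2C values quoted in the abstract (minimum, maximum, mean).
for label, pg in [("min", 0.77), ("max", 1.0), ("mean", 0.85)]:
    print(f"{label}: {pg} pg is about {round(pg_to_mbp(pg))} Mbp")
```

Rounded to the nearest Mbp, the three values come out as 753, 978 and 831 Mbp, matching the figures given in the abstract.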
Affiliation(s)
- Najat Takvorian
  - Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique Evolution, 91190 Gif-sur-Yvette, France
  - Sorbonne Université, UFR Sciences de la Vie, UFR927, 4 Place Jussieu, F-75005 Paris Cedex 05, France
- Hamissou Zangui
  - Department of Plant Production, Abdou Moumouni University, BP-10960 Niamey, Niger
- Abdel Kader Naino Jika
  - Department of Plant Production, Abdou Moumouni University, BP-10960 Niamey, Niger
- Aïda Alouane
  - Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique Evolution, 91190 Gif-sur-Yvette, France
  - Sorbonne Université, UFR Sciences de la Vie, UFR927, 4 Place Jussieu, F-75005 Paris Cedex 05, France
- Sonja Siljak-Yakovlev
  - Université Paris-Saclay, CNRS, AgroParisTech, Ecologie Systématique Evolution, 91190 Gif-sur-Yvette, France
3
Dayi M. Diversity and evolution of transposable elements in the plant-parasitic nematodes. BMC Genomics 2024; 25:511. [PMID: 38783171 PMCID: PMC11118728 DOI: 10.1186/s12864-024-10435-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 05/21/2024] [Indexed: 05/25/2024] Open
Abstract
BACKGROUND Transposable elements (TEs) are mobile DNA sequences that propagate within genomes, occupying a significant portion of eukaryotic genomes and serving as a source of genetic variation and innovation. TEs can impact genome dynamics through their repetitive nature and mobility. Nematodes are incredibly versatile organisms, capable of thriving in a wide range of environments. The plant-parasitic nematodes are able to infect nearly all vascular plants, leading to significant crop losses and management expenses worldwide. It is worth noting that plant parasitism has evolved independently at least three times within this nematode group. Furthermore, the genome size of plant-parasitic nematodes can vary substantially, spanning from 41.5 Mbp to 235 Mbp. To investigate genome size variation and evolution in plant-parasitic nematodes, TE composition, diversity, and evolution were analysed in 26 plant-parasitic nematodes from 9 distinct genera in Clade IV. RESULTS Interestingly, despite certain species lacking specific types of DNA transposons or retrotransposon superfamilies, they still exhibit a diverse range of TE content. Identification of species-specific TE repertoire in nematode genomes provides a deeper understanding of genome evolution in plant-parasitic nematodes. An intriguing observation is that plant-parasitic nematodes possess extensive DNA transposons and retrotransposon insertions, including recent sightings of LTR/Gypsy and LTR/Pao superfamilies. Among them, the Gypsy superfamilies were found to encode Aspartic proteases in the plant-parasitic nematodes. CONCLUSIONS The study of the transposable element (TE) composition in plant-parasitic nematodes has yielded insightful discoveries. The findings revealed that certain species exhibit lineage-specific variations in their TE makeup. Discovering the species-specific TE repertoire in nematode genomes is a crucial element in understanding the evolution of genomes in plant-parasitic nematodes. 
It allows us to gain a deeper insight into the intricate workings of these organisms and their genetic makeup. With this knowledge, we are gaining a fundamental piece in the puzzle of understanding the evolution of these parasites. Moreover, recent transpositions have led to the acquisition of new TE superfamilies, especially Gypsy and Pao retrotransposons, further expanding the diversity of TEs in these nematodes. Significantly, the widely distributed Gypsy superfamily possesses proteases that are exclusively associated with parasitism during nematode-host interactions. These discoveries provide a deeper understanding of the TE landscape within plant-parasitic nematodes.
Affiliation(s)
- Mehmet Dayi
  - Forestry Vocational School, Düzce University, Konuralp Campus, 81620, Düzce, Türkiye.
  - Faculty of Medicine, University of Miyazaki, Miyazaki, Japan.
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, 277-8562, Japan.
4
Forest T, Achaz G, Marbouty M, Bignaud A, Thierry A, Koszul R, Milhes M, Lledo J, Pons JM, Fuchs J. Chromosome-level genome assembly of the European green woodpecker Picus viridis. G3 (Bethesda) 2024; 14:jkae042. [PMID: 38537260 PMCID: PMC11075563 DOI: 10.1093/g3journal/jkae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Accepted: 02/15/2024] [Indexed: 05/08/2024]
Abstract
The European green woodpecker, Picus viridis, is a widely distributed species found in the Western Palearctic region. Here, we assembled a highly contiguous genome assembly for this species using a combination of short- and long-read sequencing and scaffolded with chromatin conformation capture (Hi-C). The final genome assembly was 1.28 Gb and features a scaffold N50 of 37 Mb and a scaffold L50 of 39.165 Mb. The assembly incorporates 89.4% of the genes identified in birds in OrthoDB. Gene and repetitive content annotation on the assembly detected 15,805 genes and a ∼30.1% occurrence of repetitive elements, respectively. Analysis of synteny demonstrates the fragmented nature of the P. viridis genome when compared to the chicken (Gallus gallus). The assembly and annotations produced in this study will certainly help for further research into the genomics of P. viridis and the comparative evolution of woodpeckers. Five historical and seven contemporary samples have been resequenced and may give insights on the population history of this species.
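For readers unfamiliar with the contiguity statistics quoted above, the sketch below computes them under their conventional definitions: N50 is the length of the scaffold at which the running total, taken from longest to shortest, first reaches half the assembly size, and L50 is the number of scaffolds needed to get there. The function name `n50_l50` and the toy lengths are hypothetical and are not data from this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half = sum(ordered) / 2                  # half of the total assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # N50 is this scaffold's length; L50 is how many scaffolds were used.
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths in Mb (total 150, half 75).
n50, l50 = n50_l50([50, 40, 30, 20, 10])
print(n50, l50)
```

With these toy lengths the two longest scaffolds (50 + 40 = 90) already exceed half the total, so N50 is 40 and L50 is 2.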
Affiliation(s)
- Thomas Forest
  - Éco-anthropologie, Muséum national d’Histoire naturelle, CNRS UMR 7206, 75005 Paris, France
  - CIRB, Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
  - Institut de Systématique Evolution Biodiversité, Muséum national d’Histoire naturelle CNRS SU EPHE UA, CP 51, 75005 Paris, France
- Guillaume Achaz
  - CIRB, Collège de France, Université PSL, CNRS, INSERM, 75005 Paris, France
  - Université Paris-Cité, 75006 Paris, France
- Martial Marbouty
  - Institut Pasteur, CNRS UMR 3525, Université Paris Cité, Unité Régulation Spatiale des Génomes, 75015 Paris, France
- Amaury Bignaud
  - Institut Pasteur, CNRS UMR 3525, Université Paris Cité, Unité Régulation Spatiale des Génomes, 75015 Paris, France
- Agnès Thierry
  - Institut Pasteur, CNRS UMR 3525, Université Paris Cité, Unité Régulation Spatiale des Génomes, 75015 Paris, France
- Romain Koszul
  - Institut Pasteur, CNRS UMR 3525, Université Paris Cité, Unité Régulation Spatiale des Génomes, 75015 Paris, France
- Marine Milhes
  - PlaGe, INRAE, Genotoul, 31320 Castanet-Tolosan, France
- Joanna Lledo
  - PlaGe, INRAE, Genotoul, 31320 Castanet-Tolosan, France
- Jean-Marc Pons
  - Institut de Systématique Evolution Biodiversité, Muséum national d’Histoire naturelle CNRS SU EPHE UA, CP 51, 75005 Paris, France
- Jérôme Fuchs
  - Institut de Systématique Evolution Biodiversité, Muséum national d’Histoire naturelle CNRS SU EPHE UA, CP 51, 75005 Paris, France
5
Benham PM, Cicero C, Escalona M, Beraut E, Fairbairn C, Marimuthu MPA, Nguyen O, Sahasrabudhe R, King BL, Thomas WK, Kovach AI, Nachman MW, Bowie RCK. Remarkably High Repeat Content in the Genomes of Sparrows: The Importance of Genome Assembly Completeness for Transposable Element Discovery. Genome Biol Evol 2024; 16:evae067. [PMID: 38566597 PMCID: PMC11088854 DOI: 10.1093/gbe/evae067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/01/2024] [Accepted: 03/23/2024] [Indexed: 04/04/2024] Open
Abstract
Transposable elements (TE) play critical roles in shaping genome evolution. Highly repetitive TE sequences are also a major source of assembly gaps making it difficult to fully understand the impact of these elements on host genomes. The increased capacity of long-read sequencing technologies to span highly repetitive regions promises to provide new insights into patterns of TE activity across diverse taxa. Here we report the generation of highly contiguous reference genomes using PacBio long-read and Omni-C technologies for three species of Passerellidae sparrow. We compared these assemblies to three chromosome-level sparrow assemblies and nine other sparrow assemblies generated using a variety of short- and long-read technologies. All long-read based assemblies were longer (range: 1.12 to 1.41 Gb) than short-read assemblies (0.91 to 1.08 Gb) and assembly length was strongly correlated with the amount of repeat content. Repeat content for Bell's sparrow (31.2% of genome) was the highest level ever reported within the order Passeriformes, which comprises over half of avian diversity. The highest levels of repeat content (79.2% to 93.7%) were found on the W chromosome relative to other regions of the genome. Finally, we show that proliferation of different TE classes varied even among species with similar levels of repeat content. These patterns support a dynamic model of TE expansion and contraction even in a clade where TEs were once thought to be fairly depauperate and static. Our work highlights how the resolution of difficult-to-assemble regions of the genome with new sequencing technologies promises to transform our understanding of avian genome evolution.
Affiliation(s)
- Phred M Benham
  - Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, CA 94720, USA
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA 94720, USA
- Carla Cicero
  - Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, CA 94720, USA
- Merly Escalona
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Eric Beraut
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Colin Fairbairn
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Mohan P A Marimuthu
  - DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California-Davis, Davis, CA 95616, USA
- Oanh Nguyen
  - DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California-Davis, Davis, CA 95616, USA
- Ruta Sahasrabudhe
  - DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California-Davis, Davis, CA 95616, USA
- Benjamin L King
  - Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME 04469, USA
- W Kelley Thomas
  - Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH 03824, USA
- Adrienne I Kovach
  - Department of Natural Resources and the Environment, University of New Hampshire, Durham, NH 03824, USA
- Michael W Nachman
  - Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, CA 94720, USA
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA 94720, USA
- Rauri C K Bowie
  - Museum of Vertebrate Zoology, University of California Berkeley, Berkeley, CA 94720, USA
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA 94720, USA
6
O’Connor RE, Kretschmer R, Romanov MN, Griffin DK. A Bird's-Eye View of Chromosomic Evolution in the Class Aves. Cells 2024; 13:310. [PMID: 38391923 PMCID: PMC10886771 DOI: 10.3390/cells13040310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 01/27/2024] [Accepted: 02/05/2024] [Indexed: 02/24/2024] Open
Abstract
Birds (Aves) are the most speciose of terrestrial vertebrates, displaying Class-specific characteristics yet incredible external phenotypic diversity. Critical to agriculture and as model organisms, birds have adapted to many habitats. The only extant examples of dinosaurs, birds emerged ~150 mya and >10% are currently threatened with extinction. This review is a comprehensive overview of avian genome ("chromosomic") organization research based mostly on chromosome painting and BAC-based studies. We discuss traditional and contemporary tools for reliably generating chromosome-level assemblies and analyzing multiple species at a higher resolution and wider phylogenetic distance than previously possible. These results permit more detailed investigations into inter- and intrachromosomal rearrangements, providing unique insights into evolution and speciation mechanisms. The 'signature' avian karyotype likely arose ~250 mya and remained largely unchanged in most groups including extinct dinosaurs. Exceptions include Psittaciformes, Falconiformes, Caprimulgiformes, Cuculiformes, Suliformes, occasional Passeriformes, Ciconiiformes, and Pelecaniformes. The reasons for this remarkable conservation may be the greater diploid chromosome number generating variation (the driver of natural selection) through a greater possible combination of gametes and/or an increase in recombination rate. A deeper understanding of avian genomic structure permits the exploration of fundamental biological questions pertaining to the role of evolutionary breakpoint regions and homologous synteny blocks.
Affiliation(s)
- Rebecca E. O’Connor
  - School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK
- Rafael Kretschmer
  - Departamento de Ecologia, Zoologia e Genética, Instituto de Biologia, Campus Universitário Capão do Leão, Universidade Federal de Pelotas, Pelotas 96010-900, RS, Brazil
- Michael N. Romanov
  - School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK
  - L. K. Ernst Federal Research Centre for Animal Husbandry, Dubrovitsy, 142132 Podolsk, Moscow Oblast, Russia
- Darren K. Griffin
  - School of Biosciences, University of Kent, Canterbury CT2 7NJ, UK
7
Stuart KC, Johnson RN, Major RE, Atsawawaranunt K, Ewart KM, Rollins LA, Santure AW, Whibley A. The genome of a globally invasive passerine, the common myna, Acridotheres tristis. DNA Res 2024; 31:dsae005. [PMID: 38366840 PMCID: PMC10917472 DOI: 10.1093/dnares/dsae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 02/13/2024] [Accepted: 02/15/2024] [Indexed: 02/18/2024] Open
Abstract
In an era of global climate change, biodiversity conservation is receiving increased attention. Conservation efforts are greatly aided by genetic tools and approaches, which seek to understand patterns of genetic diversity and how they impact species health and their ability to persist under future climate regimes. Invasive species offer vital model systems in which to investigate questions regarding adaptive potential, with a particular focus on how changes in genetic diversity and effective population size interact with novel selection regimes. The common myna (Acridotheres tristis) is a globally invasive passerine and is an excellent model species for research both into the persistence of low-diversity populations and the mechanisms of biological invasion. To underpin research on the invasion genetics of this species, we present the genome assembly of the common myna. We describe the genomic landscape of this species, including genome wide allelic diversity, methylation, repeats, and recombination rate, as well as an examination of gene family evolution. Finally, we use demographic analysis to identify that some native regions underwent a dramatic population increase between the two most recent periods of glaciation, and reveal artefactual impacts of genetic bottlenecks on demographic analysis.
Affiliation(s)
- Katarina C Stuart
  - School of Biological Sciences, University of Auckland, Auckland, Aotearoa, New Zealand
  - Evolution and Ecology Research Centre, School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, Australia
- Rebecca N Johnson
  - National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Richard E Major
  - Australian Museum Research Institute, Australian Museum, Sydney, Australia
- Kyle M Ewart
  - Australian Museum Research Institute, Australian Museum, Sydney, Australia
  - School of Life and Environmental Sciences, University of Sydney, Sydney, Australia
- Lee A Rollins
  - Evolution and Ecology Research Centre, School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, Australia
- Anna W Santure
  - School of Biological Sciences, University of Auckland, Auckland, Aotearoa, New Zealand
- Annabel Whibley
  - School of Biological Sciences, University of Auckland, Auckland, Aotearoa, New Zealand
8
Hensen N, Bonometti L, Westerberg I, Brännström IO, Guillou S, Cros-Aarteil S, Calhoun S, Haridas S, Kuo A, Mondo S, Pangilinan J, Riley R, LaButti K, Andreopoulos B, Lipzen A, Chen C, Yan M, Daum C, Ng V, Clum A, Steindorff A, Ohm RA, Martin F, Silar P, Natvig DO, Lalanne C, Gautier V, Ament-Velásquez SL, Kruys Å, Hutchinson MI, Powell AJ, Barry K, Miller AN, Grigoriev IV, Debuchy R, Gladieux P, Hiltunen Thorén M, Johannesson H. Genome-scale phylogeny and comparative genomics of the fungal order Sordariales. Mol Phylogenet Evol 2023; 189:107938. [PMID: 37820761 DOI: 10.1016/j.ympev.2023.107938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/28/2023] [Accepted: 10/04/2023] [Indexed: 10/13/2023]
Abstract
The order Sordariales is taxonomically diverse, and harbours many species with different lifestyles and large economic importance. Despite its importance, a robust genome-scale phylogeny, and associated comparative genomic analysis of the order is lacking. In this study, we examined whole-genome data from 99 Sordariales, including 52 newly sequenced genomes, and seven outgroup taxa. We inferred a comprehensive phylogeny that resolved several contentious relationships amongst families in the order, and cleared-up intrafamily relationships within the Podosporaceae. Extensive comparative genomics showed that genomes from the three largest families in the dataset (Chaetomiaceae, Podosporaceae and Sordariaceae) differ greatly in GC content, genome size, gene number, repeat percentage, evolutionary rate, and genome content affected by repeat-induced point mutations (RIP). All genomic traits showed phylogenetic signal, and ancestral state reconstruction revealed that the variation of the properties stems primarily from within-family evolution. Together, the results provide a thorough framework for understanding genome evolution in this important group of fungi.
Affiliation(s)
- Noah Hensen
  - Stockholm University, Department of Ecology, Environment and Plant Sciences, Stockholm, Sweden
- Lucas Bonometti
  - University of Montpellier, PHIM Plant Health Institute, Montpellier, France
- Ivar Westerberg
  - Stockholm University, Department of Ecology, Environment and Plant Sciences, Stockholm, Sweden
- Ioana Onut Brännström
  - Oslo University, Natural History Museum, Oslo, Norway; Uppsala University, Department of Ecology and Genetics, Uppsala, Sweden
- Sonia Guillou
  - University of Montpellier, PHIM Plant Health Institute, Montpellier, France
- Sara Calhoun
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Sajeet Haridas
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Alan Kuo
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Stephen Mondo
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Jasmyn Pangilinan
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Robert Riley
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Kurt LaButti
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Bill Andreopoulos
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Anna Lipzen
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Cindy Chen
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Mi Yan
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Chris Daum
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Vivian Ng
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Alicia Clum
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Andrei Steindorff
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Robin A Ohm
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Philippe Silar
  - Université de Paris Cité, Laboratoire Interdisciplinaire des Energies de Demain, Paris, France
- Donald O Natvig
  - University of New Mexico, Department of Biology, Albuquerque, USA
- Christophe Lalanne
  - Université de Paris Cité, Laboratoire Interdisciplinaire des Energies de Demain, Paris, France
- Valérie Gautier
  - Université de Paris Cité, Laboratoire Interdisciplinaire des Energies de Demain, Paris, France
- Åsa Kruys
  - Uppsala University, Museum of Evolution, Uppsala, Sweden
- Amy Jo Powell
  - Sandia National Laboratories, Dept. of Systems Design and Architecture, Albuquerque, USA
- Kerrie Barry
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Andrew N Miller
  - University of Illinois Urbana-Champaign, Illinois Natural History Survey, USA
- Igor V Grigoriev
  - Lawrence Berkeley National Laboratory, U.S. Department of Energy Joint Genome Institute, Berkeley, CA, USA; University of California Berkeley, Department of Plant and Microbial Biology, Berkeley, CA, USA
- Robert Debuchy
  - Université Paris-Saclay, Institute for Integrative Biology of the Cell, Gif-sur-Yvette, France
- Pierre Gladieux
  - University of Montpellier, PHIM Plant Health Institute, Montpellier, France
- Markus Hiltunen Thorén
  - Stockholm University, Department of Ecology, Environment and Plant Sciences, Stockholm, Sweden; The Royal Swedish Academy of Sciences, Stockholm, Sweden
- Hanna Johannesson
  - Stockholm University, Department of Ecology, Environment and Plant Sciences, Stockholm, Sweden; The Royal Swedish Academy of Sciences, Stockholm, Sweden.
9
Ricci M, Peona V, Boattini A, Taccioli C. Comparative analysis of bats and rodents' genomes suggests a relation between non-LTR retrotransposons, cancer incidence, and ageing. Sci Rep 2023; 13:9039. [PMID: 37270634 DOI: 10.1038/s41598-023-36006-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/16/2022] [Accepted: 05/27/2023] [Indexed: 06/05/2023] Open
Abstract
The presence in nature of species showing drastic differences in lifespan and cancer incidence has recently increased the interest of the scientific community. In particular, the adaptations and the genomic features underlying the evolution of cancer-resistant and long-lived organisms have recently focused on transposable elements (TEs). In this study, we compared the content and dynamics of TE activity in the genomes of four rodent and six bat species exhibiting different lifespans and cancer susceptibility. Mouse, rat, and guinea pig genomes (short-lived and cancer-prone organisms) were compared with that of naked mole rat (Heterocephalus glaber) which is a cancer-resistant organism and the rodent with the longest lifespan. The long-lived bats of the genera Myotis, Rhinolophus, Pteropus and Rousettus were instead compared with Molossus molossus, which is one of the organisms with the shortest lifespan among the order Chiroptera. Despite previous hypotheses stating a substantial tolerance of TEs in bats, we found that long-lived bats and the naked mole rat share a marked decrease of non-LTR retrotransposons (LINEs and SINEs) accumulation in recent evolutionary times.
Affiliation(s)
- Valentina Peona
  - Department of Organismal Biology, Systematic Biology, Uppsala University, Uppsala, Sweden.
- Alessio Boattini
  - Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Cristian Taccioli
  - Department of Animal Medicine, Health and Production, University of Padova, Padua, Italy
10
Šatović-Vukšić E, Plohl M. Satellite DNAs-From Localized to Highly Dispersed Genome Components. Genes (Basel) 2023; 14:742. [PMID: 36981013 PMCID: PMC10048060 DOI: 10.3390/genes14030742] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/16/2023] [Revised: 03/15/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
According to the established classical view, satellite DNAs are defined as abundant non-coding DNA sequences repeated in tandem that build long arrays located in heterochromatin. Advances in sequencing methodologies and development of specialized bioinformatics tools enabled defining a collection of all repetitive DNAs and satellite DNAs in a genome, the repeatome and the satellitome, respectively, as well as their reliable annotation on sequenced genomes. Supported by various non-model species included in recent studies, the patterns of satellite DNAs and satellitomes as a whole showed much more diversity and complexity than initially thought. Differences are not only in number and abundance of satellite DNAs but also in their distribution across the genome, array length, interspersion patterns, association with transposable elements, localization in heterochromatin and/or in euchromatin. In this review, we compare characteristic organizational features of satellite DNAs and satellitomes across different animal and plant species in order to summarize organizational forms and evolutionary processes that may lead to satellitomes' diversity and revisit some basic notions regarding repetitive DNA landscapes in genomes.
Affiliation(s)
- Eva Šatović-Vukšić
- Division of Molecular Biology, Ruđer Bošković Institute, 10000 Zagreb, Croatia
| | - Miroslav Plohl
- Division of Molecular Biology, Ruđer Bošković Institute, 10000 Zagreb, Croatia
| |
|
11
|
Peona V, Kutschera VE, Blom MPK, Irestedt M, Suh A. Satellite DNA evolution in Corvoidea inferred from short and long reads. Mol Ecol 2023; 32:1288-1305. [PMID: 35488497 DOI: 10.1111/mec.16484] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/08/2021] [Revised: 04/11/2022] [Accepted: 04/17/2022] [Indexed: 11/29/2022]
Abstract
Satellite DNA (satDNA) is a fast-evolving portion of eukaryotic genomes. The homogeneous and repetitive nature of satDNA causes problems during genome assembly, and it therefore remains difficult to study in detail in nonmodel organisms and across broad evolutionary timescales. Here, we combined short- and long-read data to explore the diversity and evolution of satDNA between individuals of the same species and between genera of birds spanning ~40 million years of bird evolution, using birds-of-paradise (Paradisaeidae) and crow (Corvus) species. These avian species revealed a GC-rich Corvoidea satellitome composed of 61 satellite families and provided a set of candidate centromeric satDNA monomers, selected on the basis of length, abundance, homogeneity and transcription. Surprisingly, we found that the satDNA of crow species diverged rapidly between closely related species, while the satDNA appeared more similar between birds-of-paradise species belonging to different genera.
Affiliation(s)
- Valentina Peona
- Department of Organismal Biology - Systematic Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Verena E Kutschera
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Solna, Sweden
| | - Mozes P K Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden; Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
| | - Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Alexander Suh
- Department of Organismal Biology - Systematic Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; School of Biological Sciences-Organisms and the Environment, University of East Anglia, Norwich, UK
| |
|
12
|
Genomic Characterization of Multidrug-Resistant Extended Spectrum β-Lactamase-Producing Klebsiella pneumoniae from Clinical Samples of a Tertiary Hospital in South Kivu Province, Eastern Democratic Republic of Congo. Microorganisms 2023; 11:microorganisms11020525. [PMID: 36838490 PMCID: PMC9960421 DOI: 10.3390/microorganisms11020525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open
Abstract
Multidrug-resistant (MDR) and extended spectrum β-lactamase (ESBL)-producing extra-intestinal K. pneumoniae are associated with increased morbidity and mortality. This study aimed to characterize the resistance and virulence profiles of extra-intestinal MDR ESBL-producing K. pneumoniae associated with infections at a tertiary hospital in South-Kivu province, DRC. Whole-genome sequencing (WGS) was carried out on 37 K. pneumoniae isolates displaying an MDR and ESBL-producing phenotype. The assembled genomes were analysed for phylogeny, virulence factors and antimicrobial resistance gene (ARG) determinants. These isolates were compared to sub-Saharan counterparts. K. pneumoniae isolates displayed high genetic variability, with up to 16 sequence types (ST). AMR was widespread against β-lactams (including third- and fourth-generation cephalosporins, but not carbapenems), aminoglycosides, ciprofloxacin, tetracycline, erythromycin, nitrofurantoin, and cotrimoxazole. The blaCTX-M-15 gene was the most common β-lactamase gene among K. pneumoniae isolates. No carbapenemase gene was found. ARGs for aminoglycosides, quinolones, phenicols, tetracyclines, sulfonamides, and nitrofurantoin were widely distributed among the isolates. Nine isolates had the colistin-resistant R256G substitution in the pmrB efflux pump gene without displaying reduced susceptibility to colistin. Although the isolates carried virulence genes, none had hypervirulence genes. Our results highlight the genetic diversity of MDR ESBL-producing K. pneumoniae isolates and underscore the importance of simultaneously monitoring the evolution of phenotypic and genotypic AMR in Bukavu and DRC, while calling for caution in administering colistin and carbapenem to patients.
|
13
|
Macas J, Ávila Robledillo L, Kreplak J, Novák P, Koblížková A, Vrbová I, Burstin J, Neumann P. Assembly of the 81.6 Mb centromere of pea chromosome 6 elucidates the structure and evolution of metapolycentric chromosomes. PLoS Genet 2023; 19:e1010633. [PMID: 36735726 PMCID: PMC10027222 DOI: 10.1371/journal.pgen.1010633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 03/20/2023] [Accepted: 01/23/2023] [Indexed: 02/04/2023] Open
Abstract
Centromeres in the legume genera Pisum and Lathyrus exhibit unique morphological characteristics, including extended primary constrictions and multiple separate domains of centromeric chromatin. These so-called metapolycentromeres resemble an intermediate form between monocentric and holocentric types, and therefore provide a great opportunity for studying the transitions between different types of centromere organizations. However, because of the exceedingly large and highly repetitive nature of metapolycentromeres, highly contiguous assemblies needed for these studies are lacking. Here, we report on the assembly and analysis of a 177.6 Mb region of pea (Pisum sativum) chromosome 6, including the 81.6 Mb centromere region (CEN6) and adjacent chromosome arms. Genes, DNA methylation profiles, and most of the repeats were uniformly distributed within the centromere, and their densities in CEN6 and chromosome arms were similar. The exception was an accumulation of satellite DNA in CEN6, where it formed multiple arrays up to 2 Mb in length. Centromeric chromatin, characterized by the presence of the CENH3 protein, was predominantly associated with arrays of three different satellite repeats; however, five other satellites present in CEN6 lacked CENH3. The presence of CENH3 chromatin was found to determine the spatial distribution of the respective satellites during the cell cycle. Finally, oligo-FISH painting experiments, performed using probes specifically designed to label the genomic regions corresponding to CEN6 in Pisum, Lathyrus, and Vicia species, revealed that metapolycentromeres evolved via the expansion of centromeric chromatin into neighboring chromosomal regions and the accumulation of novel satellite repeats. However, in some of these species, centromere evolution also involved chromosomal translocations and centromere repositioning.
Affiliation(s)
- Jiří Macas
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| | - Laura Ávila Robledillo
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| | - Jonathan Kreplak
- Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, Dijon, France
| | - Petr Novák
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| | - Andrea Koblížková
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| | - Iva Vrbová
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| | - Judith Burstin
- Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, Dijon, France
| | - Pavel Neumann
- Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, Czech Republic
| |
|
14
|
Matoulek D, Ježek B, Vohnoutová M, Symonová R. Advances in Vertebrate (Cyto)Genomics Shed New Light on Fish Compositional Genome Evolution. Genes (Basel) 2023; 14:genes14020244. [PMID: 36833171 PMCID: PMC9956151 DOI: 10.3390/genes14020244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 01/05/2023] [Indexed: 01/19/2023] Open
Abstract
Cytogenetic and compositional studies considered fish genomes rather poor in guanine-cytosine content (GC%) because of a putative "sharp increase in genic GC% during the evolution of higher vertebrates". However, the available genomic data have not been exploited to confirm this viewpoint. Instead, further misunderstandings of GC%, mostly in fish genomes, have arisen from misinterpretation of the current flood of data. Utilizing public databases, we calculated the GC% in animal genomes for three different, technically well-established fractions: DNA (entire genome), cDNA (complementary DNA), and cds (exons). Our results across chordates help set bounds on GC% values that are still reported incorrectly in the literature and show: (i) fish, in their immense diversity, possess genomes that are as GC-rich as (or even GC-richer than) those of higher vertebrates, and fish exons are GC-enriched among vertebrates; (ii) animal genomes generally show a GC-enrichment from the DNA, over cDNA, to the cds level (i.e., not only the higher vertebrates); (iii) fish and invertebrates show a broad(er) inter-quartile range in GC%, while avian and mammalian genomes are more constrained in their GC%. These results indicate no sharp increase in the GC% of genes during the transition to higher vertebrates, contrary to what has been stated and frequently repeated before. We present our results in 2D and 3D space to explore the compositional genome landscape and have prepared an online platform to explore the AT/GC compositional genome evolution.
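As a rough illustration of the quantity compared in this abstract, GC% is the share of G and C among the unambiguous bases of a sequence. The sketch below is not the authors' pipeline; the function name `gc_percent` and the toy sequences are hypothetical and stand in for the DNA, cDNA, and cds fractions only to show the arithmetic.

```python
def gc_percent(seq: str) -> float:
    """Return GC% of a nucleotide sequence, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Made-up toy sequences, one per fraction named in the abstract.
    fractions = {
        "DNA": "ATATGCGCATTAATNNATGC",
        "cDNA": "ATGGCGCATTGCAT",
        "cds": "ATGGCGGCCGCATGA",
    }
    for name, seq in fractions.items():
        print(f"{name}: {gc_percent(seq):.1f}% GC")
```

Excluding ambiguous bases from the denominator is one common convention; counting them instead would lower the reported GC% for assemblies with many gaps.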
Affiliation(s)
- Dominik Matoulek
- Department of Physics, Faculty of Science, University of Hradec Králové, 500 03 Hradec Králové, Czech Republic
| | - Bruno Ježek
- Faculty of Informatics and Management, University of Hradec Králové, Rokitanského 62, 500 02 Hradec Králové, Czech Republic
| | - Marta Vohnoutová
- Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
| | - Radka Symonová
- Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
- Institute of Hydrobiology, Biology Centre of the Czech Academy of Sciences, 370 05 České Budějovice, Czech Republic
| |
|
15
|
Ulu S, Ulu ZO, Akar A, Ozgenturk NO. De novo Transcriptome Analysis and Gene Expression Profiling of Corylus Species. Folia Biol (Praha) 2023; 69:99-106. [PMID: 38206775 DOI: 10.14712/fb2023069030099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2024]
Abstract
Hazelnut (Corylus), which has high commercial and nutritional value, is an important tree for producing nuts and nut oil, consumed as an ingredient especially in chocolate. While Corylus avellana L. (European hazelnut, Betulaceae) and Corylus colurna L. (Turkish hazelnut, Betulaceae) are the two common hazelnut species in Europe, C. avellana L. (Tombul hazelnut) is the most widely grown hazelnut species in Turkey, and C. colurna L., which is the most important genetic resource for hazelnut breeding, occurs naturally in Anatolia. We generated the transcriptome data of these two Corylus species and used these data for gene discovery and gene expression profiling. Total RNA from young leaves, flowers (male and female), buds, and husk shoots of C. avellana and C. colurna was used to prepare two libraries, which were sequenced using Illumina HiSeq4000 with 100 bp paired-end reads. The transcriptome data, 10.48 Gb for C. avellana and 10.30 Gb for C. colurna, were assembled into 70,265 and 88,343 unigenes, respectively. These unigenes were functionally annotated using the TRAPID platform. We identified 25,312 and 27,051 simple sequence repeats (SSRs) for C. avellana and C. colurna, respectively. The TL1, GMPM1, N, 2MMP, At1g29670, and CHIB1 unigenes were selected for validation with qPCR. The first de novo transcriptome data of C. colurna were compared with data from the commercially important C. avellana. These data constitute a valuable extension of the publicly available transcriptomic resource aimed at breeding, medicinal, and industrial research studies.
Affiliation(s)
- Salih Ulu
- Department of Molecular Biology and Genetics, Faculty of Art and Science, Yildiz Technical University, Istanbul, Turkey
| | - Zehra Omeroglu Ulu
- Department of Genetics and Bioengineering, Faculty of Engineering, Yeditepe University, Istanbul, Turkey
- Department of Molecular Biology and Genetics, Faculty of Art and Science, Yildiz Technical University, Istanbul, Turkey
| | - Aysun Akar
- Hazelnut Research Institution, Ministry of Food, Agriculture and Livestock, Giresun, Turkey
| | - Nehir Ozdemir Ozgenturk
- Department of Molecular Biology and Genetics, Faculty of Art and Science, Yildiz Technical University, Istanbul, Turkey.
| |
|
16
|
Kim J, Lee C, Ko BJ, Yoo DA, Won S, Phillippy AM, Fedrigo O, Zhang G, Howe K, Wood J, Durbin R, Formenti G, Brown S, Cantin L, Mello CV, Cho S, Rhie A, Kim H, Jarvis ED. False gene and chromosome losses in genome assemblies caused by GC content variation and repeats. Genome Biol 2022; 23:204. [PMID: 36167554 PMCID: PMC9516821 DOI: 10.1186/s13059-022-02765-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/16/2021] [Accepted: 09/02/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Many short-read genome assemblies have been found to be incomplete and contain mis-assemblies. The Vertebrate Genomes Project has been producing new reference genome assemblies with an emphasis on being as complete and error-free as possible, which requires utilizing long reads, long-range scaffolding data, new assembly algorithms, and manual curation. A more thorough evaluation of the recent references relative to prior assemblies can provide a detailed overview of the types and magnitude of improvements. RESULTS Here we evaluate new vertebrate genome references relative to the previous assemblies for the same species and, in two cases, the same individuals, including a mammal (platypus), two birds (zebra finch, Anna's hummingbird), and a fish (climbing perch). We find that up to 11% of genomic sequence is entirely missing in the previous assemblies. In the Vertebrate Genomes Project zebra finch assembly, we identify eight new GC- and repeat-rich micro-chromosomes with high gene density. The impact of missing sequences is biased towards GC-rich 5'-proximal promoters and 5' exon regions of protein-coding genes and long non-coding RNAs. Between 26 and 60% of genes include structural or sequence errors that could lead to misunderstanding of their function when using the previous genome assemblies. CONCLUSIONS Our findings reveal novel regulatory landscapes and protein coding sequences that have been greatly underestimated in previous assemblies and are now present in the Vertebrate Genomes Project reference genomes.
Affiliation(s)
- Juwan Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
| | - Dong Ahn Yoo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Sohyoung Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York City, USA
| | - Guojie Zhang
- BGI-Shenzhen, Shenzhen, 518083, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
| | | | | | - Richard Durbin
- Wellcome Sanger Institute, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Giulio Formenti
- Vertebrate Genome Lab, The Rockefeller University, New York City, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
| | - Samara Brown
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
| | - Lindsey Cantin
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA
| | - Claudio V Mello
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, 97239, USA
| | - Seoae Cho
- eGnome, Inc, Seoul, Republic of Korea
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
- eGnome, Inc, Seoul, Republic of Korea.
| | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York City, USA.
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
| |
|
17
|
Kirov I, Kolganova E, Dudnikov M, Yurkevich OY, Amosova AV, Muravenko OV. A Pipeline NanoTRF as a New Tool for De Novo Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes. Plants (Basel) 2022; 11:2103. [PMID: 36015406 PMCID: PMC9413040 DOI: 10.3390/plants11162103] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/08/2022] [Accepted: 08/11/2022] [Indexed: 06/15/2023]
Abstract
High-copy tandemly organized repeats (TRs), or satellite DNA, are an important but still enigmatic component of eukaryotic genomes. TRs comprise arrays of multi-copy and highly similar tandem repeats, which makes their elucidation a very challenging task. Oxford Nanopore sequencing data provide a valuable source of information on TR organization at the single-molecule level. However, bioinformatics tools for de novo identification of TRs in raw Nanopore data have not been reported so far. We developed NanoTRF, a new Python pipeline for TR identification, characterization and consensus monomer sequence assembly. This new pipeline requires only a raw Nanopore read file from low-depth (<1×) genome sequencing. The program generates an informative HTML report and figures on TR genome abundance, monomer sequence and monomer length. In addition, NanoTRF annotates transposable element (TE) sequences within or near satDNA arrays, and this information can be used to elucidate how TRs and TEs co-evolve in the genome. Moreover, we validated by FISH that the NanoTRF report is useful for evaluating whether TR chromosome organization is clustered or dispersed. Our findings showed that NanoTRF is a robust method for the de novo identification of satellite repeats in raw Nanopore data without prior read assembly. The obtained sequences can be used in many downstream analyses, including genome assembly assistance and gap estimation, chromosome mapping and cytogenetic marker development.
Affiliation(s)
- Ilya Kirov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
- Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
| | - Elizaveta Kolganova
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
| | - Maxim Dudnikov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
- Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
| | - Olga Yu. Yurkevich
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
| | - Alexandra V. Amosova
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
| | - Olga V. Muravenko
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
| |
|
18
|
Dantán-González E, Quiroz-Castañeda RE, Aguilar-Díaz H, Amaro-Estrada I, Martínez-Ocampo F, Rodríguez-Camarillo S. Mexican Strains of Anaplasma marginale: A First Comparative Genomics and Phylogeographic Analysis. Pathogens 2022; 11:pathogens11080873. [PMID: 36014994 PMCID: PMC9415054 DOI: 10.3390/pathogens11080873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Revised: 07/24/2022] [Accepted: 07/28/2022] [Indexed: 02/01/2023] Open
Abstract
The One Health approach looks after animal welfare and demands constant monitoring of the strains that circulate globally to prevent outbreaks. Anaplasma marginale is the etiologic agent of bovine anaplasmosis and is endemic worldwide. This study aimed to analyze, for the first time, the genetic diversity of seven Mexican strains of A. marginale and their relationship with other reported strains. The main features of A. marginale were obtained by characterizing all 24 genomes reported so far. Genetic diversity and phylogeography were analyzed by characterizing the msp1a gene and 5′-UTR microsatellite sequences and constructing a phylogenetic tree with 540 concatenated genes of the core genome. The Mexican strains show 15 different repeat sequences in six MSP1a structures and have phylogeographic relationships with strains from North America, South America, and Asia, which confirms they are highly variable. Based on our results, we encourage genome sequencing of A. marginale strains to obtain a high assembly level of molecular markers, as well as extensive phylogeographic analysis. Undoubtedly, genomic surveillance helps build a picture of how a pathogen changes and evolves in geographical regions. However, we cannot disregard the study of the relationships pathogens establish with ticks and how they have co-evolved to establish themselves as a successful transmission system.
Affiliation(s)
- Edgar Dantán-González
- Laboratorio de Estudios Ecogenómicos, Centro de Investigación en Biotecnología, Universidad Autónoma del Estado de Morelos, Cuernavaca 62209, Mexico; (E.D.-G.); (F.M.-O.)
| | - Rosa Estela Quiroz-Castañeda
- Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62574, Mexico; (H.A.-D.); (I.A.-E.); (S.R.-C.)
| | - Hugo Aguilar-Díaz
- Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62574, Mexico; (H.A.-D.); (I.A.-E.); (S.R.-C.)
| | - Itzel Amaro-Estrada
- Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62574, Mexico; (H.A.-D.); (I.A.-E.); (S.R.-C.)
| | - Fernando Martínez-Ocampo
- Laboratorio de Estudios Ecogenómicos, Centro de Investigación en Biotecnología, Universidad Autónoma del Estado de Morelos, Cuernavaca 62209, Mexico; (E.D.-G.); (F.M.-O.)
| | - Sergio Rodríguez-Camarillo
- Centro Nacional de Investigación Disciplinaria en Salud Animal e Inocuidad, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Jiutepec 62574, Mexico; (H.A.-D.); (I.A.-E.); (S.R.-C.)
| |
|
19
|
Cerca J, Petersen B, Lazaro-Guevara JM, Rivera-Colón A, Birkeland S, Vizueta J, Li S, Li Q, Loureiro J, Kosawang C, Díaz PJ, Rivas-Torres G, Fernández-Mazuecos M, Vargas P, McCauley RA, Petersen G, Santos-Bay L, Wales N, Catchen JM, Machado D, Nowak MD, Suh A, Sinha NR, Nielsen LR, Seberg O, Gilbert MTP, Leebens-Mack JH, Rieseberg LH, Martin MD. The genomic basis of the plant island syndrome in Darwin's giant daisies. Nat Commun 2022; 13:3729. [PMID: 35764640 PMCID: PMC9240058 DOI: 10.1038/s41467-022-31280-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 06/09/2022] [Indexed: 12/04/2022] Open
Abstract
The repeated, rapid and often pronounced patterns of evolutionary divergence observed in insular plants, or the ‘plant island syndrome’, include changes in leaf phenotypes, growth, as well as the acquisition of a perennial lifestyle. Here, we sequence and describe the genome of the critically endangered, Galápagos-endemic species Scalesia atractyloides Arnot., obtaining a chromosome-resolved, 3.2-Gbp assembly containing 43,093 candidate gene models. Using a combination of fossil transposable elements, k-mer spectra analyses and orthologue assignment, we identify the two ancestral genomes, and date their divergence and the polyploidization event, concluding that the ancestor of all extant Scalesia species was an allotetraploid. There are a comparable number of genes and transposable elements across the two subgenomes, and while their synteny has been mostly conserved, we find multiple inversions that may have facilitated adaptation. We identify clear signatures of selection across genes associated with vascular development, growth, adaptation to salinity and flowering time, thus finding compelling evidence for a genomic basis of the island syndrome in one of Darwin’s giant daisies. Many island plant species share a syndrome of characteristic phenotype and life history. Cerca et al. find the genomic basis of the plant island syndrome in one of Darwin’s giant daisies, while separating ancestral genomes in a chromosome-resolved polyploid assembly.
Affiliation(s)
- José Cerca
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway.
| | - Bent Petersen
- Centre for Evolutionary Hologenomics, The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5, 1353, Copenhagen, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery, Faculty of Applied Sciences, AIMST University, Kedah, Malaysia
| | - José Miguel Lazaro-Guevara
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - Angel Rivera-Colón
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Siri Birkeland
- Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway; Natural History Museum, University of Oslo, Oslo, Norway
| | - Joel Vizueta
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
| | - Siyu Li
- Department of Plant Biology, University of California, Davis, Davis, CA, 95616, USA
| | - Qionghou Li
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - João Loureiro
- Centre for Functional Ecology, Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-095, Coimbra, Portugal
| | - Chatchai Kosawang
- Department of Geosciences and Natural Resource Management, University of Copenhagen, Rolighedsvej 23, 1958, Frederiksberg C, Denmark
| | - Patricia Jaramillo Díaz
- Estación Científica Charles Darwin, Fundación Charles Darwin, Santa Cruz, Galápagos, Ecuador; Department of Botany and Plant Physiology, University of Malaga, Malaga, Spain
| | - Gonzalo Rivas-Torres
- Colegio de Ciencias Biológicas y Ambientales COCIBA & Extensión Galápagos, Universidad San Francisco de Quito USFQ, Quito, 170901, Ecuador; Galapagos Science Center, USFQ, UNC Chapel Hill, San Cristobal, Galapagos, Ecuador; Estación de Biodiversidad Tiputini, Colegio de Ciencias Biológicas y Ambientales, Universidad San Francisco de Quito USFQ, Quito, Ecuador; Courtesy Faculty, Department of Wildlife Ecology and Conservation, University of Florida, 110 Newins-Ziegler Hall, Gainesville, FL, 32611, USA
| | | | - Pablo Vargas
- Departamento de Biodiversidad y Conservación, Real Jardín Botánico (RJB-CSIC), Plaza de Murillo 2, 28014, Madrid, Spain
| | - Ross A McCauley
- Department of Biology, Fort Lewis College, Durango, CO, 81301, USA
| | - Gitte Petersen
- Department of Ecology, Environment and Plant Sciences, Stockholm University, SE-106 91, Stockholm, Sweden
| | - Luisa Santos-Bay
- Centre for Evolutionary Hologenomics, The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5, 1353, Copenhagen, Denmark
| | - Nathan Wales
- Department of Archaeology, University of York, York, UK
| | - Julian M Catchen
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Daniel Machado
- Department of Biotechnology and Food Science, Norwegian University of Science and Technology, Trondheim, 7491, Norway
| | | | - Alexander Suh
- School of Biological Sciences, University of East Anglia, Norwich Research Park, NR4 7TU, Norwich, UK; Department of Organismal Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, 75236, Uppsala, Sweden
| | - Neelima R Sinha
- Department of Plant Biology, University of California, Davis, Davis, CA, 95616, USA
| | - Lene R Nielsen
- Department of Geosciences and Natural Resource Management, University of Copenhagen, Rolighedsvej 23, 1958, Frederiksberg C, Denmark
| | - Ole Seberg
- The Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
| | - M Thomas P Gilbert
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway.,Centre for Evolutionary Hologenomics, The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5, 1353, Copenhagen, Denmark
| | | | - Loren H Rieseberg
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - Michael D Martin
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway.
| |
20
Immunity and lifespan: answering long-standing questions with comparative genomics. Trends Genet 2022; 38:650-661. [DOI: 10.1016/j.tig.2022.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Revised: 01/14/2022] [Accepted: 02/28/2022] [Indexed: 10/18/2022]
21
Satellitome Analysis and Transposable Elements Comparison in Geographically Distant Populations of Spodoptera frugiperda. Life (Basel) 2022; 12:life12040521. [PMID: 35455012 PMCID: PMC9026859 DOI: 10.3390/life12040521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 03/25/2022] [Accepted: 03/28/2022] [Indexed: 11/29/2022]
Abstract
Spodoptera frugiperda (fall armyworm) is a member of the superfamily Noctuoidea, which accounts for more than a third of all Lepidoptera and includes a considerable number of agricultural and forest pest species. Spodoptera frugiperda is a polyphagous species and a significant agricultural pest worldwide, which underlines its economic importance. Its genome size, assembly, phylogenetic classification, and transcriptome have all been described previously. However, different studies reported different compositions of repeated DNA sequences in the assembled genome, and the Spodoptera frugiperda genome still lacks a comprehensive study of satellite DNA dynamics. We conducted a comparative analysis of repetitive DNA, particularly satellite DNA, across geographically distant populations of Spodoptera frugiperda, using publicly accessible raw genome data from eight different geographical regions. Our results showed that most transposable elements (TEs) were shared across all geographically distant samples, except for the Maverick and PIF/Harbinger elements, which have divergent repeat copies. The TE age analysis revealed that most TE families consist of young copies 1–15 million years old; however, PIF/Harbinger has some older, degenerated copies 30–35 million years old. A total of seven satellite DNA families were discovered, accounting for approximately 0.65% of the entire Spodoptera frugiperda genome. The repeat profiling analysis of satellite DNA families revealed differential read depth coverage or copy numbers. Monomer length in the satellite DNA families ranges from the smallest (108 bp) SfrSat06-108 family to the largest (1824 bp) SfrSat07-1824 family. We did not observe a statistically significant correlation between monomer length and K2P divergence, copy number, or abundance of each satellite family.
Our findings suggest that the satellite DNA families identified in Spodoptera frugiperda account for a considerable proportion of the genome’s repetitive fraction. Repeat profiling of the satellite DNA families revealed a point mutation along the reference sequences. Limited TE differentiation exists among geographically distant populations of Spodoptera frugiperda.
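The K2P (Kimura two-parameter) divergence named in the abstract above can be illustrated with a minimal sketch. It uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` and the toy sequences are hypothetical and are not taken from the study; the sketch assumes two pre-aligned sequences of equal length and simply skips gapped or ambiguous sites.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguity codes; only A/C/G/T pairs are compared.
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites    # proportion of transitions
    q = transversions / sites  # proportion of transversions
    # math.log raises ValueError if the sequences are too diverged (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy example: one transition in ten aligned sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

In a satellite DNA setting this kind of distance would typically be computed between each repeat copy and a family consensus; the pairwise form here is only meant to show the arithmetic.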
22
Warmuth VM, Weissensteiner MH, Wolf J. Ineffective silencing of transposable elements on an avian W Chromosome. Genome Res 2022; 32:671-681. [PMID: 35149543 PMCID: PMC8997356 DOI: 10.1101/gr.275465.121] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/04/2021] [Accepted: 02/08/2022] [Indexed: 11/24/2022]
Abstract
One of the defining features of transposable elements (TEs) is their ability to move to new locations in the host genome. To minimise the potentially deleterious effects of de novo TE insertions, hosts have evolved several mechanisms to control TE activity, including recombination-mediated removal and epigenetic silencing; however, increasing evidence suggests that silencing of TEs is often incomplete. The crow family experienced a recent radiation of LTR retrotransposons (LTRs), offering an opportunity to gain insight into the regulatory control of young, potentially still active TEs. We quantified the abundance of TE-derived transcripts across several tissues in 15 Eurasian crows (Corvus (corone) spp.) raised under common garden conditions and find evidence for ineffective TE suppression on the female-specific W Chromosome. Using RNA-seq data, we show that ~ 9.5% of all transcribed TEs had considerably greater (average: 16-fold) transcript abundance in female crows, and that more than 85% of these female-biased TEs originated on the W Chromosome. After accounting for differences in TE density among chromosomal classes, W-linked TEs were significantly more highly expressed than TEs residing on other chromosomes, consistent with ineffective silencing on the former. Together, our results suggest that the crow W Chromosome acts as a source of transcriptionally active TEs, with possible negative fitness consequences for female birds analogous to Drosophila (an X/Y system), where overexpression of Y-linked TEs is associated with male-specific aging and fitness loss ('toxic Y').
23
Ramos L, Antunes A. Decoding sex: Elucidating sex determination and how high-quality genome assemblies are untangling the evolutionary dynamics of sex chromosomes. Genomics 2022; 114:110277. [PMID: 35104609 DOI: 10.1016/j.ygeno.2022.110277] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Revised: 12/22/2021] [Accepted: 01/26/2022] [Indexed: 11/28/2022]
Abstract
Sexual reproduction is a diverse and widespread process. In gonochoristic species, the differentiation of sexes occurs through diverse mechanisms, influenced by environmental and genetic factors. In most vertebrates, a master-switch gene is responsible for triggering a sex determination network. However, only a few genes have acquired master-switch functions, and this process is associated with the evolution of sex-chromosomes, which have a significant influence in evolution. Additionally, their highly repetitive regions impose challenges for high-quality sequencing, even using high-throughput, state-of-the-art techniques. Here, we review the mechanisms involved in sex determination and their role in the evolution of species, particularly vertebrates, focusing on sex chromosomes and the challenges involved in sequencing these genomic elements. We also address the improvements provided by the growth of sequencing projects, by generating a massive number of near-gapless, telomere-to-telomere, chromosome-level, phased assemblies, increasing the number and quality of sex-chromosome sequences available for further studies.
Affiliation(s)
- Luana Ramos
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal.
24
Raghavan V, Kraft L, Mesny F, Rigerte L. A simple guide to de novo transcriptome assembly and annotation. Brief Bioinform 2022; 23:6514404. [PMID: 35076693 PMCID: PMC8921630 DOI: 10.1093/bib/bbab563] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/09/2021] [Revised: 12/03/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022]
Abstract
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Affiliation(s)
- Venket Raghavan
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
- Louis Kraft
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany (corresponding author)
25
Cerca J, Armstrong EE, Vizueta J, Fernández R, Dimitrov D, Petersen B, Prost S, Rozas J, Petrov D, Gillespie RG. The Tetragnatha kauaiensis Genome Sheds Light on the Origins of Genomic Novelty in Spiders. Genome Biol Evol 2021; 13:evab262. [PMID: 34849853 PMCID: PMC8693713 DOI: 10.1093/gbe/evab262] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 11/22/2021] [Indexed: 01/07/2023]
Abstract
Spiders (Araneae) have a diverse spectrum of morphologies, behaviors, and physiologies. Attempts to understand the genomic-basis of this diversity are often hindered by their large, heterozygous, and AT-rich genomes with high repeat content resulting in highly fragmented, poor-quality assemblies. As a result, the key attributes of spider genomes, including gene family evolution, repeat content, and gene function, remain poorly understood. Here, we used Illumina and Dovetail Chicago technologies to sequence the genome of the long-jawed spider Tetragnatha kauaiensis, producing an assembly distributed along 3,925 scaffolds with an N50 of ∼2 Mb. Using comparative genomics tools, we explore genome evolution across available spider assemblies. Our findings suggest that the previously reported and vast genome size variation in spiders is linked to the different representation and number of transposable elements. Using statistical tools to uncover gene-family level evolution, we find expansions associated with the sensory perception of taste, immunity, and metabolism. In addition, we report strikingly different histories of chemosensory, venom, and silk gene families, with the first two evolving much earlier, affected by the ancestral whole genome duplication in Arachnopulmonata (∼450 Ma) and exhibiting higher numbers. Together, our findings reveal that spider genomes are highly variable and that genomic novelty may have been driven by the burst of an ancient whole genome duplication, followed by gene family and transposable element expansion.
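The N50 statistic quoted in the abstract above can be made concrete with a short sketch. N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths below are hypothetical illustrations, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy assembly of 32 units in total: the two longest scaffolds (8 + 8) cover half.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Comparing `2 * running` with `total` keeps the arithmetic in integers, which avoids rounding issues when the total length is odd.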
Affiliation(s)
- José Cerca
- Berkeley Evolab, Department of Environmental Science, Policy, and Management, UC Berkeley, California, USA
- Frontiers in Evolutionary Zoology, Natural History Museum, University of Oslo, Norway
- Department of Natural History, NTNU University Museum, Norwegian University of Science and Technology, Trondheim, Norway
- Ellie E Armstrong
- Berkeley Evolab, Department of Environmental Science, Policy, and Management, UC Berkeley, California, USA
- Department of Biology, Stanford University, California, USA
- Joel Vizueta
- Departament de Genètica, Microbiologia i Estadística & Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Spain
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Denmark
- Rosa Fernández
- Institute of Evolutionary Biology (CSIC—Universitat Pompeu Fabra), Barcelona, Spain
- Dimitar Dimitrov
- Department of Natural History, University Museum of Bergen, University of Bergen, Norway
- Bent Petersen
- Section for Evolutionary Genomics, The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Centre of Excellence for Omics-Driven Computational Biodiscovery, Faculty of Applied Sciences, AIMST University, Kedah, Malaysia
- Stefan Prost
- Central Research Laboratories, Natural History Museum Vienna, Vienna, Austria
- University of Veterinary Medicine, Konrad Lorenz Institute of Ethology, Vienna, Austria
- South African National Biodiversity Institute, National Zoological Garden, Pretoria, South Africa
- Julio Rozas
- Departament de Genètica, Microbiologia i Estadística & Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Spain
- Dmitri Petrov
- Department of Biology, Stanford University, California, USA
- Rosemary G Gillespie
- Berkeley Evolab, Department of Environmental Science, Policy, and Management, UC Berkeley, California, USA
26
Halgrain M, Bernardet N, Crepeau M, Même N, Narcy A, Hincke M, Réhault-Godbert S. Eggshell decalcification and skeletal mineralization during chicken embryonic development: defining candidate genes in the chorioallantoic membrane. Poult Sci 2021; 101:101622. [PMID: 34959155 PMCID: PMC8717587 DOI: 10.1016/j.psj.2021.101622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/15/2021] [Revised: 10/22/2021] [Accepted: 11/22/2021] [Indexed: 12/31/2022]
Abstract
During chicken embryonic development, skeleton calcification mainly relies on the eggshell, whose minerals are progressively solubilized and transported to the embryo via the chorioallantoic membrane (CAM). However, the molecular components involved in this process remain undefined. We assessed eggshell demineralization and calcification of the embryo skeleton after 12 and 16 d of incubation, and analyzed the expression of several candidate genes in the CAM: carbonic anhydrases that are likely involved in secretion of protons for eggshell dissolution (CA2, CA4, CA9), ion transporters and regulators (CALB1, SLC4A1, ATP6V1B2, SGK1, SCGN, PKD2) and vitamin-D binding protein (GC). Our results confirmed that eggshell weight, thickness, and strength decreased during incubation, with a concomitant increase in calcification of the embryonic skeletal system. In the CAM, the expression of CA2 increased during incubation while CA4 and CA9 were expressed at similar levels at both stages. SLC4A1 and SCGN were expressed, but not differentially, between the two stages, while the expression of the ATP6V1B2 and PKD2 genes decreased. The expression of SGK1 and TRPV6 increased over time, although the expression of the latter gene was barely detectable. In parallel, we analyzed the expression of these candidate genes in the yolk sac (YS), which mediates the transfer of yolk minerals to the embryo during the first half of incubation. In the YS, CA2 expression increases during incubation, similar to the CAM, while the expression of the other candidate genes decreases. Moreover, the CALB1 and GC genes were found to be expressed during incubation in the YS, in contrast to the CAM, where no expression of either was detected. This study demonstrates that the regulation of genes involved in the mobilization of egg minerals during embryonic development differs between the YS and CAM extraembryonic structures.
Identification of the full suite of molecular components involved in the transfer of eggshell calcium to the embryo via the CAM should help to better understand the role of this structure in bone mineralization.
Affiliation(s)
- Nathalie Même
- INRAE, Université de Tours, BOA, Nouzilly 37380, France
- Agnès Narcy
- INRAE, Université de Tours, BOA, Nouzilly 37380, France
- Maxwell Hincke
- Departments of Innovation in Medical Education and Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Canada; LE STUDIUM Research Consortium, Loire Valley Institute for Advanced Studies, Orléans-Tours, France
27
Caballero-López V, Lundberg M, Sokolovskis K, Bensch S. Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers (Phylloscopus trochilus). Mol Ecol 2021; 31:1128-1141. [PMID: 34837428 DOI: 10.1111/mec.16292] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/06/2021] [Revised: 10/26/2021] [Accepted: 11/16/2021] [Indexed: 11/30/2022]
Abstract
The genetic basis of bird migration has been the focus of several studies. Two willow warbler subspecies (Phylloscopus trochilus trochilus and Phylloscopus trochilus acredula) follow different migratory routes to wintering grounds in Africa. Their breeding populations overlap in contact areas or "migratory divides" located in central Scandinavia and in eastern Poland. Earlier analyses demonstrated that the genetic differences between these two migratory phenotypes are few and cluster on chromosomes 1 and 5. In addition, an amplified fragment length polymorphism-derived biallelic marker (known as WW2) presents steep clines across both migratory divides but failed to be mapped in the genome. Here, we characterize the WW2 marker and describe its two variants (WW2 ancestral and WW2 derived) as portions of long terminal repeat retrotransposons originating from an ancient infection by an endogenous retrovirus. We used quantitative polymerase chain reaction techniques to quantify copy numbers of the WW2 derived variant in the two subspecies and their hybrids. This, together with genome analyses revealed that WW2 derived variants are much more abundant in P. t. acredula and appear embedded in a large repeat-rich region (>12 Mbp), not associated with the divergent regions of chromosomes 1 or 5. However, it might interact with genetic elements controlling migration direction. Testing this hypothesis further will require knowing the exact location of this region, such as by obtaining more complete genome assemblies preferably in combination with techniques like fluorescence in situ hybridization applied to a willow warbler karyotype, and finally to investigate the copy number of this marker in hybrids with known migratory tracks.
Affiliation(s)
- Max Lundberg
- Department of Biology, Lund University, Lund, Sweden
28
Bravo GA, Schmitt CJ, Edwards SV. What Have We Learned from the First 500 Avian Genomes? Annual Review of Ecology, Evolution, and Systematics 2021. [DOI: 10.1146/annurev-ecolsys-012121-085928] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022]
Abstract
The increased capacity of DNA sequencing has significantly advanced our understanding of the phylogeny of birds and the proximate and ultimate mechanisms molding their genomic diversity. In less than a decade, the number of available avian reference genomes has increased to over 500—approximately 5% of bird diversity—placing birds in a privileged position to advance the fields of phylogenomics and comparative, functional, and population genomics. Whole-genome sequence data, as well as indels and rare genomic changes, are further resolving the avian tree of life. The accumulation of bird genomes, increasingly with long-read sequence data, greatly improves the resolution of genomic features such as germline-restricted chromosomes and the W chromosome, and is facilitating the comparative integration of genotypes and phenotypes. Community-based initiatives such as the Bird 10,000 Genomes Project and Vertebrate Genome Project are playing a fundamental role in amplifying and coalescing a vibrant international program in avian comparative genomics.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- C. Jonathan Schmitt
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- Scott V. Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
29
Ottenburghs J. The genic view of hybridization in the Anthropocene. Evol Appl 2021; 14:2342-2360. [PMID: 34745330 PMCID: PMC8549621 DOI: 10.1111/eva.13223] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 11/05/2020] [Revised: 03/10/2021] [Accepted: 03/10/2021] [Indexed: 12/24/2022]
Abstract
Human impact is noticeable around the globe, indicating that a new era might have begun: the Anthropocene. Continuing human activities, including land-use changes, introduction of non-native species and rapid climate change, are altering the distributions of countless species, often giving rise to human-mediated hybridization events. While the interbreeding of different populations or species can have detrimental effects, such as genetic extinction, it can be beneficial in terms of adaptive introgression or an increase in genetic diversity. In this paper, I first review the different mechanisms and outcomes of anthropogenic hybridization based on literature from the last five years (2016-2020). The most common mechanisms leading to the interbreeding of previously isolated taxa include habitat change (51% of the studies) and introduction of non-native species (34% intentional and 19% unintentional). These human-induced hybridization events most often result in introgression (80%). The high incidence of genetic exchange between the hybridizing taxa indicates that the application of a genic view of speciation (and introgression) can provide crucial insights on how to address hybridization events in the Anthropocene. This perspective considers the genome as a dynamic collection of genetic loci with distinct evolutionary histories, giving rise to a heterogenous genomic landscape in terms of genetic differentiation and introgression. First, understanding this genomic landscape can lead to a better selection of diagnostic genetic markers to characterize hybrid populations. Second, describing how introgression patterns vary across the genome can help to predict the likelihood of negative processes, such as demographic and genetic swamping, as well as positive outcomes, such as adaptive introgression. It is especially important to not only quantify how much genetic material introgressed, but also what has been exchanged. 
Third, comparing introgression patterns in pre-Anthropocene hybridization events with current human-induced cases might provide novel insights into the likelihood of genetic swamping or species collapse during an anthropogenic hybridization event. However, this comparative approach remains to be tested before it can be applied in practice. Finally, the genic view of introgression can be combined with conservation genomic studies to determine the legal status of hybrids and take appropriate measures to manage anthropogenic hybridization events. The interplay between evolutionary and conservation genomics will result in the constant exchange of ideas between these fields which will not only improve our knowledge on the origin of species, but also how to conserve and protect them.
Affiliation(s)
- Jente Ottenburghs
- Wildlife Ecology and Conservation, Wageningen University & Research, Wageningen, The Netherlands
- Forest Ecology and Forest Management, Wageningen University & Research, Wageningen, The Netherlands
30
Comparative Analysis of Transposable Elements in Genus Calliptamus Grasshoppers Revealed That Satellite DNA Contributes to Genome Size Variation. Insects 2021; 12:insects12090837. [PMID: 34564277 PMCID: PMC8466570 DOI: 10.3390/insects12090837] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/29/2021] [Revised: 09/01/2021] [Accepted: 09/14/2021] [Indexed: 12/15/2022]
Abstract
Simple Summary: Calliptamus is a genus of grasshoppers belonging to the family Acrididae. The genus includes approximately 17 recognized species. Calliptamus abbreviatus, Calliptamus italicus, and Calliptamus barbarus are three species that are widely found in northern China. These species are polyphagous, feeding on a variety of wild plants as well as crops, particularly legumes. The genome sizes, phylogenetic position, and transcriptomes of the genus Calliptamus were known prior to this research. What was missing was a repeatome analysis of these species, which bears directly on the large genome sizes of grasshoppers. Here, we classified repetitive DNA sequences at the level of superfamilies and sub-families, and found that LINE, TcMar-Tc1 and Ty3-gypsy LTR retrotransposons dominated the repeatomes of all genomes, accounting for 16–34% of the total genomes of these species. Dynamic evolutionary changes in satellite DNA in all three genomes played a role in genome size evolution. This study should be a valuable resource for future genome assemblies.
Abstract: Transposable elements (TEs) play a significant role in genome size evolution, structural change, duplication, and functional variability in both eukaryotes and prokaryotes. However, the large number of different repetitive DNA sequences has hindered the assembly of reference genomes, and genus-level TE diversification in the massive genomes of grasshoppers is still under investigation. The genus Calliptamus diverged from Peripolus around 17 mya and its species diverged about 8.5 mya, but their genome sizes show rather large differences. Here, we used low-coverage, unassembled Illumina short reads to investigate the effects of the evolutionary dynamics of satDNAs and TEs on genome size variation.
The RepeatExplorer2 analysis with 0.5X data identified 52%, 56%, and 55% of the genomes of Calliptamus barbarus, Calliptamus italicus, and Calliptamus abbreviatus, respectively, as repetitive elements. The LINE and Ty3-gypsy LTR retrotransposons and TcMar-Tc1 dominated the repeatomes of all genomes, accounting for 16–35% of the total genomes of these species. Comparative analysis revealed that most repetitive elements other than satDNAs were highly conserved across the three Calliptamus genomes. Out of a total of 20 satDNA families, 17 were shared among the three genomes with minor variations in abundance and divergence, and 3 were specific to Calliptamus barbarus. Our findings suggest that there is significant amplification or contraction of satDNAs within the genus, which is the main cause of the genome size differences.
31
Peona V, Palacios-Gimenez OM, Blommaert J, Liu J, Haryoko T, Jønsson KA, Irestedt M, Zhou Q, Jern P, Suh A. The avian W chromosome is a refugium for endogenous retroviruses with likely effects on female-biased mutational load and genetic incompatibilities. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200186. [PMID: 34304594 PMCID: PMC8310711 DOI: 10.1098/rstb.2020.0186] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 11/20/2020] [Indexed: 12/17/2022]
Abstract
It is a broadly observed pattern that the non-recombining regions of sex-limited chromosomes (Y and W) accumulate more repeats than the rest of the genome, even in species like birds with a low genome-wide repeat content. Here, we show that in birds with highly heteromorphic sex chromosomes, the W chromosome has a transposable element (TE) density of greater than 55% compared to the genome-wide density of less than 10%, and contains over half of all full-length (thus potentially active) endogenous retroviruses (ERVs) of the entire genome. Using RNA-seq and protein mass spectrometry data, we were able to detect signatures of female-specific ERV expression. We hypothesize that the avian W chromosome acts as a refugium for active ERVs, probably leading to female-biased mutational load that may influence female physiology similar to the 'toxic-Y' effect in Drosophila males. Furthermore, Haldane's rule predicts that the heterogametic sex has reduced fertility in hybrids. We propose that the excess of W-linked active ERVs over the rest of the genome may be an additional explanatory variable for Haldane's rule, with consequences for genetic incompatibilities between species through TE/repressor mismatches in hybrids. Together, our results suggest that the sequence content of female-specific W chromosomes can have effects far beyond sex determination and gene dosage. This article is part of the theme issue 'Challenging the paradigm in sex chromosome evolution: empirical and theoretical insights with a focus on vertebrates (Part II)'.
Affiliation(s)
- Valentina Peona
- Department of Organismal Biology—Systematic Biology, Uppsala University, Uppsala, Sweden
- Julie Blommaert
- Department of Organismal Biology—Systematic Biology, Uppsala University, Uppsala, Sweden
| | - Jing Liu
- MOE Laboratory of Biosystems Homeostasis and Protection, Life Sciences Institute, Zhejiang University, Hangzhou, People's Republic of China
- Department of Neuroscience and Development, University of Vienna, Vienna, Austria
| | - Tri Haryoko
- Museum Zoologicum Bogoriense, Research Centre for Biology, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
| | - Knud A. Jønsson
- Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
| | - Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Qi Zhou
- MOE Laboratory of Biosystems Homeostasis and Protection, Life Sciences Institute, Zhejiang University, Hangzhou, People's Republic of China
- Department of Neuroscience and Development, University of Vienna, Vienna, Austria
- Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310052, People's Republic of China
| | - Patric Jern
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Alexander Suh
- Department of Organismal Biology—Systematic Biology, Uppsala University, Uppsala, Sweden
- School of Biological Sciences—Organisms and the Environment, University of East Anglia, Norwich, UK
| |
Collapse
|
32
|
Abstract
Long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design. We provide an overview of current long-read sequencing platforms, variant calling methodologies and approaches for de novo assemblies and reference-based mapping approaches. Furthermore, we summarize strategies for variant validation, genotyping and predicting functional impact and emphasize challenges remaining in achieving long-read sequencing at a population scale.
Affiliation(s)
- Wouter De Coster
  - Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium
  - Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
|
33
|
Yamaguchi K, Kadota M, Nishimura O, Ohishi Y, Naito Y, Kuraku S. Technical considerations in Hi-C scaffolding and evaluation of chromosome-scale genome assemblies. Mol Ecol 2021; 30:5923-5934. [PMID: 34432923 PMCID: PMC9292758 DOI: 10.1111/mec.16146] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/14/2020] [Revised: 07/28/2021] [Accepted: 08/18/2021] [Indexed: 12/15/2022]
Abstract
The recent development of ecological studies has been fueled by the introduction of massive information based on chromosome‐scale genome sequences, even for species for which genetic linkage is not accessible. This was enabled mainly by the application of Hi‐C, a method for genome‐wide chromosome conformation capture that was originally developed for investigating the long‐range interaction of chromatins. Performing genomic scaffolding using Hi‐C data is highly resource‐demanding and employs elaborate laboratory steps for sample preparation. It starts with building a primary genome sequence assembly as an input, which is followed by computation for genome scaffolding using Hi‐C data, requiring careful validation. This article presents technical considerations for obtaining optimal Hi‐C scaffolding results and provides a test case of its application to a reptile species, the Madagascar ground gecko (Paroedura picta). Among the metrics that are frequently used for evaluating scaffolding results, we investigate the validity of the completeness assessment of chromosome‐scale genome assemblies using single‐copy reference orthologues.
Affiliation(s)
- Kazuaki Yamaguchi
  - Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Mitsutaka Kadota
  - Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Osamu Nishimura
  - Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Yuta Ohishi
  - Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Yuki Naito
  - Database Center for Life Science (DBCLS), Mishima, Japan
- Shigehiro Kuraku
  - Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
  - Molecular Life History Laboratory, National Institute of Genetics, Mishima, Japan
  - Department of Genetics, Sokendai (Graduate University for Advanced Studies), Mishima, Japan
|
34
|
Martí E, Milani D, Bardella VB, Albuquerque L, Song H, Palacios-Gimenez OM, Cabral-de-Mello DC. Cytogenomic analysis unveils mixed molecular evolution and recurrent chromosomal rearrangements shaping the multigene families on Schistocerca grasshopper genomes. Evolution 2021; 75:2027-2041. [PMID: 34155627 DOI: 10.1111/evo.14287] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/05/2021] [Revised: 05/11/2021] [Accepted: 05/26/2021] [Indexed: 11/26/2022]
Abstract
Multigene families are essential components of eukaryotic genomes and play key roles both structurally and functionally. Their modes of evolution remain elusive even in the era of genomics, because multiple multigene family sequences coexist in genomes, particularly in large repetitive genomes. Here, we investigate how the multigene families 18S rDNA, U2 snDNA, and H3 histone evolved in 10 species of Schistocerca grasshoppers with very large and repeat-enriched genomes. Using sequenced genomes and fluorescence in situ hybridization mapping, we find substantial differences between species, including the number of chromosomal clusters, changes in sequence abundance and nucleotide composition, pseudogenization, and association with transposable elements (TEs). The intragenomic analysis of Schistocerca gregaria using long-read sequencing and genome assembly unveils conservation for H3 histone and recurrent pseudogenization for 18S rDNA and U2 snDNA, likely promoted by association with TEs and sequence truncation. Remarkably, TEs were frequently associated with truncated copies, were also among the most abundant in the genome, and revealed signatures of recent activity. Our findings suggest a combined effect of concerted and birth-and-death models driving the evolution of multigene families in Schistocerca over the last 8 million years, and the occurrence of intra- and interchromosomal rearrangements shaping their chromosomal distribution. Despite the conserved karyotype in Schistocerca, our analysis highlights the extensive reorganization of repetitive DNAs in Schistocerca, contributing to the advance of comparative genomics for this important grasshopper genus.
Affiliation(s)
- Emiliano Martí
  - Departamento de Biologia Geral e Aplicada, UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Rio Claro, 13506-900, Brazil
- Diogo Milani
  - Departamento de Biologia Geral e Aplicada, UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Rio Claro, 13506-900, Brazil
- Vanessa B Bardella
  - Departamento de Biologia Geral e Aplicada, UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Rio Claro, 13506-900, Brazil
- Lucas Albuquerque
  - Departamento de Biologia Geral e Aplicada, UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Rio Claro, 13506-900, Brazil
- Hojun Song
  - Department of Entomology, Texas A&M University, College Station, Texas, 77843
- Octavio M Palacios-Gimenez
  - Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, SE-75236, Sweden
  - Population Ecology Group, Institute of Ecology and Evolution, Friedrich Schiller University Jena, Jena, DE-07743, Germany
- Diogo C Cabral-de-Mello
  - Departamento de Biologia Geral e Aplicada, UNESP - Univ Estadual Paulista, Instituto de Biociências/IB, Rio Claro, 13506-900, Brazil
|
35
|
Massively parallel sequencing and capillary electrophoresis of a novel panel of falcon STRs: Concordance with minisatellite DNA profiles from historical wildlife crime. Forensic Sci Int Genet 2021; 54:102550. [PMID: 34174583 PMCID: PMC8430417 DOI: 10.1016/j.fsigen.2021.102550] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/29/2021] [Revised: 06/03/2021] [Accepted: 06/04/2021] [Indexed: 12/12/2022]
Abstract
Birds of prey have suffered persecution for centuries through trapping, shooting, poisoning and theft from the wild to meet the demand from egg collectors and falconers; they were also amongst the earliest beneficiaries of DNA testing in wildlife forensics. Here we report the identification and characterisation of 14 novel tetramer, pentamer and hexamer short tandem repeat (STR) markers which can be typed either by capillary electrophoresis or massively parallel sequencing (MPS) and apply them to historical casework samples involving 49 peregrine falcons, 30 of which were claimed to be the captively bred offspring of nine pairs. The birds were initially tested in 1994 with a multilocus DNA fingerprinting probe, a sex test and eight single-locus minisatellite probes (SLPs) demonstrating that 23 birds were unrelated to the claimed parents. The multilocus and SLP approaches were highly discriminating but extremely time consuming and required microgram quantities of high molecular weight DNA and the use of radioisotopes. The STR markers displayed between 2 and 21 alleles per locus (mean = 7.6), lengths between 140 and 360 bp, and heterozygosities from 0.4 to 0.93. They produced wholly concordant conclusions with similar discrimination power but in a fraction of the time using a hundred-fold less DNA and with standard forensic equipment. Furthermore, eleven of these STRs were amplified in a single reaction and typed using MPS on the Illumina MiSeq platform revealing eight additional alleles (three with variant repeat structures and five solely due to flanking SNPs) across four loci. This approach gave a random match probability of < 1E-9, and a parental pair false inclusion probability of < 1E-5, with a further ten-fold reduction in the amount of DNA required (~3 ng) and the potential to analyse mixed samples. 
These STRs will be of value in monitoring wild populations of these key indicator species as well as for testing captive breeding claims and establishing a database of captive raptors. They have the potential to resolve complex cases involving trace, mixed and degraded samples from raptor persecution casework representing a significant advance over the previously applied methods.
|
36
|
Suh A, Dion-Côté AM. New Perspectives on the Evolution of Within-Individual Genome Variation and Germline/Soma Distinction. Genome Biol Evol 2021; 13:evab095. [PMID: 33963843 PMCID: PMC8245192 DOI: 10.1093/gbe/evab095] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 05/07/2021] [Indexed: 12/19/2022] Open
Abstract
Genomes can vary significantly even within the same individual. The underlying mechanisms are manifold, ranging from somatic mutation and recombination, development-associated ploidy changes and genetic bottlenecks, to programmed DNA elimination during germline/soma differentiation. In this perspective piece, we briefly review recent developments in the study of within-individual genome variation in eukaryotes and prokaryotes. We highlight a Society for Molecular Biology and Evolution 2020 virtual symposium entitled "Within-individual genome variation and germline/soma distinction" and the present Special Section of the same name in Genome Biology and Evolution, together fostering cross-taxon synergies in the field to identify and tackle key open questions in the understanding of within-individual genome variation.
Affiliation(s)
- Alexander Suh
  - School of Biological Sciences—Organisms and the Environment, University of East Anglia, Norwich, United Kingdom
  - Department of Organismal Biology—Systematic Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, Sweden
|
37
|
Vondrak T, Oliveira L, Novák P, Koblížková A, Neumann P, Macas J. Complex sequence organization of heterochromatin in the holocentric plant Cuscuta europaea elucidated by the computational analysis of nanopore reads. Comput Struct Biotechnol J 2021; 19:2179-2189. [PMID: 33995911 PMCID: PMC8091179 DOI: 10.1016/j.csbj.2021.04.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/12/2021] [Revised: 03/31/2021] [Accepted: 04/03/2021] [Indexed: 12/20/2022] Open
Abstract
Repeat-rich regions of higher plant genomes are usually associated with constitutive heterochromatin, a specific type of chromatin that forms tightly packed nuclear chromocenters and chromosome bands. There is a large body of cytogenetic evidence that these chromosome regions are often composed of tandemly organized satellite DNA. However, comparatively little is known about the sequence arrangement within heterochromatic regions, which are difficult to assemble due to their repeated nature. Here, we explore long-range sequence organization of heterochromatin regions containing the major satellite repeat CUS-TR24 in the holocentric plant Cuscuta europaea. Using a combination of ultra-long read sequencing with assembly-free sequence analysis, we reveal the complex structure of these loci, which are composed of short arrays of CUS-TR24 interrupted frequently by emerging simple sequence repeats and targeted insertions of a specific lineage of LINE retrotransposons. These data suggest that the organization of satellite repeats constituting heterochromatic chromosome bands can be more complex than previously envisioned, and demonstrate that heterochromatin organization can be efficiently investigated without the need for genome assembly.
Affiliation(s)
- Tihana Vondrak
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
  - University of South Bohemia, Faculty of Science, České Budějovice, Czech Republic
- Ludmila Oliveira
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
- Petr Novák
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
- Andrea Koblížková
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
- Pavel Neumann
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
- Jiří Macas
  - Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice CZ-37005, Czech Republic
|
38
|
Blom MPK. Opportunities and challenges for high-quality biodiversity tissue archives in the age of long-read sequencing. Mol Ecol 2021; 30:5935-5948. [PMID: 33786900 DOI: 10.1111/mec.15909] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/31/2020] [Revised: 03/06/2021] [Accepted: 03/22/2021] [Indexed: 12/11/2022]
Abstract
The technological ability to characterize genetic variation at a genome-wide scale provides an unprecedented opportunity to study the genetic underpinnings and evolutionary mechanisms that promote and sustain biodiversity. The transition from short- to long-read sequencing is particularly promising and allows a more holistic view on any changes in genetic diversity across time and space. Long-read sequencing has tremendous potential but sequencing success strongly depends on the long-range integrity of DNA molecules and therefore on the availability of high-quality tissue samples. With the scope of genomic experiments expanding and wild populations simultaneously disappearing at an unprecedented rate, access to high-quality samples may soon be a major concern for many projects. The need for high-quality biodiversity tissue archives is therefore urgent but sampling and preserving high-quality samples is not a trivial exercise. In this review, I will briefly outline how long-read sequencing can benefit the study of molecular ecology, how this will substantially increase the demand for high-quality tissues and why it is challenging to preserve DNA integrity. I will then provide an overview of preservation approaches and end with a call for support to acknowledge the efforts needed to assemble high-quality tissue archives. In doing so, I hope to simultaneously motivate field biologists to expand sampling practices and molecular biologists to develop (cost) efficient guidelines for the sampling and long-term storage of tissues. A concerted, interdisciplinary, effort is needed to catalogue the genetic variation underlying contemporary biodiversity and will eventually provide a critical resource for future studies.
Affiliation(s)
- Mozes P K Blom
  - Leibniz Institut für Evolutions- und Biodiversitätsforschung, Museum für Naturkunde, Berlin, Germany
|
39
|
Borůvková V, Howell WM, Matoulek D, Symonová R. Quantitative Approach to Fish Cytogenetics in the Context of Vertebrate Genome Evolution. Genes (Basel) 2021; 12:genes12020312. [PMID: 33671814 PMCID: PMC7926999 DOI: 10.3390/genes12020312] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/23/2020] [Revised: 02/01/2021] [Accepted: 02/17/2021] [Indexed: 01/14/2023] Open
Abstract
Our novel Python-based tool EVANGELIST allows the visualization of GC and repeat percentages along chromosomes in sequenced genomes and has enabled us to perform quantitative large-scale analyses at the chromosome level in fish and other vertebrates. This is a different approach from the prevailing analyses, i.e., analyses of GC% in the coding sequences, which make up no more than 2% in humans. We identified GC content (GC%) elevations in microchromosomes in ancient fish lineages, similar to avian microchromosomes, and a large variability in the relationship between chromosome size and GC% across fish lineages. This raises the question of to what extent chromosome size drives GC%, as posited by the currently accepted explanation based on the recombination rate. We ascribe the differences found across fishes to varying GC% of repetitive sequences. Generally, our results suggest that the GC% of repeats and the proportion of repeats are independent of chromosome size. This leaves open space for another mechanism driving GC evolution in vertebrates.
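The kind of profile this abstract describes, GC% along a chromosome, can be illustrated with a minimal sketch. This is not EVANGELIST's code: the function name `gc_percent_windows`, the non-overlapping window scheme, and the default window size are assumptions made purely for illustration.

```python
def gc_percent_windows(seq, window=1000):
    """Return (start, gc_percent) pairs for non-overlapping windows.

    Hypothetical helper: GC% is computed over unambiguous bases only,
    so runs of N do not dilute the value. Windows with no A/C/G/T
    bases yield None.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        acgt = sum(chunk.count(base) for base in "ACGT")
        if acgt == 0:
            profile.append((start, None))
            continue
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / acgt))
    return profile


if __name__ == "__main__":
    # Toy sequence: a GC-only window followed by an AT-only window.
    for start, gc in gc_percent_windows("GGCCAATT", window=4):
        print(start, gc)
```

Plotting the second element of each pair against the first would give a per-chromosome GC% track; a repeat percentage track could be built the same way from a repeat-masked sequence, which is likewise an assumption about one possible approach.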
Affiliation(s)
- Veronika Borůvková
  - Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- W. Mike Howell
  - Department of Biological and Environmental Sciences, Samford University, Birmingham, AL 35226, USA
- Dominik Matoulek
  - Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- Radka Symonová
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
|
40
|
Ottenburghs J, Geng K, Suh A, Kutter C. Genome Size Reduction and Transposon Activity Impact tRNA Gene Diversity While Ensuring Translational Stability in Birds. Genome Biol Evol 2021; 13:6127176. [PMID: 33533905 PMCID: PMC8044555 DOI: 10.1093/gbe/evab016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/22/2021] [Indexed: 12/12/2022] Open
Abstract
As a highly diverse vertebrate class, bird species have adapted to various ecological systems. How this phenotypic diversity can be explained genetically is intensively debated and is likely grounded in differences in the genome content. Larger and more complex genomes could allow for greater genetic regulation that results in more phenotypic variety. Surprisingly, avian genomes are much smaller than those of other vertebrates but contain as many protein-coding genes. This supports the notion that the phenotypic diversity is largely determined by selection on non-coding gene sequences. Transfer RNAs (tRNAs) represent a group of non-coding genes. However, the characteristics of tRNA genes across bird genomes have remained largely unexplored. Here, we exhaustively investigated the evolution and functional consequences of these crucial translational regulators within bird species and across vertebrates. Our dense sampling of 55 avian genomes representing each bird order revealed an average of 169 tRNA genes with at least 31% being actively used. Unlike other vertebrates, avian tRNA genes are reduced in number and complexity but are still in line with vertebrate wobble pairing strategies and mutation-driven codon usage. Our detailed phylogenetic analyses further uncovered that new tRNA genes can emerge through multiplication by transposable elements. Together, this study provides the first comprehensive avian and cross-vertebrate tRNA gene analyses and demonstrates that tRNA gene evolution is flexible albeit constrained within functional boundaries of general mechanisms in protein translation.
Affiliation(s)
- Jente Ottenburghs
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Sweden
- Keyi Geng
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
- Alexander Suh
  - Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Sweden
- Claudia Kutter
  - Department of Microbiology, Tumor and Cell Biology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
|
41
|
Whibley A, Kelley JL, Narum SR. The changing face of genome assemblies: Guidance on achieving high-quality reference genomes. Mol Ecol Resour 2021; 21:641-652. [PMID: 33326691 DOI: 10.1111/1755-0998.13312] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 09/17/2020] [Revised: 12/08/2020] [Accepted: 12/11/2020] [Indexed: 12/20/2022]
Abstract
The quality of genome assemblies has improved rapidly in recent years due to continual advances in sequencing technology, assembly approaches, and quality control. In the field of molecular ecology, this has led to the development of exceptional quality genome assemblies that will be important long-term resources for broader studies into ecological, conservation, evolutionary, and population genomics of naturally occurring species. Moreover, the extent to which a single reference genome represents the diversity within a species varies: pan-genomes will become increasingly important ecological genomics resources, particularly in systems found to have considerable presence-absence variation in their functional content. Here, we highlight advances in technology that have raised the bar for genome assembly and provide guidance on standards to achieve exceptional quality reference genomes. Key recommendations include the following: (a) Genome assemblies should include long-read sequencing except in rare cases where it is effectively impossible to acquire adequately preserved samples needed for high molecular weight DNA standards. (b) At least one scaffolding approach, such as Hi-C or optical mapping, should be included with genome assembly. (c) Genome assemblies should be carefully evaluated; this may involve utilising short-read data for genome polishing, error correction, k-mer analyses, and estimating the percentage of reads that map back to an assembly. Finally, a genome assembly is most valuable if all data and methods are made publicly available and the utility of a genome for further studies is verified through examples. While these recommendations are based on current technology, we anticipate that future advances will push the field further and the molecular ecology community should continue to adopt new approaches that attain the highest quality genome assemblies.
Affiliation(s)
- Shawn R Narum
  - University of Idaho, Moscow, ID, USA
  - Columbia River Inter-Tribal Fish Commission, Hagerman, ID, USA
|
42
|
Peona V, Blom MPK, Xu L, Burri R, Sullivan S, Bunikis I, Liachko I, Haryoko T, Jønsson KA, Zhou Q, Irestedt M, Suh A. Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise. Mol Ecol Resour 2021; 21:263-286. [PMID: 32937018 PMCID: PMC7757076 DOI: 10.1111/1755-0998.13252] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 04/14/2020] [Revised: 08/21/2020] [Accepted: 08/26/2020] [Indexed: 01/09/2023]
Abstract
Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies now enable assembling genomes at unprecedented quality and contiguity. However, the difficulty in assembling repeat-rich and GC-rich regions (genomic "dark matter") limits insights into the evolution of genome structure and regulatory networks. Here, we compare the efficiency of currently available sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter. By adopting different de novo assembly strategies, we compare individual draft assemblies to a curated multiplatform reference assembly and identify the genomic features that cause gaps within each assembly. We show that a multiplatform assembly implementing long-read, linked-read and proximity sequencing technologies performs best at recovering transposable elements, multicopy MHC genes, GC-rich microchromosomes and the repeat-rich W chromosome. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is now possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects for optimized completeness of both the coding and noncoding parts of nonmodel genomes.
Affiliation(s)
- Valentina Peona
  - Department of Ecology and Genetics—Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - Department of Organismal Biology—Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- Mozes P. K. Blom
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
  - Museum für Naturkunde, Leibniz Institut für Evolutions‐ und Biodiversitätsforschung, Berlin, Germany
- Luohao Xu
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- Reto Burri
  - Department of Population Ecology, Institute of Ecology and Evolution, Friedrich‐Schiller‐University Jena, Jena, Germany
- Ignas Bunikis
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala Genome Center, Uppsala University, Uppsala, Sweden
- Tri Haryoko
  - Research Centre for Biology, Museum Zoologicum Bogoriense, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
- Knud A. Jønsson
  - Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- Qi Zhou
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
  - MOE Laboratory of Biosystems Homeostasis & Protection, Life Sciences Institute, Zhejiang University, Hangzhou, China
  - Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Martin Irestedt
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Alexander Suh
  - Department of Ecology and Genetics—Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - Department of Organismal Biology—Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
  - School of Biological Sciences—Organisms and the Environment, University of East Anglia, Norwich, UK
|
43
|
Maiwald S, Weber B, Seibt KM, Schmidt T, Heitkam T. The Cassandra retrotransposon landscape in sugar beet (Beta vulgaris) and related Amaranthaceae: recombination and re-shuffling lead to a high structural variability. Ann Bot 2021; 127:91-109. [PMID: 33009553 PMCID: PMC7750724 DOI: 10.1093/aob/mcaa176] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/28/2020] [Accepted: 09/28/2020] [Indexed: 05/26/2023]
Abstract
BACKGROUND AND AIMS: Plant genomes contain many retrotransposons and their derivatives, which are subject to rapid sequence turnover. As non-autonomous retrotransposons do not encode any proteins, they experience reduced selective constraints leading to their diversification into multiple families, usually limited to a few closely related species. In contrast, the non-coding Cassandra terminal repeat retrotransposons in miniature (TRIMs) are widespread in many plants. Their hallmark is a conserved 5S rDNA-derived promoter in their long terminal repeats (LTRs). As sugar beet (Beta vulgaris) has a well-described LTR retrotransposon landscape, we aim to characterize TRIMs in beet and related genomes.
METHODS: We identified Cassandra retrotransposons in the sugar beet reference genome and characterized their structural relationships. Genomic organization, chromosomal localization, and distribution of Cassandra-TRIMs across the Amaranthaceae were verified by Southern and fluorescent in situ hybridization.
KEY RESULTS: All 638 Cassandra sequences in the sugar beet genome contain conserved LTRs and thus constitute a single family. Nevertheless, variable internal regions required a subdivision into two Cassandra subfamilies within B. vulgaris. The related Chenopodium quinoa harbours a third subfamily. These subfamilies vary in their distribution within Amaranthaceae genomes, their insertion times and the degree of silencing by small RNAs. Cassandra retrotransposons gave rise to many structural variants, such as solo LTRs or tandemly arranged Cassandra retrotransposons. These Cassandra derivatives point to an interplay of template switch and recombination processes - mechanisms that likely caused Cassandra's subfamily formation and diversification.
CONCLUSIONS: We traced the evolution of Cassandra in the Amaranthaceae and detected a considerable variability within the short internal regions, whereas the LTRs are strongly conserved in sequence and length. Presumably these hallmarks make Cassandra a prime target for unequal recombination, resulting in the observed structural diversity, an example of the impact of LTR-mediated evolutionary mechanisms on the host genome.
Affiliation(s)
- Sophie Maiwald
- Institute of Botany, Technische Universität Dresden, Dresden, Germany
| | - Beatrice Weber
- Institute of Botany, Technische Universität Dresden, Dresden, Germany
| | - Kathrin M Seibt
- Institute of Botany, Technische Universität Dresden, Dresden, Germany
| | - Thomas Schmidt
- Institute of Botany, Technische Universität Dresden, Dresden, Germany
| | - Tony Heitkam
- Institute of Botany, Technische Universität Dresden, Dresden, Germany
|
44
|
Cechova M. Probably Correct: Rescuing Repeats with Short and Long Reads. Genes (Basel) 2020; 12:48. [PMID: 33396198 PMCID: PMC7823596 DOI: 10.3390/genes12010048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/09/2020] [Revised: 12/23/2020] [Accepted: 12/24/2020] [Indexed: 02/07/2023] Open
Abstract
Ever since the introduction of high-throughput sequencing following the human genome project, assembling short reads into a reference of sufficient quality has posed a significant problem, as a large portion of the human genome (an estimated 50-69%) is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters for whether or not a read is multi-mapping are the read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from "telomere to telomere". Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding in the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, disorienting the aligners and assemblers both in accuracy and speed. Here, I review the proposed and implemented solutions to the repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect the shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
Affiliation(s)
- Monika Cechova
- Genetics and Reproductive Biotechnologies, Veterinary Research Institute, Central European Institute of Technology (CEITEC), 621 00 Brno, Czech Republic
|
45
|
Palacios-Gimenez OM, Koelman J, Palmada-Flores M, Bradford TM, Jones KK, Cooper SJB, Kawakami T, Suh A. Comparative analysis of morabine grasshopper genomes reveals highly abundant transposable elements and rapidly proliferating satellite DNA repeats. BMC Biol 2020; 18:199. [PMID: 33349252 PMCID: PMC7754599 DOI: 10.1186/s12915-020-00925-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/22/2020] [Accepted: 11/10/2020] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Repetitive DNA sequences, including transposable elements (TEs) and tandemly repeated satellite DNA (satDNAs), collectively called the "repeatome", are found in high proportion in organisms across the Tree of Life. Grasshoppers have large genomes, averaging 9 Gb, that contain a high proportion of repetitive DNA, which has hampered progress in assembling reference genomes. Here we combined linked-read genomics with transcriptomics to assemble, characterize, and compare the structure of repetitive DNA sequences in four chromosomal races of the morabine grasshopper Vandiemenella viatica species complex and determine their contribution to genome evolution. RESULTS We obtained linked-read genome assemblies of 2.73-3.27 Gb from estimated genome sizes of 4.26-5.07 Gb DNA per haploid genome of the four chromosomal races of V. viatica. These constitute the third largest insect genomes assembled so far. Combining complementary annotation tools and manual curation, we found a large diversity of TEs and satDNAs, constituting 66 to 75% per genome assembly. A comparison of sequence divergence within the TE classes revealed massive accumulation of recent TEs in all four races (314-463 Mb per assembly), indicating that their large genome sizes are likely due to similar rates of TE accumulation. Transcriptome sequencing showed more biased TE expression in reproductive tissues than somatic tissues, implying permissive transcription in gametogenesis. Out of 129 satDNA families, 102 satDNA families were shared among the four chromosomal races, which likely represent a diversity of satDNA families in the ancestor of the V. viatica chromosomal races. Notably, 50 of these shared satDNA families underwent differential proliferation since the recent diversification of the V. viatica species complex. CONCLUSION This in-depth annotation of the repeatome in morabine grasshoppers provided new insights into the genome evolution of Orthoptera. 
Our TE analysis revealed a massive recent accumulation of TEs equivalent to the size of entire Drosophila genomes, which likely explains the large genome sizes in grasshoppers. Despite an overall high similarity of the TE and satDNA diversity between races, the patterns of TE expression and satDNA proliferation suggest rapid evolution of grasshopper genomes on recent timescales.
Affiliation(s)
- Octavio M Palacios-Gimenez
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
| | - Julia Koelman
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden
| | - Marc Palmada-Flores
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden
| | - Tessa M Bradford
- Evolutionary Biology Unit, South Australian Museum, Adelaide, SA, 5000, Australia
- School of Biological Sciences and Australian Centre for Evolutionary Biology and Biodiversity, The University of Adelaide, Adelaide, SA, 5005, Australia
| | - Karl K Jones
- Evolutionary Biology Unit, South Australian Museum, Adelaide, SA, 5000, Australia
| | - Steven J B Cooper
- Evolutionary Biology Unit, South Australian Museum, Adelaide, SA, 5000, Australia
- School of Biological Sciences and Australian Centre for Evolutionary Biology and Biodiversity, The University of Adelaide, Adelaide, SA, 5005, Australia
| | - Takeshi Kawakami
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
- Embark Veterinary, Inc., Boston, MA, USA.
| | - Alexander Suh
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK.
|
46
|
Borges dos Santos L, Paulo Gomes Viana J, José Biasotto Francischini F, Victoria Fogliata S, L. Joyce A, Pereira de Souza A, Gabriela Murúa M, J. Clough S, Imaculada Zucchi M. A first draft genome of the Sugarcane borer, Diatraea saccharalis. F1000Res 2020. [DOI: 10.12688/f1000research.26614.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022] Open
Abstract
Background: The sugarcane borer (Diatraea saccharalis), a widely distributed moth throughout the Americas, is a pest that affects economically important crops such as sugarcane, sorghum, wheat, maize and rice. Given its significant impact on yield reduction, whole-genome information of the species is needed. Here, we report the first draft assembly of the D. saccharalis genome. Methods: The genomic sequences were obtained using the Illumina HiSeq 2500 whole-genome sequencing of a single adult male specimen. We assembled the short-reads using the SPAdes software and predicted protein-coding genes using MAKER. Genome assembly completeness was assessed through BUSCO and the repetitive content by RepeatMasker. Results: The 453 Mb assembled sequences contain 1,445 BUSCO gene orthologs and 1,161 predicted gene models identified based on homology evidence to the domestic silk moth, Bombyx mori. The repeat content composes 41.18% of the genomic sequences which is in the range of other lepidopteran species. Conclusions: Functional annotation reveals that predicted gene models are involved in important cellular mechanisms such as metabolic pathways and protein synthesis. Thus, the data generated in this study expands our knowledge on the genomic characteristics of this devastating pest and provides essential resources for future genetic studies of the species.
|
47
|
Blommaert J. Genome size evolution: towards new model systems for old questions. Proc Biol Sci 2020; 287:20201441. [PMID: 32842932 PMCID: PMC7482279 DOI: 10.1098/rspb.2020.1441] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 06/18/2020] [Accepted: 07/29/2020] [Indexed: 12/20/2022] Open
Abstract
Genome size (GS) variation is a fundamental biological characteristic; however, its evolutionary causes and consequences are the topic of ongoing debate. Whether GS is a neutral trait or one subject to selective pressures, and how strong these selective pressures are, may remain open questions. Fundamentally, the genomic sequences responsible for this variation directly impact the potential evolutionary outcomes and, equally, are the targets of different evolutionary pressures. For example, duplications and deletions of genic regions (large or small) can have immediate and drastic phenotypic effects, while an expansion or contraction of non-coding DNA is less likely to cause catastrophic phenotypic effects. However, in the long term, the accumulation or deletion of ncDNA is likely to have larger effects. Modern sequencing technologies are allowing for the dissection of these proximate causes, but a combination of these new technologies with more traditional evolutionary experiments and approaches could revolutionize this debate and potentially resolve many of these arguments. Here, I discuss an ambitious way forward for GS research, putting it in context of historical debates, theories and sometimes contradictory evidence, and highlighting the promise of combining new sequencing technologies and analytical developments with more traditional experimental evolution approaches.
Affiliation(s)
- Julie Blommaert
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
|
48
|
Weissensteiner MH, Bunikis I, Catalán A, Francoijs KJ, Knief U, Heim W, Peona V, Pophaly SD, Sedlazeck FJ, Suh A, Warmuth VM, Wolf JBW. Discovery and population genomics of structural variation in a songbird genus. Nat Commun 2020; 11:3403. [PMID: 32636372 PMCID: PMC7341801 DOI: 10.1038/s41467-020-17195-4] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 12/06/2019] [Accepted: 06/16/2020] [Indexed: 02/07/2023] Open
Abstract
Structural variation (SV) constitutes an important type of genetic mutation, providing the raw material for evolution. Here, we uncover the genome-wide spectrum of intra- and interspecific SV segregating in natural populations of seven songbird species in the genus Corvus. Combining short-read (N = 127) and long-read re-sequencing (N = 31), as well as optical mapping (N = 16), we apply both assembly- and read-mapping approaches to detect SV and characterize a total of 220,452 insertions, deletions and inversions. We exploit sampling across wide phylogenetic timescales to validate SV genotypes and assess the contribution of SV to evolutionary processes in an avian model of incipient speciation. We reveal an evolutionarily young (~530,000 years) cis-acting 2.25-kb LTR retrotransposon insertion reducing expression of the NDP gene with consequences for premating isolation. Our results attest to the wealth and evolutionary significance of SV segregating in natural populations and highlight the need for reliable SV genotyping.
Affiliation(s)
- Matthias H Weissensteiner
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden.
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.
- Department of Biology, Pennsylvania State University, 310 Wartik Lab, University Park, PA, 16802, USA.
| | - Ignas Bunikis
- Uppsala Genome Center, Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, 752 37, Uppsala, Sweden
| | - Ana Catalán
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Ulrich Knief
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Wieland Heim
- Institute of Landscape Ecology, University of Münster, Heisenbergstrasse 2, 48149, Münster, Germany
| | - Valentina Peona
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden
- Department of Organismal Biology - Systematic Biology, Uppsala University, 752 36, Uppsala, Sweden
| | - Saurabh D Pophaly
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center at Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
| | - Alexander Suh
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden
- Department of Organismal Biology - Systematic Biology, Uppsala University, 752 36, Uppsala, Sweden
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK
| | - Vera M Warmuth
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
| | - Jochen B W Wolf
- Department of Evolutionary Biology and Science for Life Laboratory, Uppsala University, 752 36, Uppsala, Sweden.
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany.
|
49
|
Fietz K, Trofimenko E, Guerin PE, Arnal V, Torres-Oliva M, Lobréaux S, Pérez-Ruzafa A, Manel S, Puebla O. New genomic resources for three exploited Mediterranean fishes. Genomics 2020; 112:4297-4303. [PMID: 32629099 DOI: 10.1016/j.ygeno.2020.06.041] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/23/2020] [Revised: 06/22/2020] [Accepted: 06/24/2020] [Indexed: 10/23/2022]
Abstract
Extensive fishing has led to fish stock declines throughout the last decades. While clear stock identification is required for designing management schemes, stock delineation is problematic due to generally low levels of genetic structure in marine species. The development of genomic resources can help to solve this issue. Here, we present the first mitochondrial and nuclear draft genome assemblies of three economically important Mediterranean fishes, the white seabream, the striped red mullet, and the comber. The assemblies are between 613 and 785 Mbp long and contain between 27,222 and 32,375 predicted genes. They were used as references to map Restriction-site Associated DNA markers, which were developed with a single-digest approach. This approach provided between 15,710 and 21,101 Single Nucleotide Polymorphism markers per species. These genomic resources will allow uncovering subtle genetic structure, identifying stocks, assigning catches to populations and assessing connectivity. Furthermore, the annotated genomes will help to characterize adaptive divergence.
Affiliation(s)
- Katharina Fietz
- GEOMAR Helmholtz Centre for Ocean Research Kiel, Evolutionary Ecology of Marine Fishes, Düsternbrooker Weg 20, 24105 Kiel, Germany
| | - Elena Trofimenko
- GEOMAR Helmholtz Centre for Ocean Research Kiel, Evolutionary Ecology of Marine Fishes, Düsternbrooker Weg 20, 24105 Kiel, Germany
| | - Pierre-Edouard Guerin
- CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Univ Paul Valéry Montpellier 3, Montpellier, France
| | - Véronique Arnal
- CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Univ Paul Valéry Montpellier 3, Montpellier, France
| | - Montserrat Torres-Oliva
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, University Hospital Schleswig-Holstein, Kiel, Germany
| | - Stéphane Lobréaux
- Laboratoire d'Ecologie Alpine, CNRS, Université Grenoble-Alpes, Grenoble, France
| | - Angel Pérez-Ruzafa
- Departamento de Ecología e Hidrología, Facultad de Biología, Campus de Espinardo, Regional Campus of International Excellence "Campus Mare Nostrum", University of Murcia, 30100 Murcia, Spain
| | - Stéphanie Manel
- CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Univ Paul Valéry Montpellier 3, Montpellier, France.
| | - Oscar Puebla
- GEOMAR Helmholtz Centre for Ocean Research Kiel, Evolutionary Ecology of Marine Fishes, Düsternbrooker Weg 20, 24105 Kiel, Germany; Leibniz Centre for Tropical Marine Research, Fahrenheitstrasse 6, 28359 Bremen, Germany
|
50
|
Heitkam T, Weber B, Walter I, Liedtke S, Ost C, Schmidt T. Satellite DNA landscapes after allotetraploidization of quinoa (Chenopodium quinoa) reveal unique A and B subgenomes. Plant J 2020; 103:32-52. [PMID: 31981259 DOI: 10.1111/tpj.14705] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/04/2019] [Revised: 12/10/2019] [Accepted: 01/17/2020] [Indexed: 06/10/2023]
Abstract
If two related plant species hybridize, their genomes may be combined and duplicated within a single nucleus, thereby forming an allotetraploid. How the emerging plant balances two co-evolved genomes is still a matter of ongoing research. Here, we focus on satellite DNA (satDNA), the fastest turn-over sequence class in eukaryotes, aiming to trace its emergence, amplification, and loss during plant speciation and allopolyploidization. As a model, we used Chenopodium quinoa Willd. (quinoa), an allopolyploid crop with 2n = 4x = 36 chromosomes. Quinoa originated by hybridization of an unknown female American Chenopodium diploid (AA genome) with an unknown male Old World diploid species (BB genome), dating back 3.3-6.3 million years. Applying short read clustering to quinoa (AABB), C. pallidicaule (AA), and C. suecicum (BB) whole genome shotgun sequences, we classified their repetitive fractions, and identified and characterized seven satDNA families, together with the 5S rDNA model repeat. We show unequal satDNA amplification (two families) and exclusive occurrence (four families) in the AA and BB diploids by read mapping as well as Southern, genomic, and fluorescent in situ hybridization. Whereas the satDNA distributions support C. suecicum as possible parental species, we were able to exclude C. pallidicaule as progenitor due to unique repeat profiles. Using quinoa long reads and scaffolds, we detected only limited evidence of intergenomic homogenization of satDNA after allopolyploidization, but were able to exclude dispersal of 5S rRNA genes between subgenomes. Our results exemplify the complex route of tandem repeat evolution through Chenopodium speciation and allopolyploidization, and may provide sequence targets for the identification of quinoa's progenitors.
Affiliation(s)
- Tony Heitkam
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
| | - Beatrice Weber
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
| | - Ines Walter
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
| | - Susan Liedtke
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
| | - Charlotte Ost
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
- Institute of Biology, Martin-Luther-Universität Halle-Wittenberg, 06120, Halle (Saale), Germany
| | - Thomas Schmidt
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
|