1
Madrigal G, Minhas BF, Catchen J. Klumpy: A tool to evaluate the integrity of long-read genome assemblies and illusive sequence motifs. Mol Ecol Resour 2024:e13982. [PMID: 38800997 DOI: 10.1111/1755-0998.13982] [Received: 03/28/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024]
Abstract
The improvement and decreasing costs of third-generation sequencing technologies have widened the scope of biological questions researchers can address with de novo genome assemblies. With the increasing number of reference genomes, validating their integrity with minimal overhead is vital for establishing confident results in their applications. Here, we present Klumpy, a tool for detecting and visualizing both misassembled regions in a genome assembly and genetic elements (e.g. genes) of interest in a set of sequences. By leveraging the initial raw reads in combination with their respective genome assembly, we illustrate Klumpy's utility by investigating antifreeze glycoprotein (afgp) loci across two icefishes, by searching for a reported absent gene in the northern snakehead fish, and by scanning the reference genomes of a mudskipper and bumblebee for misassembled regions. In the two former cases, we were able to provide support for the noncanonical placement of an afgp locus in the icefishes and locate the missing snakehead gene. Furthermore, our genome scans were able to identify an unmappable locus in the mudskipper reference genome and identify a putative repetitive element shared among several species of bees.
Affiliation(s)
- Giovanni Madrigal
  - Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Bushra Fazal Minhas
  - Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Julian Catchen
  - Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
  - Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA

2
Ferguson S, Jones A, Murray K, Andrew R, Schwessinger B, Borevitz J. Plant genome evolution in the genus Eucalyptus is driven by structural rearrangements that promote sequence divergence. Genome Res 2024; 34:606-619. [PMID: 38589251 PMCID: PMC11146599 DOI: 10.1101/gr.277999.123] [Received: 04/19/2023] [Accepted: 03/22/2024] [Indexed: 04/10/2024]
Abstract
Genomes have a highly organized architecture (nonrandom organization of functional and nonfunctional genetic elements within chromosomes) that is essential for many biological functions, particularly gene expression and reproduction. Despite the need to conserve genome architecture, a high level of structural variation has been observed within species. As species separate and diverge, genome architecture also diverges, becoming increasingly poorly conserved as divergence time increases. However, within plant genomes, the processes of genome architecture divergence are not well described. Here we use long-read sequencing and de novo assembly of 33 phylogenetically diverse, wild and naturally evolving Eucalyptus species, covering 1-50 million years of diverging genome evolution to measure genome architectural conservation and describe architectural divergence. The investigation of these genomes revealed that following lineage divergence, genome architecture is highly fragmented by rearrangements. As genomes continue to diverge, the accumulation of mutations and the subsequent divergence beyond recognition of rearrangements become the primary driver of genome divergence. The loss of syntenic regions also contributes to genome divergence but at a slower pace than that of rearrangements. We hypothesize that duplications and translocations are potentially the greatest contributors to Eucalyptus genome divergence.
Affiliation(s)
- Scott Ferguson
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2601, Australia
- Ashley Jones
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2601, Australia
- Kevin Murray
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2601, Australia
  - Weigel Department, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Rose Andrew
  - Botany & N.C.W. Beadle Herbarium, School of Environmental and Rural Science, University of New England, Armidale, New South Wales 2351, Australia
- Benjamin Schwessinger
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2601, Australia
- Justin Borevitz
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2601, Australia

3
Fukasawa Y, Driguez P, Bougouffa S, Carty K, Putra A, Cheung MS, Ermini L. Plasticity of repetitive sequences demonstrated by the complete mitochondrial genome of Eucalyptus camaldulensis. Front Plant Sci 2024; 15:1339594. [PMID: 38601302 PMCID: PMC11005031 DOI: 10.3389/fpls.2024.1339594] [Received: 11/16/2023] [Accepted: 03/07/2024] [Indexed: 04/12/2024]
Abstract
The tree Eucalyptus camaldulensis is a ubiquitous member of the Eucalyptus genus, which includes several hundred species. Despite the extensive sequencing and assembly of nuclear genomes from various eucalypts, the genus has only one fully annotated and complete mitochondrial genome (mitogenome). Plant mitochondria are characterized by dynamic genomic rearrangements, facilitated by repeat content, a feature that has hindered the assembly of plant mitogenomes. This complexity is evident in the paucity of available mitogenomes. This study, to the best of our knowledge, presents the first E. camaldulensis mitogenome. Our findings suggest the presence of multiple isomeric forms of the E. camaldulensis mitogenome and provide novel insights into minor rearrangements triggered by nested repeat sequences. A comparative sequence analysis of the E. camaldulensis and E. grandis mitogenomes unveils evolutionary changes between the two genomes. A significant divergence is the evolution of a large repeat sequence, which may have contributed to the differences observed between the two genomes. The largest repeat sequences in the E. camaldulensis mitogenome align well with significant yet unexplained structural variations in the E. grandis mitogenome, highlighting the adaptability of repeat sequences in plant mitogenomes.
Affiliation(s)
- Yoshinori Fukasawa
  - Center for Bioscience Research and Education, Utsunomiya University, Utsunomiya, Japan
- Patrick Driguez
  - Core Labs, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Salim Bougouffa
  - Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Karen Carty
  - Core Labs, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Alexander Putra
  - Core Labs, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Ming-Sin Cheung
  - Core Labs, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Luca Ermini
  - NORLUX NeuroOncology Laboratory, Department of Cancer Research, Luxembourg Institute of Health, Luxembourg, Luxembourg

4
Kang JN, Lee SM, Choi JW, Lee SS, Kim CK. First Contiguous Genome Assembly of Japanese Lady Bell (Adenophora triphylla) and Insights into Development of Different Leaf Types. Genes (Basel) 2023; 15:58. [PMID: 38254948 PMCID: PMC10815912 DOI: 10.3390/genes15010058] [Received: 11/15/2023] [Revised: 12/26/2023] [Accepted: 12/27/2023] [Indexed: 01/24/2024]
Abstract
Adenophora triphylla is an important medicinal and food plant found in East Asia. This plant is rich in secondary metabolites such as triterpenoid saponin, and its leaves can develop into different types, such as round and linear, depending on the origin of germination even within the same species. Despite this, few studies have comprehensively characterized the development processes of different leaf types and triterpenoid saponin pathways in this plant. Herein, we provide the first report of a high-quality genome assembly of A. triphylla based on a combination of Oxford Nanopore Technologies and Illumina sequencing methods. Its genome size was estimated to be 2.6 Gb, and the assembled genome finalized as 2.48 Gb, containing 57,729 protein-coding genes. Genome completeness was assessed as 95.6% using the Benchmarking Universal Single-Copy Orthologs score. The evolutionary divergence of A. triphylla was investigated using the genomes of five plant species, including two other species in the Campanulaceae family. The species A. triphylla diverged approximately 51-118 million years ago from the other four plants, and 579 expanded/contracted gene families were clustered in the Gene Ontology terms. The expansion of the β-amyrin synthase (bAS) gene, a key enzyme in the triterpenoid saponin pathway, was identified in the A. triphylla genome. Furthermore, transcriptome analysis of the two leaf types revealed differences in the activity of starch, sucrose, unsaturated fatty acid pathways, and oxidoreductase enzymes. The heat and endoplasmic reticulum pathways related to plant stress were active in the development of round type leaf, while an enhancement of pyrimidine metabolism related to cell development was confirmed in the development of the linear type leaf. This study provides insight into the evolution of bAS genes and the development of different leaf types in A. triphylla.
Affiliation(s)
- Ji-Nam Kang
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju 54874, Republic of Korea
- Si-Myung Lee
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju 54874, Republic of Korea
- Ji-Weon Choi
  - Postharvest Technology Division, National Institute of Horticultural and Herbal Science, Wanju 55365, Republic of Korea
- Seung-Sik Lee
  - Advanced Radiation Technology Institute, Korea Atomic Energy Research Institute, Jeongeup 56212, Republic of Korea
  - Department of Radiation Science, University of Science and Technology, Daejeon 34113, Republic of Korea
- Chang-Kug Kim
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju 54874, Republic of Korea

5
Koo H, Lee GW, Ko SR, Go S, Kwon SY, Kim YM, Shin AY. Two long read-based genome assembly and annotation of polyploidy woody plants, Hibiscus syriacus L. using PacBio and Nanopore platforms. Sci Data 2023; 10:713. [PMID: 37853021 PMCID: PMC10584963 DOI: 10.1038/s41597-023-02631-z] [Received: 06/16/2023] [Accepted: 10/11/2023] [Indexed: 10/20/2023]
Abstract
Improvements in long-read DNA sequencing and related techniques have facilitated the generation of complex eukaryotic genomes. Despite these advances, the quality of constructed plant reference genomes remains relatively poor due to the large size of genomes, high content of repetitive sequences, and wide variety of ploidy. Here, we developed the de novo sequencing and assembly of the genome of a highly polyploid plant, Hibiscus syriacus, a flowering plant species of the Malvaceae family, using the Oxford Nanopore Technologies and Pacific Biosciences Sequel sequencing platforms. We investigated an efficient combination of a high-quality, high-molecular-weight DNA isolation procedure and a suitable assembler to achieve optimal results using long-read sequencing data. We found that abundant ultra-long reads allow for large and complex polyploid plant genome assemblies with great recovery of repetitive sequences and error correction, even with relatively low-depth Nanopore sequencing data and polishing compared to previous studies. Collectively, our combination provides cost-effective methods to improve genome continuity and quality compared to the previously reported reference genome by accessing highly repetitive regions. The application of this combination may enable genetic research and breeding of polyploid crops, thus leading to improvements in crop production.
Affiliation(s)
- Hyunjin Koo
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
- Gir-Won Lee
  - SML Genetree Co. Ltd., Seoul, 05855, Republic of Korea
- Seo-Rin Ko
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
  - Biosystems and Bioengineering Program, University of Science and Technology, Daejeon, 34113, Korea
- Sangjin Go
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
  - Biosystems and Bioengineering Program, University of Science and Technology, Daejeon, 34113, Korea
- Suk-Yoon Kwon
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
  - Biosystems and Bioengineering Program, University of Science and Technology, Daejeon, 34113, Korea
- Yong-Min Kim
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
  - Department of Bioinformatics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34141, Republic of Korea
  - Digital Biotech Innovation Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
- Ah-Young Shin
  - Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
  - Department of Bioinformatics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34141, Republic of Korea

6
Mochizuki T, Sakamoto M, Tanizawa Y, Nakayama T, Tanifuji G, Kamikawa R, Nakamura Y. A practical assembly guideline for genomes with various levels of heterozygosity. Brief Bioinform 2023; 24:bbad337. [PMID: 37798248 PMCID: PMC10555665 DOI: 10.1093/bib/bbad337] [Received: 05/17/2023] [Revised: 08/06/2023] [Accepted: 09/03/2023] [Indexed: 10/07/2023]
Abstract
Although current long-read sequencing technologies have a long-read length that facilitates assembly for genome reconstruction, they have high sequence errors. While various assemblers with different perspectives have been developed, no systematic evaluation of assemblers with long reads for diploid genomes with varying heterozygosity has been performed. Here, we evaluated a series of processes, including the estimation of genome characteristics such as genome size and heterozygosity, de novo assembly, polishing, and removal of allelic contigs, using six genomes with various heterozygosity levels. We evaluated five long-read-only assemblers (Canu, Flye, miniasm, NextDenovo and Redbean) and five hybrid assemblers that combine short and long reads (HASLR, MaSuRCA, Platanus-allee, SPAdes and WENGAN) and proposed a concrete guideline for the construction of haplotype representation according to the degree of heterozygosity, followed by polishing and purging haplotigs, using stable and high-performance assemblers: Redbean, Flye and MaSuRCA.
Affiliation(s)
- Mika Sakamoto
  - Genome Informatics Laboratory, National Institute of Genetics
- Takuro Nakayama
  - Division of Life Sciences Center for Computational Sciences, University of Tsukuba, Japan
- Goro Tanifuji
  - Department of Zoology, National Museum of Nature and Science

7
Roberts MB, Schultz DT, Gatins R, Escalona M, Bernardi G. Chromosome-level genome of the three-spot damselfish, Dascyllus trimaculatus. G3 (Bethesda) 2023; 13:jkac339. [PMID: 36905099 PMCID: PMC10085752 DOI: 10.1093/g3journal/jkac339] [Received: 06/24/2022] [Accepted: 09/14/2022] [Indexed: 04/12/2023]
Abstract
Damselfishes (Family: Pomacentridae) are a group of ecologically important, primarily coral reef fishes that include over 400 species. Damselfishes have been used as model organisms to study recruitment (anemonefishes), the effects of ocean acidification (spiny damselfish), population structure, and speciation (Dascyllus). The genus Dascyllus includes a group of small-bodied species, and a complex of relatively larger bodied species, the Dascyllus trimaculatus species complex that is comprised of several species including D. trimaculatus itself. The three-spot damselfish, D. trimaculatus, is a widespread and common coral reef fish species found across the tropical Indo-Pacific. Here, we present the first genome assembly of this species. This assembly contains 910 Mb, 90% of the bases are in 24 chromosome-scale scaffolds, and the Benchmarking Universal Single-Copy Orthologs score of the assembly is 97.9%. Our findings confirm previous reports of a karyotype of 2n = 47 in D. trimaculatus in which one parent contributes 24 chromosomes and the other 23. We find evidence that this karyotype is the result of a heterozygous Robertsonian fusion. We also find that the D. trimaculatus chromosomes are each homologous with single chromosomes of the closely related clownfish species, Amphiprion percula. This assembly will be a valuable resource in the population genomics and conservation of Damselfishes, and continued studies of the karyotypic diversity in this clade.
Affiliation(s)
- May B Roberts
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Darrin T Schultz
  - Department of Molecular Evolution and Development, University of Vienna, Vienna 1010, Austria
  - Monterey Bay Aquarium Research Institute, Moss Landing, CA 95039, USA
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Remy Gatins
  - Department of Marine Sciences, Northeastern University, Boston, MA 02115, USA
- Merly Escalona
  - Department of Biomolecular Engineering and Bioinformatics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA
- Giacomo Bernardi
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95060, USA

8
Detcharoen M, Bumrungsri S, Voravuthikunchai SP. Complete Genome of Rose Myrtle, Rhodomyrtus tomentosa, and Its Population Genetics in Thai Peninsula. Plants (Basel) 2023; 12:1582. [PMID: 37111806 PMCID: PMC10144328 DOI: 10.3390/plants12081582] [Received: 02/13/2023] [Revised: 04/04/2023] [Accepted: 04/05/2023] [Indexed: 06/19/2023]
Abstract
Several parts of rose myrtle, Rhodomyrtus tomentosa, exhibited profound antibacterial and anti-inflammatory activities, suggesting its potential in healthcare and cosmetics applications. During the past few years, the demand for biologically active compounds in the industrial sectors increased. Therefore, gathering comprehensive information on all aspects of this plant species is essential. Here, the genome sequencing using short and long reads was used to understand the genome biology of R. tomentosa. Inter-simple sequence repeats (ISSR) and simple sequence repeats (SSR) markers, and geometric morphometrics of the leaves of R. tomentosa collected across Thai Peninsula, were determined for population differentiation analysis. The genome size of R. tomentosa was 442 Mb, and the divergence time between R. tomentosa and Rhodamnia argentea, the white myrtle of eastern Australia, was around 15 million years. No population structure was observed between R. tomentosa on the eastern and western sides of the Thai Peninsula using the ISSR and SSR markers. However, significant differences in leaf size and shape of R. tomentosa were observed in all locations.
Affiliation(s)
- Matsapume Detcharoen
  - Division of Biological Science, Faculty of Science, Prince of Songkla University, Hat Yai 90110, Thailand
- Sara Bumrungsri
  - Division of Biological Science, Faculty of Science, Prince of Songkla University, Hat Yai 90110, Thailand
- Supayang Piyawan Voravuthikunchai
  - Center of Antimicrobial Biomaterial Innovation-Southeast Asia, Faculty of Science, Prince of Songkla University, Hat Yai 90110, Thailand
  - Natural Product Research Center of Excellence, Faculty of Science, Prince of Songkla University, Hat Yai 90110, Thailand

9
Ferguson S, Jones A, Murray K, Schwessinger B, Borevitz JO. Interspecies genome divergence is predominantly due to frequent small scale rearrangements in Eucalyptus. Mol Ecol 2023; 32:1271-1287. [PMID: 35810343 DOI: 10.1111/mec.16608] [Received: 01/17/2022] [Revised: 07/02/2022] [Accepted: 07/04/2022] [Indexed: 11/27/2022]
Abstract
Synteny, the ordering of sequences within homologous chromosomes, must be maintained within the genomes of sexually reproducing species for the sharing of alleles and production of viable, reproducing offspring. However, when the genomes of closely related species are compared, a loss of synteny is often observed. Unequal homologous recombination is the primary mechanism behind synteny loss, occurring more often in transposon rich regions, and resulting in the formation of chromosomal rearrangements. To examine patterns of synteny among three closely related, interbreeding, and wild Eucalyptus species, we assembled their genomes using long-read DNA sequencing and de novo assembly. We identify syntenic and rearranged regions between these genomes and estimate that ~48% of our genomes remain syntenic while ~36% is rearranged. We observed that rearrangements highly fragment microsynteny. Our results suggest that synteny between these species is primarily lost through small-scale rearrangements, not through sequence loss, gain, or sequence divergence. Further examination of identified rearrangements suggests that rearrangements may be altering the phenotypes of Eucalyptus species. Our study also underscores that the use of single reference genomes in genomic variation studies could lead to reference bias, especially given the scale at which we show potentially adaptive loci have highly diverged, deleted, duplicated and/or rearranged. This study provides an unbiased framework to look at potential speciation and adaptive loci among a rapidly radiating foundation species of woodland trees that are free from selective breeding seen in most crop species.
Affiliation(s)
- Scott Ferguson
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Ashley Jones
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Kevin Murray
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
  - Weigel Department, Max Planck Institute for Developmental Biology, Tuebingen, Germany
- Benjamin Schwessinger
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Justin O Borevitz
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia

10
Zhang X, Chen S, Zhang Y, Xiao Y, Qin Y, Li Q, Liu L, Liu B, Chai L, Yang H, Liu H. Draft genome of the medicinal tea tree Melaleuca alternifolia. Mol Biol Rep 2023; 50:1545-1552. [PMID: 36513867 DOI: 10.1007/s11033-022-08157-8] [Received: 02/06/2022] [Accepted: 11/15/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND Melaleuca alternifolia is a commercially important medicinal tea tree native to Australia. Tea tree oil, the essential oil distilled from its branches and leaves, has broad-spectrum germicidal activity and is highly valued in the pharmaceutical and cosmetic industries. Thus, the study of its genome, which can provide a reference for the investigation of genes involved in terpinen-4-ol biosynthesis, is crucial for improving the productivity of tea tree oil. METHODS AND RESULTS In our study, next-generation sequencing was used to investigate the whole genome of Melaleuca alternifolia. About 114 Gb of high-quality sequence data were obtained and assembled into 1,838,159 scaffolds with an N50 length of 1021 bp. The assembled genome size is about 595 Mb, twice that predicted by flow cytometry (300 Mb) and k-mer analysis (345 Mb). Benchmarking Universal Single-Copy Orthologs analyses indicated that only 11.3% of the conserved single-copy genes were missing. Repetitive regions cover over 40.43% of the genome. A total of 44,369 protein-coding genes were predicted and annotated against the Nr, Swissprot, Refseq, COG, KOG, and KEGG databases. Among these genes, 32,909 and 16,241 genes were functionally annotated in Nr and KEGG, respectively. Moreover, 29,411 and 14,435 genes were functionally annotated in COG and KOG. Additionally, 457,661 simple sequence repeats and 1109 transcription factors (TFs) from 67 TF families were identified in the assembled genome. CONCLUSION Our findings provide a draft genome sequence of M. alternifolia, which can act as a reference for deep sequencing strategies and is useful for future functional and comparative genomics analyses.
Affiliation(s)
- Xiaoning Zhang
  - Guangxi Forestry Research Institute, YongWu Road 23, Xixiangtang District, Nanning, 530002, Guangxi, China
- Silin Chen
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, College of Life Sciences, Hubei University, Wuhan, China
- Ye Zhang
  - Guangxi Forestry Research Institute, YongWu Road 23, Xixiangtang District, Nanning, 530002, Guangxi, China
- Yufei Xiao
  - Guangxi Forestry Research Institute, YongWu Road 23, Xixiangtang District, Nanning, 530002, Guangxi, China
- Yufeng Qin
  - Guangxi Forestry Research Institute, YongWu Road 23, Xixiangtang District, Nanning, 530002, Guangxi, China
- Qing Li
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, College of Life Sciences, Hubei University, Wuhan, China
- Li Liu
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, College of Life Sciences, Hubei University, Wuhan, China
- Buming Liu
  - Guangxi Key Laboratory of Traditional Chinese Medicine Quality Standards, Nanning, China
- Ling Chai
  - Guangxi Key Laboratory of Traditional Chinese Medicine Quality Standards, Nanning, China
- Hong Yang
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, College of Life Sciences, Hubei University, Wuhan, China
- Hailong Liu
  - Guangxi Forestry Research Institute, YongWu Road 23, Xixiangtang District, Nanning, 530002, Guangxi, China

11
Chen SH, Martino AM, Luo Z, Schwessinger B, Jones A, Tolessa T, Bragg JG, Tobias PA, Edwards RJ. A high-quality pseudo-phased genome for Melaleuca quinquenervia shows allelic diversity of NLR-type resistance genes. Gigascience 2022; 12:giad102. [PMID: 38096477 PMCID: PMC10720953 DOI: 10.1093/gigascience/giad102] [Received: 05/05/2023] [Revised: 09/11/2023] [Accepted: 11/14/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND Melaleuca quinquenervia (broad-leaved paperbark) is a coastal wetland tree species that serves as a foundation species in eastern Australia, Indonesia, Papua New Guinea, and New Caledonia. While extensively cultivated for its ornamental value, it has also become invasive in regions like Florida, USA. Long-lived trees face diverse pest and pathogen pressures, and plant stress responses rely on immune receptors encoded by the nucleotide-binding leucine-rich repeat (NLR) gene family. However, the comprehensive annotation of NLR encoding genes has been challenging due to their clustering arrangement on chromosomes and highly repetitive domain structure; expansion of the NLR gene family is driven largely by tandem duplication. Additionally, the allelic diversity of the NLR gene family remains largely unexplored in outcrossing tree species, as many genomes are presented in their haploid, collapsed state. RESULTS We assembled a chromosome-level pseudo-phased genome for M. quinquenervia and described the allelic diversity of plant NLRs using the novel FindPlantNLRs pipeline. Analysis reveals variation in the number of NLR genes on each haplotype, distinct clustering patterns, and differences in the types and numbers of novel integrated domains. CONCLUSIONS The high-quality M. quinquenervia genome assembly establishes a new framework for functional and evolutionary studies of this significant tree species. Our findings suggest that maintaining allelic diversity within the NLR gene family is crucial for enabling responses to environmental stress, particularly in long-lived plants.
Collapse
Affiliation(s)
- Stephanie H Chen
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington NSW 2052, Australia
- Research Centre for Ecosystem Resilience, Botanic Gardens of Sydney, Sydney NSW 2000, Australia
| | - Alyssa M Martino
- School of Life and Environmental Sciences, The University of Sydney, Camperdown NSW 2006, Australia
| | - Zhenyan Luo
- Research School of Biology, The Australian National University, Canberra ACT 2601, Australia
| | - Benjamin Schwessinger
- Research School of Biology, The Australian National University, Canberra ACT 2601, Australia
| | - Ashley Jones
- Research School of Biology, The Australian National University, Canberra ACT 2601, Australia
| | - Tamene Tolessa
- Research School of Biology, The Australian National University, Canberra ACT 2601, Australia
- School of Environment and Rural Science, University of New England, Armidale NSW 2351, Australia
| | - Jason G Bragg
- Research Centre for Ecosystem Resilience, Botanic Gardens of Sydney, Sydney NSW 2000, Australia
- School of Biological, Earth and Environmental Sciences, UNSW Sydney, Kensington NSW 2052, Australia
| | - Peri A Tobias
- School of Life and Environmental Sciences, The University of Sydney, Camperdown NSW 2006, Australia
| | - Richard J Edwards
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington NSW 2052, Australia
- Minderoo OceanOmics Centre at UWA, UWA Oceans Institute, University of Western Australia, Crawley WA 6009, Australia
|
12
|
Lötter A, Duong TA, Candotti J, Mizrachi E, Wegrzyn JL, Myburg AA. Haplogenome assembly reveals structural variation in Eucalyptus interspecific hybrids. Gigascience 2022; 12:giad064. [PMID: 37632754 PMCID: PMC10460159 DOI: 10.1093/gigascience/giad064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 02/15/2023] [Accepted: 07/27/2023] [Indexed: 08/28/2023]
Abstract
BACKGROUND De novo phased (haplo)genome assembly using long-read DNA sequencing data has improved the detection and characterization of structural variants (SVs) in plant and animal genomes. Able to span across haplotypes, long reads allow phased, haplogenome assembly in highly outbred organisms such as forest trees. Eucalyptus tree species and interspecific hybrids are the most widely planted hardwood trees with F1 hybrids of Eucalyptus grandis and E. urophylla forming the bulk of fast-growing pulpwood plantations in subtropical regions. The extent of structural variation and its effect on interspecific hybridization is unknown in these trees. As a first step towards elucidating the extent of structural variation between the genomes of E. grandis and E. urophylla, we sequenced and assembled the haplogenomes contained in an F1 hybrid of the two species. FINDINGS Using Nanopore sequencing and a trio-binning approach, we assembled the separate haplogenomes (566.7 Mb and 544.5 Mb) to 98.0% BUSCO completion. High-density SNP genetic linkage maps of both parents allowed scaffolding of 88.0% of the haplogenome contigs into 11 pseudo-chromosomes (scaffold N50 of 43.8 Mb and 42.5 Mb for the E. grandis and E. urophylla haplogenomes, respectively). We identify 48,729 SVs between the two haplogenomes providing the first detailed insight into genome structural rearrangement in these species. The two haplogenomes have similar gene content, 35,572 and 33,915 functionally annotated genes, of which 34.7% are contained in genome rearrangements. CONCLUSIONS Knowledge of SV and haplotype diversity in the two species will form the basis for understanding the genetic basis of hybrid superiority in these trees.
Affiliation(s)
- Anneri Lötter
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Tuan A Duong
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Julia Candotti
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Eshchar Mizrachi
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, Institute for Systems Genomics: Computational Biology Core, University of Connecticut, Storrs, CT 06269, USA
| | - Alexander A Myburg
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
|
13
|
Establishing MinION Sequencing and Genome Assembly Procedures for the Analysis of the Rooibos (Aspalathus linearis) Genome. PLANTS 2022; 11:plants11162156. [PMID: 36015459 PMCID: PMC9416007 DOI: 10.3390/plants11162156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 08/08/2022] [Accepted: 08/14/2022] [Indexed: 11/17/2022]
Abstract
While plant genome analysis is gaining speed worldwide, few plant genomes have been sequenced and analyzed on the African continent. Yet, this information holds the potential to transform diverse industries as it unlocks medicinally and industrially relevant biosynthesis pathways for bioprospecting. Considering that South Africa is home to the highly diverse Cape Floristic Region, local establishment of methods for plant genome analysis is essential. Long-read sequencing is becoming standard procedure for plant genome research, as these reads can span repetitive regions of the DNA, substantially facilitating reassembly of a contiguous genome. With the MinION, Oxford Nanopore offers a cost-efficient sequencing method to generate long reads; however, DNA purification protocols must be adapted for each plant species to generate ultra-pure DNA, essential for these analyses. Here, we describe a cost-effective procedure for the extraction and purification of plant DNA and evaluate diverse genome assembly approaches for the reconstruction of the genome of rooibos (Aspalathus linearis), an endemic South African medicinal plant widely used for tea production. We discuss the pros and cons of nine tested assembly programs, specifically Redbean and NextDenovo, which generated the most contiguous assemblies, and Flye, which produced an assembly closest to the predicted genome size.
|
14
|
Field MA, Yadav S, Dudchenko O, Esvaran M, Rosen BD, Skvortsova K, Edwards RJ, Keilwagen J, Cochran BJ, Manandhar B, Bustamante S, Rasmussen JA, Melvin RG, Chernoff B, Omer A, Colaric Z, Chan EKF, Minoche AE, Smith TPL, Gilbert MTP, Bogdanovic O, Zammit RA, Thomas T, Aiden EL, Ballard JWO. The Australian dingo is an early offshoot of modern breed dogs. SCIENCE ADVANCES 2022; 8:eabm5944. [PMID: 35452284 PMCID: PMC9032958 DOI: 10.1126/sciadv.abm5944] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/28/2021] [Accepted: 03/09/2022] [Indexed: 06/11/2023]
Abstract
Dogs are uniquely associated with human dispersal and bring transformational insight into the domestication process. Dingoes represent an intriguing case within canine evolution, having been geographically isolated for thousands of years. Here, we present a high-quality de novo assembly of a pure dingo (CanFam_DDS). We identified large chromosomal differences relative to the current dog reference (CanFam3.1) and confirmed no expanded pancreatic amylase gene as found in breed dogs. Phylogenetic analyses using variant pairwise matrices show that the dingo is distinct from five breed dogs with 100% bootstrap support when using Greenland wolf as the outgroup. Functionally, we observe differences in methylation patterns between the dingo and German shepherd dog genomes and differences in serum biochemistry and microbiome makeup. Our results suggest that distinct demographic and environmental conditions have shaped the dingo genome. In contrast, artificial human selection has likely shaped the genomes of domestic breed dogs after divergence from the dingo.
Affiliation(s)
- Matt A. Field
- Centre for Tropical Bioinformatics and Molecular Biology, College of Public Health, Medical and Veterinary Sciences, James Cook University, Cairns, QLD 4878, Australia
- Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW 2010, Australia
| | - Sonu Yadav
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, High St, Kensington, NSW 2052, Australia
| | - Olga Dudchenko
- The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
| | - Meera Esvaran
- School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Benjamin D. Rosen
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
| | - Ksenia Skvortsova
- Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW 2010, Australia
| | - Richard J. Edwards
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, High St, Kensington, NSW 2052, Australia
| | - Jens Keilwagen
- Julius Kühn-Institut, Erwin-Baur-Str. 27, 06484 Quedlinburg, Germany
| | - Blake J. Cochran
- School of Medical Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Bikash Manandhar
- School of Medical Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Sonia Bustamante
- Bioanalytical Mass Spectrometry Facility, Mark Wainwright Analytical Centre, University of New South Wales, Sydney, NSW 2052, Australia
| | - Jacob Agerbo Rasmussen
- Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Center for Evolutionary Hologenomics, Faculty of Health and Medical Sciences, The GLOBE Institute University of Copenhagen, Copenhagen, Denmark
| | - Richard G. Melvin
- Department of Biomedical Sciences, University of Minnesota Medical School, 1035 University Drive, Duluth, MN 55812, USA
| | - Barry Chernoff
- College of the Environment, Departments of Biology, and Earth and Environmental Sciences, Wesleyan University, Middletown, CT 06459, USA
| | - Arina Omer
- The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
| | - Zane Colaric
- The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
| | - Eva K. F. Chan
- Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW 2010, Australia
- Statewide Genomics, New South Wales Health Pathology, 45 Watt St, Newcastle, NSW 2300, Australia
| | - Andre E. Minoche
- Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW 2010, Australia
| | - Timothy P. L. Smith
- U.S. Meat Animal Research Center, Agricultural Research Service, USDA, Rd 313, Clay Center, NE 68933, USA
| | - M. Thomas P. Gilbert
- Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- University Museum, NTNU, Trondheim, Norway
| | - Ozren Bogdanovic
- Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW 2010, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, High St, Kensington, NSW 2052, Australia
| | - Robert A. Zammit
- Vineyard Veterinary Hospital, 703 Windsor Rd, Vineyard, NSW 2765, Australia
| | - Torsten Thomas
- School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia
| | - Erez L. Aiden
- The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
- UWA School of Agriculture and Environment, The University of Western Australia, Perth, WA 6009, Australia
- Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Pudong 201210, China
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - J. William O. Ballard
- Department of Environment and Genetics, SABE, La Trobe University, Melbourne, VIC 3086, Australia
- School of Biosciences, University of Melbourne, Royal Parade, Parkville, VIC 3052, Australia
|
15
|
Yu J, Xia M, Wang Y, Chi X, Xu H, Chen S, Zhang F. Short and long reads chloroplast genome assemblies and phylogenomics of Artemisia tangutica (Asteraceae). Biologia (Bratisl) 2022. [DOI: 10.1007/s11756-021-00951-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
16
|
Kress WJ, Soltis DE, Kersey PJ, Wegrzyn JL, Leebens-Mack JH, Gostel MR, Liu X, Soltis PS. Green plant genomes: What we know in an era of rapidly expanding opportunities. Proc Natl Acad Sci U S A 2022; 119:e2115640118. [PMID: 35042803 PMCID: PMC8795535 DOI: 10.1073/pnas.2115640118] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 01/01/2023]
Abstract
Green plants play a fundamental role in ecosystems, human health, and agriculture. As de novo genomes are being generated for all known eukaryotic species as advocated by the Earth BioGenome Project, increasing genomic information on green land plants is essential. However, setting standards for the generation and storage of the complex set of genomes that characterize the green lineage of life is a major challenge for plant scientists. Such standards will need to accommodate the immense variation in green plant genome size, transposable element content, and structural complexity while enabling research into the molecular and evolutionary processes that have resulted in this enormous genomic variation. Here we provide an overview and assessment of the current state of knowledge of green plant genomes. To date fewer than 300 complete chromosome-scale genome assemblies representing fewer than 900 species have been generated across the estimated 450,000 to 500,000 species in the green plant clade. These genomes range in size from 12 Mb to 27.6 Gb and are biased toward agricultural crops with large branches of the green tree of life untouched by genomic-scale sequencing. Locating suitable tissue samples of most species of plants, especially those taxa from extreme environments, remains one of the biggest hurdles to increasing our genomic inventory. Furthermore, the annotation of plant genomes is at present undergoing intensive improvement. It is our hope that this fresh overview will help in the development of genomic quality standards for a cohesive and meaningful synthesis of green plant genomes as we scale up for the future.
Affiliation(s)
- W John Kress
- National Museum of Natural History, Smithsonian Institution, Department of Botany, Washington, DC 20013-7012
- Department of Biological Sciences, Dartmouth College, Hanover, NH 03755
- Arnold Arboretum, Harvard University, Boston, MA 02130
| | - Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
- Department of Biology, University of Florida, Gainesville, FL 32611
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, Institute for Systems Genomics: Computational Biology Core, University of Connecticut, Storrs, CT 06269-3214
| | - James H Leebens-Mack
- Department of Plant Biology, 2101 Miller Plant Sciences, University of Georgia, Athens, GA 30602-7271
| | - Morgan R Gostel
- Botanical Research Institute of Texas, Fort Worth, TX 76107-3400
| | - Xin Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518120, China
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
|
17
|
Can Forest Trees Cope with Climate Change?-Effects of DNA Methylation on Gene Expression and Adaptation to Environmental Change. Int J Mol Sci 2021; 22:ijms222413524. [PMID: 34948318 PMCID: PMC8703565 DOI: 10.3390/ijms222413524] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/09/2021] [Revised: 12/09/2021] [Accepted: 12/12/2021] [Indexed: 12/13/2022]
Abstract
Epigenetic modifications, including chromatin modifications and DNA methylation, play key roles in regulating gene expression in both plants and animals. Transmission of epigenetic markers is important for some genes to maintain specific expression patterns and preserve the status quo of the cell. This article provides a review of existing research and the current state of knowledge about DNA methylation in trees in the context of global climate change, along with references to the potential of epigenome editing tools and the possibility of their use for forest tree research. Epigenetic modifications, including DNA methylation, are involved in evolutionary processes, developmental processes, and environmental interactions. Thus, the implications of epigenetics are important for adaptation and phenotypic plasticity because they provide the potential for tree conservation in forest ecosystems exposed to adverse conditions resulting from global warming and regional climate fluctuations.
|
18
|
Ghosh Dasgupta M, Abdul Bari MP, Shanmugavel S, Dharanishanthi V, Muthupandi M, Kumar N, Chauhan SS, Kalaivanan J, Mohan H, Krutovsky KV, Rajasugunasekar D. Targeted re-sequencing and genome-wide association analysis for wood property traits in breeding population of Eucalyptus tereticornis × E. grandis. Genomics 2021; 113:4276-4292. [PMID: 34785351 DOI: 10.1016/j.ygeno.2021.11.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/01/2020] [Revised: 06/20/2021] [Accepted: 11/10/2021] [Indexed: 11/16/2022]
Abstract
Globally, Eucalyptus plantations occupy 22 million ha, and eucalypts are among the preferred hardwood species owing to their short rotation, rapid growth, adaptability and wood properties. In this study, we present results of GWAS in parents and 100 hybrids of Eucalyptus tereticornis × E. grandis using 762 genes presumably involved in wood formation. Comparative analysis between parents predicted 32,202 polymorphic SNPs with a high average read depth of 269-562× per individual per nucleotide. Seventeen wood-related traits were phenotyped across three diverse environments and GWAS was conducted using 13,610 SNPs. A total of 45 SNP-trait associations were predicted across two locations. Seven large-effect markers were identified which explained more than 80% of phenotypic variation for fibre area. This study has provided an array of candidate genes which may govern fibre morphology in this genus and has predicted potential SNPs which can guide future breeding programs in tropical Eucalyptus.
Affiliation(s)
- Muthusamy Muthupandi
- Institute of Forest Genetics and Tree Breeding, R.S. Puram, Coimbatore 641002, India
| | - Naveen Kumar
- Institute of Wood Science and Technology, 18th Cross, Malleshwaram, Bangalore 560 003, India
| | - Shakti Singh Chauhan
- Institute of Wood Science and Technology, 18th Cross, Malleshwaram, Bangalore 560 003, India
| | - Haritha Mohan
- Institute of Forest Genetics and Tree Breeding, R.S. Puram, Coimbatore 641002, India
| | - Konstantin V Krutovsky
- Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, 37077 Göttingen, Germany; Center for Integrated Breeding Research, Georg-August University of Göttingen, 37075 Göttingen, Germany; Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, 660036 Krasnoyarsk, Russia; Laboratory of Population Genetics, N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia; Department of Ecosystem Science and Management, Texas A&M University, College Station, TX 77843-2138, USA
|
19
|
Patturaj M, Munusamy A, Kannan N, Kandasamy U, Ramasamy Y. Chromosome-specific polymorphic SSR markers in tropical eucalypt species using low coverage whole genome sequences: systematic characterization and validation. Genomics Inform 2021; 19:e33. [PMID: 34638180 PMCID: PMC8510864 DOI: 10.5808/gi.21031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Accepted: 06/29/2021] [Indexed: 11/20/2022]
Abstract
Eucalyptus is one of the major plantation species with a wide variety of industrial uses. Polymorphic and informative simple sequence repeats (SSRs) have a broad range of applications in genetic analysis. In this study, two individuals of Eucalyptus tereticornis (ET217 and ET86) and one individual each of E. camaldulensis (EC17) and E. grandis (EG9) were subjected to whole genome resequencing. Low coverage (10×) genome sequencing was used to find polymorphic SSRs between the individuals. The average number of SSR loci identified was 95,513, and the density of SSRs per Mb ranged from 155.08 in EC17 to 157.39 in EG9. Among all the SSRs detected, the most abundant repeat motifs were di-nucleotide (59.6%–62.5%), followed by tri- (23.7%–27.2%), tetra- (5.2%–5.6%), penta- (5.0%–5.3%), and hexa-nucleotide (2.7%–2.9%). The predominant SSR motif units were AG/CT and AAG/TTC. Computational genome analysis predicted the SSR length variations between the individuals and identified the gene functions of SSR-containing sequences. A selected subset of polymorphic markers was validated in a full-sib family of eucalypts. Additionally, genome-wide characterization of single nucleotide polymorphisms, InDels and transcriptional regulators was carried out. These variations will find their utility in genome-wide association studies as well as in understanding the molecular mechanisms involved in key economic traits. The genomic resources generated in this study would provide an impetus to integrate genomics in marker-trait associations and breeding of tropical eucalypts.
Affiliation(s)
- Maheswari Patturaj
- Institute of Forest Genetics and Tree Breeding, Coimbatore 641002, India
| | - Aiswarya Munusamy
- Institute of Forest Genetics and Tree Breeding, Coimbatore 641002, India
| | - Yasodha Ramasamy
- Institute of Forest Genetics and Tree Breeding, Coimbatore 641002, India
|
20
|
LeafGo: Leaf to Genome, a quick workflow to produce high-quality de novo plant genomes using long-read sequencing technology. Genome Biol 2021; 22:256. [PMID: 34479618 PMCID: PMC8414726 DOI: 10.1186/s13059-021-02475-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/22/2021] [Accepted: 08/20/2021] [Indexed: 02/06/2023]
Abstract
Currently, different sequencing platforms are used to generate plant genomes and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow, that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days and using one long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies. Finally, we report the diploid genomes of Eucalyptus rudis and E. camaldulensis and the allotetraploid genome of Arachis hypogaea.
|
21
|
Voelker J, Shepherd M, Mauleon R. A high-quality draft genome for Melaleuca alternifolia (tea tree): a new platform for evolutionary genomics of myrtaceous terpene-rich species. GIGABYTE 2021; 2021:gigabyte28. [PMID: 36824337 PMCID: PMC9650293 DOI: 10.46471/gigabyte.28] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/26/2021] [Accepted: 08/05/2021] [Indexed: 11/09/2022]
Abstract
The economically important Melaleuca alternifolia (tea tree) is the source of a terpene-rich essential oil with therapeutic and cosmetic uses around the world. Tea tree has been cultivated and bred in Australia since the 1990s. It has been extensively studied for the genetics and biochemistry of terpene biosynthesis. Here, we report a high-quality de novo genome assembly using Pacific Biosciences and Illumina sequencing. The genome was assembled into 3128 scaffolds with a total length of 362 Mb (N50 = 1.9 Mb), with significantly higher contiguity than a previous assembly (N50 = 8.7 kb). Using a homology-based, RNA-seq evidence-based and ab initio prediction approach, 37,226 protein-coding genes were predicted. Genome assembly and annotation exhibited high completeness scores of 98.1% and 89.4%, respectively. Sequence contiguity was sufficient to reveal extensive gene order conservation and chromosomal rearrangements in alignments with Eucalyptus grandis and Corymbia citriodora genomes. This new genome advances currently available resources to investigate the genome structure and gene family evolution of M. alternifolia. It will enable further comparative genomic studies in Myrtaceae to elucidate the genetic foundations of economically valuable traits in this crop.
Affiliation(s)
- Julia Voelker
- Faculty of Science and Engineering, Southern Cross University, Military Road, East Lismore NSW 2480, Australia
| | - Mervyn Shepherd
- Faculty of Science and Engineering, Southern Cross University, Military Road, East Lismore NSW 2480, Australia
| | - Ramil Mauleon
- Faculty of Science and Engineering, Southern Cross University, Military Road, East Lismore NSW 2480, Australia
|
22
|
Oliver A, Podell S, Pinowska A, Traller JC, Smith SR, McClure R, Beliaev A, Bohutskyi P, Hill EA, Rabines A, Zheng H, Allen LZ, Kuo A, Grigoriev IV, Allen AE, Hazlebeck D, Allen EE. Diploid genomic architecture of Nitzschia inconspicua, an elite biomass production diatom. Sci Rep 2021; 11:15592. [PMID: 34341414 PMCID: PMC8329260 DOI: 10.1038/s41598-021-95106-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/05/2021] [Accepted: 07/14/2021] [Indexed: 01/13/2023]
Abstract
A near-complete diploid nuclear genome and accompanying circular mitochondrial and chloroplast genomes have been assembled from the elite commercial diatom species Nitzschia inconspicua. The 50 Mbp haploid size of the nuclear genome is nearly double that of model diatom Phaeodactylum tricornutum, but 30% smaller than closer relative Fragilariopsis cylindrus. Diploid assembly, which was facilitated by low levels of allelic heterozygosity (2.7%), included 14 candidate chromosome pairs composed of long, syntenic contigs, covering 93% of the total assembly. Telomeric ends were capped with an unusual 12-mer, G-rich, degenerate repeat sequence. Predicted proteins were highly enriched in strain-specific marker domains associated with cell-surface adhesion, biofilm formation, and raphe system gliding motility. Expanded species-specific families of carbonic anhydrases suggest potential enhancement of carbon concentration efficiency, and duplicated glycolysis and fatty acid synthesis pathways across cytosolic and organellar compartments may enhance peak metabolic output, contributing to competitive success over other organisms in mixed cultures. The N. inconspicua genome delivers a robust new reference for future functional and transcriptomic studies to illuminate the physiology of benthic pennate diatoms and harness their unique adaptations to support commercial algae biomass and bioproduct production.
Affiliation(s)
- Aaron Oliver
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA
| | - Sheila Podell
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA
| | - Sarah R Smith
- Microbial and Environmental Genomics Group, J. Craig Venter Institute, La Jolla, CA, USA
| | - Ryan McClure
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Alex Beliaev
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Pavlo Bohutskyi
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Eric A Hill
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
| | - Ariel Rabines
- Microbial and Environmental Genomics Group, J. Craig Venter Institute, La Jolla, CA, USA
| | - Hong Zheng
- Microbial and Environmental Genomics Group, J. Craig Venter Institute, La Jolla, CA, USA
| | - Lisa Zeigler Allen
- Microbial and Environmental Genomics Group, J. Craig Venter Institute, La Jolla, CA, USA
| | - Alan Kuo
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, USA
| | - Igor V Grigoriev
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, USA; Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Andrew E Allen
- Microbial and Environmental Genomics Group, J. Craig Venter Institute, La Jolla, CA, USA
| | - Eric E Allen
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA; Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
|
23
|
McCartney AM, Hilario E, Choi S, Guhlin J, Prebble JM, Houliston G, Buckley TR, Chagné D. An exploration of assembly strategies and quality metrics on the accuracy of the rewarewa (Knightia excelsa) genome. Mol Ecol Resour 2021; 21:2125-2144. [PMID: 33955186 PMCID: PMC8362059 DOI: 10.1111/1755-0998.13406] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/23/2020] [Revised: 03/18/2021] [Accepted: 04/20/2021] [Indexed: 12/17/2022]
Abstract
We used long read sequencing data generated from Knightia excelsa, a nectar-producing Proteaceae tree endemic to Aotearoa (New Zealand), to explore how sequencing data type, volume and workflows can impact final assembly accuracy and chromosome reconstruction. Establishing a high-quality genome for this species has specific cultural importance to Māori and commercial importance to honey producers in Aotearoa. Assemblies were produced by five long read assemblers using data subsampled based on read lengths, two polishing strategies and two Hi-C mapping methods. Our results from subsampling the data by read length showed that each assembler tested performed differently depending on the coverage and the read length of the data. Subsampling highlighted that input data with longer read lengths but perhaps lower coverage constructed more contiguous, kmers and gene-complete assemblies than short read length input data with higher coverage. The final genome assembly was constructed into 14 pseudochromosomes using an initial flye long read assembly, a racon/medaka/pilon combined polishing strategy, salsa2 and allhic scaffolding, juicebox curation, and Macadamia linkage map validation. We highlighted the importance of developing assembly workflows based on the volume and read length of sequencing data and established a robust set of quality metrics for generating high-quality assemblies. Scaffolding analyses highlighted that problems found in the initial assemblies could not be resolved accurately by Hi-C data and that assembly scaffolding was more successful when the underlying contig assembly was of higher accuracy. These findings provide insight into how quality assessment tools can be implemented throughout genome assembly pipelines to inform the de novo reconstruction of a high-quality genome assembly for nonmodel organisms.
Affiliation(s)
- Ann M. McCartney
  - Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
  - Genomics Aotearoa, Dunedin, New Zealand
- Elena Hilario
  - Genomics Aotearoa, Dunedin, New Zealand
  - The New Zealand Institute for Plant and Food Research (Plant & Food Research), Sandringham, New Zealand
- Seung‐Sub Choi
  - Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
  - Genomics Aotearoa, Dunedin, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland, New Zealand
- Joseph Guhlin
  - Genomics Aotearoa, Dunedin, New Zealand
  - University of Otago, Dunedin, New Zealand
- Jessica M. Prebble
  - Genomics Aotearoa, Dunedin, New Zealand
  - Manaaki Whenua Landcare Research, Lincoln, New Zealand
- Gary Houliston
  - Genomics Aotearoa, Dunedin, New Zealand
  - Manaaki Whenua Landcare Research, Lincoln, New Zealand
- Thomas R. Buckley
  - Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
  - Genomics Aotearoa, Dunedin, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland, New Zealand
- David Chagné
  - Genomics Aotearoa, Dunedin, New Zealand
  - Plant & Food Research, Fitzherbert, Palmerston North, New Zealand
|
24
|
Valdebenito-Maturana B, Riadi G. GSER (a Genome Size Estimator using R): a pipeline for quality assessment of sequenced genome libraries through genome size estimation. Interface Focus 2021; 11:20200077. [PMID: 34123359 DOI: 10.1098/rsfs.2020.0077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 04/13/2021] [Indexed: 01/07/2023]
Abstract
The first step in any genome research after obtaining the read data is to perform a due quality control of the sequenced reads. In a de novo genome assembly project, the second step is to estimate two important features, the genome size and 'best k-mer', to start the assembly tests with different de novo assembly software and its parameters. However, the quality control of the sequenced genome libraries as a whole, instead of focusing on the reads only, is frequently overlooked and realized to be important only when the assembly tests did not render the expected results. We have developed GSER, a Genome Size Estimator using R, a pipeline to evaluate the relationship between k-mers and genome size, as a means for quality assessment of the sequenced genome libraries. GSER generates a set of charts that allow the analyst to evaluate the library datasets before starting the assembly. The script which runs the pipeline can be downloaded from http://www.mobilomics.org/GSER/downloads or http://github.com/mobilomics/GSER.
Affiliation(s)
- Gonzalo Riadi
  - ANID - Millennium Science Initiative Program, Millennium Nucleus of Ion Channels-Associated Diseases (MiNICAD); Center for Bioinformatics, Simulation and Modeling (CBSM); Department of Bioinformatics, Faculty of Engineering, University of Talca, Campus Talca, Chile
|
25
|
Sharma P, Al-Dossary O, Alsubaie B, Al-Mssallem I, Nath O, Mitter N, Rodrigues Alves Margarido G, Topp B, Murigneux V, Kharabian Masouleh A, Furtado A, Henry RJ. Improvements in the sequencing and assembly of plant genomes. GigaByte 2021; 2021:gigabyte24. [PMID: 36824328 PMCID: PMC9631998 DOI: 10.46471/gigabyte.24] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/01/2021] [Accepted: 06/03/2021] [Indexed: 11/09/2022]
Abstract
Advances in DNA sequencing have made it easier to sequence and assemble plant genomes. Here, we extend an earlier study, and compare recent methods for long read sequencing and assembly. Updated Oxford Nanopore Technology software improved assemblies. Using more accurate sequences produced by repeated sequencing of the same molecule (Pacific Biosciences HiFi) resulted in less fragmented assembly of sequencing reads. Using data for increased genome coverage resulted in longer contigs, but reduced total assembly length and improved genome completeness. The original model species, Macadamia jansenii, was also compared with three other Macadamia species, as well as avocado (Persea americana) and jojoba (Simmondsia chinensis). In these angiosperms, increasing sequence data volumes caused a linear increase in contig size, decreased assembly length and further improved already high completeness. Differences in genome size and sequence complexity influenced the success of assembly. Advances in long read sequencing technology continue to improve plant genome sequencing and assembly. However, results were improved by greater genome coverage, with the amount needed to achieve a particular level of assembly being species dependent.
Affiliation(s)
- Priyanka Sharma
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
- Othman Al-Dossary
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
  - College of Agriculture and Food Sciences, King Faisal University, Al Hofuf, Saudi Arabia
- Bader Alsubaie
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
  - College of Agriculture and Food Sciences, King Faisal University, Al Hofuf, Saudi Arabia
- Ibrahim Al-Mssallem
  - College of Agriculture and Food Sciences, King Faisal University, Al Hofuf, Saudi Arabia
- Onkar Nath
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
- Neena Mitter
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
- Gabriel Rodrigues Alves Margarido
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
  - Departamento de Genética, Escola Superior de Agricultura “Luiz de Queiroz”, Universidade de São Paulo, Piracicaba, São Paulo 13418-900, Brazil
- Bruce Topp
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
- Agnelo Furtado
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
- Robert J. Henry (corresponding author)
  - Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane 4072, Australia
  - Centre of Excellence for Plant Success in Nature and Agriculture, University of Queensland, Brisbane 4072, Australia
|
26
|
Healey AL, Shepherd M, King GJ, Butler JB, Freeman JS, Lee DJ, Potts BM, Silva-Junior OB, Baten A, Jenkins J, Shu S, Lovell JT, Sreedasyam A, Grimwood J, Furtado A, Grattapaglia D, Barry KW, Hundley H, Simmons BA, Schmutz J, Vaillancourt RE, Henry RJ. Pests, diseases, and aridity have shaped the genome of Corymbia citriodora. Commun Biol 2021; 4:537. [PMID: 33972666 PMCID: PMC8110574 DOI: 10.1038/s42003-021-02009-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/27/2020] [Accepted: 03/05/2021] [Indexed: 02/03/2023]
Abstract
Corymbia citriodora is a member of the predominantly Southern Hemisphere Myrtaceae family, which includes the eucalypts (Eucalyptus, Corymbia and Angophora; ~800 species). Corymbia is grown for timber, pulp and paper, and essential oils in Australia, South Africa, Asia, and Brazil, maintaining a high-growth rate under marginal conditions due to drought, poor-quality soil, and biotic stresses. To dissect the genetic basis of these desirable traits, we sequenced and assembled the 408 Mb genome of Corymbia citriodora, anchored into eleven chromosomes. Comparative analysis with Eucalyptus grandis reveals high synteny, although the two diverged approximately 60 million years ago and have different genome sizes (408 vs 641 Mb), with few large intra-chromosomal rearrangements. C. citriodora shares an ancient whole-genome duplication event with E. grandis but has undergone tandem gene family expansions related to terpene biosynthesis, innate pathogen resistance, and leaf wax formation, enabling their successful adaptation to biotic/abiotic stresses and arid conditions of the Australian continent.
Affiliation(s)
- Adam L Healey
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - University of Queensland/QAAFI, Brisbane, QLD, Australia
- Mervyn Shepherd
  - Southern Cross Plant Science, Southern Cross University, Lismore, NSW, Australia
- Graham J King
  - Southern Cross Plant Science, Southern Cross University, Lismore, NSW, Australia
- Jakob B Butler
  - School of Natural Sciences, University of Tasmania, Hobart, TAS, Australia
- Jules S Freeman
  - School of Natural Sciences, University of Tasmania, Hobart, TAS, Australia
  - ARC Training Centre for Forest Value, University of Tasmania, Hobart, TAS, Australia
  - Scion, Rotorua, New Zealand
- David J Lee
  - Forest Industries Research Centre, University of the Sunshine Coast, Sippy Downs, QLD, Australia
- Brad M Potts
  - School of Natural Sciences, University of Tasmania, Hobart, TAS, Australia
  - ARC Training Centre for Forest Value, University of Tasmania, Hobart, TAS, Australia
- Abdul Baten
  - Southern Cross Plant Science, Southern Cross University, Lismore, NSW, Australia
  - Institute of Precision Medicine & Bioinformatics, Camperdown, NSW, Australia
- Jerry Jenkins
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Shengqiang Shu
  - Department of Energy Joint Genome Institute, Berkeley, CA, USA
- John T Lovell
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Jane Grimwood
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Agnelo Furtado
  - University of Queensland/QAAFI, Brisbane, QLD, Australia
- Dario Grattapaglia
  - EMBRAPA Genetic Resources and Biotechnology, Brasília, Brazil
  - Genomic Science Program, Universidade Catolica de Brasilia, Taguatinga, Brazil
- Kerrie W Barry
  - Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Hope Hundley
  - Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Blake A Simmons
  - University of Queensland/QAAFI, Brisbane, QLD, Australia
  - Joint BioEnergy Institute, Emeryville, CA, USA
- Jeremy Schmutz
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - Department of Energy Joint Genome Institute, Berkeley, CA, USA
- René E Vaillancourt
  - School of Natural Sciences, University of Tasmania, Hobart, TAS, Australia
  - ARC Training Centre for Forest Value, University of Tasmania, Hobart, TAS, Australia
- Robert J Henry
  - University of Queensland/QAAFI, Brisbane, QLD, Australia
|
27
|
Allio R, Tilak MK, Scornavacca C, Avenant NL, Kitchener AC, Corre E, Nabholz B, Delsuc F. High-quality carnivoran genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox. eLife 2021; 10:e63167. [PMID: 33599612 PMCID: PMC7963486 DOI: 10.7554/elife.63167] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/16/2020] [Accepted: 02/16/2021] [Indexed: 12/26/2022]
Abstract
In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is essential for conservation. The thousands of yearly mammalian roadkill provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), both having subspecies with similar disjunct distributions in Eastern and Southern Africa. First, we obtained reference genomes with high contiguity and gene completeness by combining Nanopore long reads and Illumina short reads. Then, we showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis) by comparing their genome-wide genetic differentiation to pairs of well-defined species across Carnivora with a new Genetic Differentiation index (GDI) based on only a few resequenced individuals. Finally, we obtained a genome-scale Carnivora phylogeny including the new aardwolf species.
Affiliation(s)
- Rémi Allio
  - Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Marie-Ka Tilak
  - Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Celine Scornavacca
  - Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
- Nico L Avenant
  - National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa
- Andrew C Kitchener
  - Department of Natural Sciences, National Museums Scotland, Edinburgh, United Kingdom
- Erwan Corre
  - CNRS, Sorbonne Université, CNRS, ABiMS, Station Biologique de Roscoff, Roscoff, France
- Benoit Nabholz
  - Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
  - Institut Universitaire de France (IUF), Paris, France
- Frédéric Delsuc
  - Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
|
28
|
Dumschott K, Schmidt MHW, Chawla HS, Snowdon R, Usadel B. Oxford Nanopore sequencing: new opportunities for plant genomics? J Exp Bot 2020; 71:5313-5322. [PMID: 32459850 PMCID: PMC7501810 DOI: 10.1093/jxb/eraa263] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 08/14/2019] [Accepted: 05/25/2020] [Indexed: 05/06/2023]
Abstract
DNA sequencing was dominated by Sanger's chain termination method until the mid-2000s, when it was progressively supplanted by new sequencing technologies that can generate much larger quantities of data in a shorter time. At the forefront of these developments, long-read sequencing technologies (third-generation sequencing) can produce reads that are several kilobases in length. This greatly improves the accuracy of genome assemblies by spanning the highly repetitive segments that cause difficulty for second-generation short-read technologies. Third-generation sequencing is especially appealing for plant genomes, which can be extremely large with long stretches of highly repetitive DNA. Until recently, the low basecalling accuracy of third-generation technologies meant that accurate genome assembly required expensive, high-coverage sequencing followed by computational analysis to correct for errors. However, today's long-read technologies are more accurate and less expensive, making them the method of choice for the assembly of complex genomes. Oxford Nanopore Technologies (ONT), a third-generation platform for the sequencing of native DNA strands, is particularly suitable for the generation of high-quality assemblies of highly repetitive plant genomes. Here we discuss the benefits of ONT, especially for the plant science community, and describe the issues that remain to be addressed when using ONT for plant genome sequencing.
Affiliation(s)
- Kathryn Dumschott
  - Institute for Biology I, BioSC, RWTH Aachen University, Aachen, Germany
  - IBG-4 Bioinformatics, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Maximilian H-W Schmidt
  - Institute for Biology I, BioSC, RWTH Aachen University, Aachen, Germany
  - IBG-4 Bioinformatics, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Harmeet Singh Chawla
  - Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
- Rod Snowdon
  - Department of Plant Breeding, Justus Liebig University Giessen, Giessen, Germany
- Björn Usadel
  - Institute for Biology I, BioSC, RWTH Aachen University, Aachen, Germany
  - IBG-4 Bioinformatics, CEPLAS, Forschungszentrum Jülich, Jülich, Germany
  - Institute for Biological Data Science, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
|