1
Madrigal G, Minhas BF, Catchen J. Klumpy: A tool to evaluate the integrity of long-read genome assemblies and illusive sequence motifs. Mol Ecol Resour 2025; 25:e13982. [PMID: 38800997 PMCID: PMC11646305 DOI: 10.1111/1755-0998.13982] [Received: 03/28/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024]
Abstract
The improvement and decreasing costs of third-generation sequencing technologies have widened the scope of biological questions researchers can address with de novo genome assemblies. With the increasing number of reference genomes, validating their integrity with minimal overhead is vital for establishing confident results in their applications. Here, we present Klumpy, a tool for detecting and visualizing both misassembled regions in a genome assembly and genetic elements (e.g. genes) of interest in a set of sequences. By leveraging the initial raw reads in combination with their respective genome assembly, we illustrate Klumpy's utility by investigating antifreeze glycoprotein (afgp) loci across two icefishes, by searching for a reported absent gene in the northern snakehead fish, and by scanning the reference genomes of a mudskipper and bumblebee for misassembled regions. In the two former cases, we were able to provide support for the noncanonical placement of an afgp locus in the icefishes and locate the missing snakehead gene. Furthermore, our genome scans were able to identify an unmappable locus in the mudskipper reference genome and a putative repetitive element shared among several species of bees.
Affiliation(s)
- Giovanni Madrigal
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Bushra Fazal Minhas
- Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Julian Catchen
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Informatics Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
2
Chen P, Lian JY, Wu B, Cao HL, Li ZH, Wang ZF. Draft genome of Castanopsis chinensis, a dominant species safeguarding biodiversity in subtropical broadleaved evergreen forests. BMC Genom Data 2023; 24:78. [PMID: 38097945 PMCID: PMC10722680 DOI: 10.1186/s12863-023-01183-w] [Received: 11/02/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023]
Abstract
OBJECTIVES: Castanopsis is the third largest genus in the Fagaceae family and is essentially tropical or subtropical in origin. The species in this genus are mainly canopy-dominant trees and, as key components of evergreen broadleaved forests, play a crucial role in the maintenance of local biodiversity. Castanopsis chinensis, distributed from South China to Vietnam, is a representative species. It currently suffers heavy disturbance from human activity and climate change. Here, we present its assembled genome to facilitate its preliminary conservation and breeding at the genome level.
DATA DESCRIPTION: The C. chinensis genome was assembled and annotated using Nanopore and MGI whole-genome sequencing and RNA-seq reads from leaf tissues. The assembly was 888,699,661 bp in length, consisting of 133 contigs with a contig N50 of 23,395,510 bp. A completeness assessment of the assembly with Benchmarking Universal Single-Copy Orthologs (BUSCO) indicated a score of 98.3%. Repetitive elements comprised 471,006,885 bp, accounting for 55.9% of the assembled sequences. A total of 51,406 genes that coded for 54,310 proteins were predicted. Multiple databases were used to functionally annotate the protein sequences.
Affiliation(s)
- Pan Chen
- Guangdong Forestry Survey and Planning Institute, Guangzhou, 510520, China
- Ju-Yu Lian
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
- Bin Wu
- Guangdong Forestry Survey and Planning Institute, Guangzhou, 510520, China
- Hong-Lin Cao
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
- Zhi-Hong Li
- Guangdong Forestry Survey and Planning Institute, Guangzhou, 510520, China
- Zheng-Feng Wang
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
3
Procko C, Chory J, Pirro S. The Genome Sequences of 17 Species of Carnivorous Plants. Biodiversity Genomes 2023; 2023:10.56179/001c.90164. [PMID: 37990687 PMCID: PMC10662931 DOI: 10.56179/001c.90164] [Indexed: 11/23/2023]
Abstract
We present the genome sequences of 17 species of carnivorous plants. Illumina sequencing was performed on genetic material from cultivated individuals. The reads were assembled using a de novo method followed by a finishing step. The raw and assembled data are available via Genbank.
Affiliation(s)
- Carl Procko
- Plant Biology Laboratory, Salk Institute for Biological Studies
- Joanne Chory
- Plant Biology Laboratory, Salk Institute for Biological Studies
- Howard Hughes Medical Institute
4
Fleck SJ, Jobson RW. Molecular Phylogenomics Reveals the Deep Evolutionary History of Carnivory across Land Plants. Plants (Basel, Switzerland) 2023; 12:3356. [PMID: 37836100 PMCID: PMC10574757 DOI: 10.3390/plants12193356] [Received: 08/10/2023] [Revised: 09/18/2023] [Accepted: 09/18/2023] [Indexed: 10/15/2023]
Abstract
Plastid molecular phylogenies that broadly sampled angiosperm lineages imply that carnivorous plants evolved at least 11 times independently in 13 families and 6 orders. Within and between these clades, the different prey capture strategies involving flypaper and pitfall structures arose in parallel with the subsequent evolution of snap traps and suction bladders. Attempts to discern the deep ontological history of carnivorous structures using multigene phylogenies have provided a plastid-level picture of sister relationships at the family level. Here, we present a molecular phylogeny of the angiosperms based on nuclear target sequence capture data (Angiosperms-353 probe set), assembled by the Kew Plant Trees of Life initiative, which aims to complete the tree of life for plants. This phylogeny encompasses all carnivorous and protocarnivorous families, although certain genera such as Philcoxia (Plantaginaceae) are excluded. This study offers a novel nuclear gene-based overview of relationships within and between carnivorous families and genera. Consistent with previous broadly sampled studies, we found that most carnivorous families are not affiliated with any single family. Instead, they emerge as sister groups to large clades comprising multiple non-carnivorous families. Additionally, we explore recent genomic studies across various carnivorous clades that examine the evolution of the carnivorous syndrome in relation to whole-genome duplication, subgenome dominance, small-scale gene duplication, and convergent evolution. Furthermore, we discuss insights into genome size evolution through the lens of carnivorous plant genomes.
Affiliation(s)
- Steven J. Fleck
- Department of Biological Sciences, University at Buffalo, Buffalo, NY 14260, USA
- Richard W. Jobson
- National Herbarium of New South Wales, Botanic Gardens of Sydney, Locked Bag 6002, Mount Annan, NSW 2567, Australia