1
|
Madrigal G, Minhas BF, Catchen J. Klumpy: A tool to evaluate the integrity of long-read genome assemblies and illusive sequence motifs. Mol Ecol Resour 2025; 25:e13982. [PMID: 38800997 PMCID: PMC11646305 DOI: 10.1111/1755-0998.13982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024]
Abstract
The improvement and decreasing costs of third-generation sequencing technologies have widened the scope of biological questions researchers can address with de novo genome assemblies. With the increasing number of reference genomes, validating their integrity with minimal overhead is vital for establishing confident results in their applications. Here, we present Klumpy, a tool for detecting and visualizing both misassembled regions in a genome assembly and genetic elements (e.g. genes) of interest in a set of sequences. By leveraging the initial raw reads in combination with their respective genome assembly, we illustrate Klumpy's utility by investigating antifreeze glycoprotein (afgp) loci across two icefishes, by searching for a reported absent gene in the northern snakehead fish, and by scanning the reference genomes of a mudskipper and bumblebee for misassembled regions. In the two former cases, we were able to provide support for the noncanonical placement of an afgp locus in the icefishes and locate the missing snakehead gene. Furthermore, our genome scans were able to identify an unmappable locus in the mudskipper reference genome and a putative repetitive element shared among several species of bees.
Affiliation(s)
- Giovanni Madrigal
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana‐Champaign, Urbana, Illinois, USA
| | - Bushra Fazal Minhas
- Informatics Program, University of Illinois at Urbana‐Champaign, Urbana, Illinois, USA
| | - Julian Catchen
- Department of Evolution, Ecology, and Behavior, University of Illinois at Urbana‐Champaign, Urbana, Illinois, USA
- Informatics Program, University of Illinois at Urbana‐Champaign, Urbana, Illinois, USA
| |
|
2
|
Chen Q, Yang C, Zhang G, Wu D. GCI: a continuity inspector for complete genome assembly. Bioinformatics 2024; 40:btae633. [PMID: 39432569 PMCID: PMC11550331 DOI: 10.1093/bioinformatics/btae633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 10/08/2024] [Accepted: 10/18/2024] [Indexed: 10/23/2024]
Abstract
MOTIVATION: Recent advances in long-read sequencing technologies have significantly facilitated the production of high-quality genome assemblies. The telomere-to-telomere (T2T) gapless assembly has become the new gold standard of genome assembly efforts. Several recent efforts have claimed to produce T2T-level reference genomes. However, a universal standard is still missing to qualify a genome assembly as meeting the T2T standard. Traditional genome assembly assessment metrics (N50 and its derivatives) cannot differentiate between nearly T2T and truly T2T assemblies in continuity, either globally or locally. Additionally, these metrics are independent of raw reads, so they are easily inflated by artificial operations. Therefore, a gaplessness evaluation tool at single-nucleotide resolution to reflect true completeness is urgently needed in the era of complete genomes. RESULTS: Here, we present a tool called Genome Continuity Inspector (GCI), designed to assess genome assembly continuity at single-base resolution and to evaluate how close an assembly is to the T2T level. GCI utilizes multiple aligners to map long reads from various sequencing platforms back to the assembly. By incorporating curated mapping coverage of high-confidence read alignments, GCI identifies potential assembly issues. It also provides GCI scores that quantify overall assembly continuity at the whole-genome or chromosome scale. AVAILABILITY AND IMPLEMENTATION: The open-source GCI code is freely available on GitHub (https://github.com/yeeus/GCI) under the MIT license.
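The idea of using read-mapping coverage to flag potential assembly issues can be illustrated with a minimal Python sketch. This is not GCI's implementation: the function name `low_coverage_regions`, the `min_depth` threshold, and the toy depth values are assumptions made for illustration only. The sketch takes a per-base depth list and reports half-open intervals where depth falls below the threshold.

```python
def low_coverage_regions(depth, min_depth=1):
    """Return half-open (start, end) intervals where depth < min_depth.

    `depth` is a hypothetical per-base coverage list for one sequence;
    a real workflow would derive it from filtered read alignments.
    """
    regions = []
    start = None  # start of the current low-coverage run, if any
    for pos, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = pos
        elif start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        # The run extends to the end of the sequence.
        regions.append((start, len(depth)))
    return regions


toy_depth = [5, 4, 0, 0, 3, 0]
print(low_coverage_regions(toy_depth))
```

Raising `min_depth` merges nearby dips into longer candidate regions, which is one simple way to trade sensitivity against the number of intervals that need inspection.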
Affiliation(s)
- Quanyu Chen
- International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 322000, China
- Center for Evolutionary & Organismal Biology, Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
- Chu Kochen Honors College, Zhejiang University, Hangzhou 310058, China
| | - Chentao Yang
- BGI Research, Shenzhen 518083, China
- BGI Research, Wuhan 430074, China
| | - Guojie Zhang
- International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 322000, China
- Center for Evolutionary & Organismal Biology, Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
- Women’s Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China
| | - Dongya Wu
- Center for Evolutionary & Organismal Biology, Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
| |
|
3
|
Jiang Z, Peng Z, Wei Z, Sun J, Luo Y, Bie L, Zhang G, Wang Y. A deep learning-based method enables the automatic and accurate assembly of chromosome-level genomes. Nucleic Acids Res 2024; 52:e92. [PMID: 39287126 PMCID: PMC11514472 DOI: 10.1093/nar/gkae789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 08/25/2024] [Accepted: 08/30/2024] [Indexed: 09/19/2024]
Abstract
The application of high-throughput chromosome conformation capture (Hi-C) technology enables the construction of chromosome-level assemblies. However, the correction of errors and the anchoring of sequences to chromosomes in the assembly remain significant challenges. In this study, we developed a deep learning-based method, AutoHiC, to address the challenges in chromosome-level genome assembly by enhancing contiguity and accuracy. Conventional Hi-C-aided scaffolding often requires manual refinement, but AutoHiC instead utilizes Hi-C data for automated workflows and iterative error correction. When trained on data from 300+ species, AutoHiC demonstrated a robust average error detection accuracy exceeding 90%. The benchmarking results confirmed its significant impact on genome contiguity and error correction. The innovative approach and comprehensive results of AutoHiC constitute a breakthrough in automated error detection, promising more accurate genome assemblies for advancing genomics research.
Affiliation(s)
- Zijie Jiang
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Zhixiang Peng
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Zhaoyuan Wei
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Jiahe Sun
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Yongjiang Luo
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Lingzi Bie
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Guoqing Zhang
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| | - Yi Wang
- Integrative Science Center of Germplasm Creation in Western China (CHONGQING) Science City, Biological Science Research Center, Southwest University, Chongqing, China
| |
|
4
|
Hjelmen CE. Genome size and chromosome number are critical metrics for accurate genome assembly assessment in Eukaryota. Genetics 2024; 227:iyae099. [PMID: 38869251 DOI: 10.1093/genetics/iyae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Revised: 04/02/2024] [Accepted: 06/06/2024] [Indexed: 06/14/2024]
Abstract
The number of genome assemblies has rapidly increased in recent history, with NCBI databases reaching over 41,000 eukaryotic genome assemblies across about 2,300 species. Increases in read length and improvements in assembly algorithms have led to increased contiguity and larger genome assemblies. While this number of assemblies is impressive, only about a third of these assemblies have corresponding genome size estimations for their respective species on publicly available databases. In this paper, genome assemblies are assessed regarding their total size compared to their respective publicly available genome size estimations. These deviations in size are assessed in relation to genome size, kingdom, sequencing platform, and standard assembly metrics, such as N50 and BUSCO values. A large proportion of assemblies deviate from their estimated genome size by more than 10%, with deviations increasing with genome size, suggesting non-protein-coding and structural DNA may be to blame. Modest differences in performance of sequencing platforms are noted as well. While standard metrics of genome assessment are more likely to indicate an assembly approaching the estimated genome size, much of the variation in this deviation in size is not explained by these raw metrics. A new, proportional N50 metric is proposed, in which N50 values are made relative to the average chromosome size of each species. This new metric has a stronger relationship with complete genome assemblies and, due to its proportional nature, allows for a more direct comparison across assemblies for genomes with variation in sizes and architectures.
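The proportional N50 idea described in this abstract can be sketched in a few lines of Python. The sketch computes a standard N50 and then divides it by an average chromosome size. The function names, the choice of genome size divided by haploid chromosome number as the average chromosome size, and the toy numbers are assumptions for illustration; the paper's exact definition may differ.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def proportional_n50(lengths, genome_size, chromosome_number):
    """Scale N50 by the average chromosome size (an assumed definition).

    Average chromosome size is taken here as genome_size divided by the
    haploid chromosome number.
    """
    average_chromosome_size = genome_size / chromosome_number
    return n50(lengths) / average_chromosome_size


# Toy assembly: five contigs in arbitrary length units.
contigs = [10, 8, 6, 4, 2]
print(n50(contigs))
print(proportional_n50(contigs, genome_size=30, chromosome_number=3))
```

Under this assumed definition, a value near 1 would mean the N50 is about the size of an average chromosome, which makes assemblies of species with very different genome sizes easier to compare than raw N50 values.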
Affiliation(s)
- Carl E Hjelmen
- Department of Biology, Utah Valley University, 800 W. University Parkway, Orem, UT 84058, USA
| |
|
5
|
Gao Z, Lu Y, Chong Y, Li M, Hong J, Wu J, Wu D, Xi D, Deng W. Beef Cattle Genome Project: Advances in Genome Sequencing, Assembly, and Functional Genes Discovery. Int J Mol Sci 2024; 25:7147. [PMID: 39000250 PMCID: PMC11240973 DOI: 10.3390/ijms25137147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 06/23/2024] [Accepted: 06/26/2024] [Indexed: 07/16/2024]
Abstract
Beef is a major global source of protein, playing an essential role in the human diet. The worldwide production and consumption of beef continue to rise, reflecting a significant trend. However, despite the critical importance of beef cattle resources in agriculture, the diversity of cattle breeds faces severe challenges, with many breeds at risk of extinction. The initiation of the Beef Cattle Genome Project is crucial. By constructing a high-precision functional annotation map of their genome, it becomes possible to analyze the genetic mechanisms underlying important traits in beef cattle, laying a solid foundation for breeding more efficient and productive cattle breeds. This review details advances in genome sequencing and assembly technologies, iterative upgrades of the beef cattle reference genome, and its application in pan-genome research. Additionally, it summarizes relevant studies on the discovery of functional genes associated with key traits in beef cattle, such as growth, meat quality, reproduction, polled traits, disease resistance, and environmental adaptability. Finally, the review explores the potential of telomere-to-telomere (T2T) genome assembly, structural variations (SVs), and multi-omics techniques in future beef cattle genetic breeding. These advancements collectively offer promising avenues for enhancing beef cattle breeding and improving genetic traits.
Affiliation(s)
- Zhendong Gao
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Ying Lu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Yuqing Chong
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Mengfei Li
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Jieyun Hong
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Jiao Wu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Dongwang Wu
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Dongmei Xi
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
| | - Weidong Deng
- Yunnan Provincial Key Laboratory of Animal Nutrition and Feed, Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, China
- State Key Laboratory for Conservation and Utilization of Bio-Resource in Yunnan, Kunming 650201, China
| |
|
6
|
Wang S, Lu L, Xu M, Jiang J, Wang X, Zheng Y, Liang Y, Zhang T, Qin M, Zhu P, Xu L, Jiang Y. Near-complete de novo genome assemblies of tomato (Solanum lycopersicum) determinate cultivars Micro-Tom and M82. J Genet Genomics 2024:S1673-8527(24)00144-9. [PMID: 38897428 DOI: 10.1016/j.jgg.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 06/09/2024] [Accepted: 06/11/2024] [Indexed: 06/21/2024]
Affiliation(s)
- Shuangshuang Wang
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Lei Lu
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Min Xu
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Jian Jiang
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Xiaofeng Wang
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Yao Zheng
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Yitao Liang
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Tianqi Zhang
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Minghui Qin
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Pinkuan Zhu
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Ling Xu
- School of Life Sciences, East China Normal University, Shanghai 200241, China
| | - Yina Jiang
- School of Life Sciences, East China Normal University, Shanghai 200241, China.
| |
|
7
|
Ilík V, Schwarz EM, Nosková E, Pafčo B. Hookworm genomics: dusk or dawn? Trends Parasitol 2024; 40:452-465. [PMID: 38677925 DOI: 10.1016/j.pt.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 03/28/2024] [Accepted: 04/04/2024] [Indexed: 04/29/2024]
Abstract
Hookworms are parasites, closely related to the model nematode Caenorhabditis elegans, that are a major economic and health burden worldwide. Primarily three hookworm species (Necator americanus, Ancylostoma duodenale, and Ancylostoma ceylanicum) infect humans. Another 100 hookworm species from 19 genera infect primates, ruminants, and carnivores. Genetic data exist for only seven of these species. Genome sequences are available from only four of these species in two genera, leaving 96 others (particularly those parasitizing wildlife) without any genomic data. The most recent hookworm genomes were published 5 years ago, leaving the field in a dusk. However, assembling genomes from single hookworms may bring a new dawn. Here we summarize advances, challenges, and opportunities for studying these neglected but important parasitic nematodes.
Affiliation(s)
- Vladislav Ilík
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic; Department of Botany and Zoology, Faculty of Science, Masaryk University, Brno, Czech Republic.
| | - Erich M Schwarz
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
| | - Eva Nosková
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic; Department of Botany and Zoology, Faculty of Science, Masaryk University, Brno, Czech Republic
| | - Barbora Pafčo
- Institute of Vertebrate Biology, Czech Academy of Sciences, Brno, Czech Republic.
| |
|
8
|
Tao L, Guo S, Xiong Z, Zhang R, Sun W. Chromosome-level genome assembly of the threatened resource plant Cinnamomum chago. Sci Data 2024; 11:447. [PMID: 38702363 PMCID: PMC11068913 DOI: 10.1038/s41597-024-03293-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 04/22/2024] [Indexed: 05/06/2024]
Abstract
Cinnamomum chago is a tree species endemic to Yunnan province, China, with potential economic value, phylogenetic importance, and conservation priority. We assembled the genome of C. chago using multiple sequencing technologies, resulting in a high-quality, chromosomal-level genome with annotation information. The assembled genome size is approximately 1.06 Gb, with a contig N50 length of 92.10 Mb. About 99.92% of the assembled sequences could be anchored to 12 pseudo-chromosomes, with only one gap, and 63.73% of the assembled genome consists of repeat sequences. In total, 30,497 genes were recognized according to annotation, including 28,681 protein-coding genes. This high-quality chromosome-level assembly and annotation of C. chago will assist us in the conservation and utilization of this valuable resource, while also providing crucial data for studying the evolutionary relationships within the Cinnamomum genus, offering opportunities for further research and exploration of its diverse applications.
Affiliation(s)
- Lidan Tao
- Yunnan Key Laboratory for integrative conservation of Plant Species with extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Shiwei Guo
- Yunnan Key Laboratory for integrative conservation of Plant Species with extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Zizhu Xiong
- Yunnan Key Laboratory for integrative conservation of Plant Species with extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Rengang Zhang
- Yunnan Key Laboratory for integrative conservation of Plant Species with extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China
- University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Weibang Sun
- Yunnan Key Laboratory for integrative conservation of Plant Species with extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China.
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China.
- Kunming Botanic Garden, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China.
| |
|
9
|
Manuzzi A, Aguirre-Sarabia I, Díaz-Arce N, Bekkevold D, Jansen T, Gomez-Garrido J, Alioto TS, Gut M, Castonguay M, Sanchez-Maroño S, Álvarez P, Rodriguez-Ezpeleta N. Atlantic mackerel population structure does not support genetically distinct spawning components. Open Research Europe 2024; 4:82. [PMID: 39524113 PMCID: PMC11544206 DOI: 10.12688/openreseurope.17365.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/22/2024] [Indexed: 11/16/2024]
Abstract
Background: The Atlantic mackerel, Scomber scombrus (Linnaeus, 1758), is a commercially valuable migratory pelagic fish inhabiting the northern Atlantic Ocean and the Mediterranean Sea. Given its highly migratory behaviour for feeding and spawning, several studies have been conducted to assess differentiation among spawning components to better define management units, as well as to investigate possible adaptations to comprehend and predict recent range expansion northwards. Methods: Here, a high-quality genome of S. scombrus was sequenced and annotated, as an increasing number of population genetic studies have proven the relevance of reference genomes to investigate genomic markers/regions potentially linked to differences at finer scale. This reference genome was used to map restriction-site-associated DNA sequencing (RAD-seq) reads for SNP discovery and genotyping in more than 500 samples distributed along the species range. The resulting genotyping tables have been used to perform connectivity and adaptation analyses. Results: The assembly of the reference genome for S. scombrus resulted in a high-quality genome of 741 Mb. Our population genetic results show that the Atlantic mackerel consists of three previously known genetically isolated units (Northwest Atlantic, Northeast Atlantic, Mediterranean), and provide no evidence for genetically distinct spawning components within the Northwest or Northeast Atlantic. Conclusions: Therefore, our findings resolved previous uncertainties by confirming the absence of genetically isolated spawning components on each side of the northern Atlantic, thus rejecting homing behaviour and the need to redefine management boundaries in this species. In addition, no further genetic signs of ongoing adaptation were detected in this species.
Affiliation(s)
- Alice Manuzzi
- AZTI, Marine Research, Basque Research and Technology Alliance (BRTA), Sukarrieta, Spain
| | - Imanol Aguirre-Sarabia
- AZTI, Marine Research, Basque Research and Technology Alliance (BRTA), Sukarrieta, Spain
| | - Natalia Díaz-Arce
- AZTI, Marine Research, Basque Research and Technology Alliance (BRTA), Sukarrieta, Spain
| | - Dorte Bekkevold
- DTU Aqua, National Institute of Aquatic Resources, Section for Marine Living Resources, Silkeborg, Denmark
| | - Teunis Jansen
- DTU Aqua, National Institute of Aquatic Resources, Section for Marine Living Resources, Silkeborg, Denmark
- GINR, Greenland Institute of Natural Resources, Nuuk, Greenland
| | - Jessica Gomez-Garrido
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- Universitat de Barcelona (UB), Barcelona, Spain
| | - Tyler S. Alioto
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- Universitat de Barcelona (UB), Barcelona, Spain
| | - Marta Gut
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- Universitat de Barcelona (UB), Barcelona, Spain
| | - Martin Castonguay
- Maurice Lamontagne Institute, Fisheries and Oceans Canada, Mont-Joli, Canada
| | - Sonia Sanchez-Maroño
- AZTI, Marine Research, Basque Research and Technology Alliance (BRTA), Sukarrieta, Spain
| | - Paula Álvarez
- AZTI, Marine Research, Basque Research and Technology Alliance (BRTA), Sukarrieta, Spain
| | | |
|
10
|
Han L, Luo X, Zhao Y, Li N, Xu Y, Ma K. A haplotype-resolved genome provides insight into allele-specific expression in wild walnut (Juglans regia L.). Sci Data 2024; 11:278. [PMID: 38459062 PMCID: PMC10923786 DOI: 10.1038/s41597-024-03096-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 02/27/2024] [Indexed: 03/10/2024]
Abstract
Wild germplasm resources are crucial for gene mining and molecular breeding because of their special trait performance. A haplotype-resolved genome is an ideal solution for fully understanding the biology of subgenomes in highly heterozygous species. Here, we surveyed the genome of a wild walnut tree from Gongliu County, Xinjiang, China, and generated a haplotype-resolved reference genome of 562.99 Mb (contig N50 = 34.10 Mb) for one haplotype (hap1) and 561.07 Mb (contig N50 = 33.91 Mb) for the other haplotype (hap2) using PacBio high-fidelity (HiFi) reads and Hi-C technology. Approximately 527.20 Mb (93.64%) of hap1 and 526.40 Mb (93.82%) of hap2 were assigned to 16 pseudochromosomes. A total of 41,039 and 39,744 protein-coding gene models were predicted for hap1 and hap2, respectively. Moreover, 123 structural variations (SVs) were identified between the two haplotype genomes. Allele-specific expression genes (ASEGs) that respond to cold stress were ultimately identified. These datasets can be used to study subgenome evolution, to mine functional elite genes, and to discover the transcriptional basis of specific traits related to environmental adaptation in wild walnut.
Affiliation(s)
- Liqun Han
- Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, the State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Urumqi, China
| | - Xiang Luo
- College of Agriculture, Henan University, Zhengzhou, China
| | - Yu Zhao
- Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, the State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Urumqi, China
| | - Ning Li
- Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, the State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Urumqi, China
| | - Yuhui Xu
- Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, the State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Urumqi, China.
| | - Kai Ma
- Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, the State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Urumqi, China.
| |
|
11
|
Roell MS, Ott MC, Mair MM, Pamminger T. Missing Genomic Resources for the Next Generation of Environmental Risk Assessment. Environmental Science & Technology 2024; 58:1877-1881. [PMID: 38245867 PMCID: PMC10832041 DOI: 10.1021/acs.est.3c08701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/13/2023] [Accepted: 12/13/2023] [Indexed: 01/22/2024]
Abstract
Environmental risk assessment traditionally relies on a wide range of in vivo testing to assess the potential hazards of chemicals in the environment. These tests are often time-consuming and costly and can cause test organisms' suffering. Recent developments of reliable low-cost alternatives, both in vivo- and in silico-based, opened the door to reconsider current toxicity assessment. However, many of these new approach methodologies (NAMs) rely on high-quality annotated genomes for surrogate species of regulatory risk assessment. Currently, a lack of genomic information slows the process of NAM development. Here, we present a phylogenetically resolved overview of missing genomic resources for surrogate species within a regulatory ecotoxicological risk assessment. We call for an organized and systematic effort within the (regulatory) ecotoxicological community to provide these missing genomic resources. Further, we discuss the potential of a standardized genomic surrogate species landscape to enable a robust and nonanimal-reliant ecotoxicological risk assessment in the systems ecotoxicology era.
Affiliation(s)
- Marc-Sven Roell
- R&D Bayer AG, Crop Science Division, Monheim am Rhein 40789, Germany
| | | | - Magdalena M. Mair
- Bayreuth Center for Ecology and Environmental Research (BayCEER), Bayreuth 95447, Germany
- Statistical Ecotoxicology, University of Bayreuth, Bayreuth 95447, Germany
| | - Tobias Pamminger
- R&D Bayer AG, Crop Science Division, Monheim am Rhein 40789, Germany
| |
|
12
|
Lan L, Zhao H, Xu S, Kan S, Zhang X, Liu W, Liao X, Tembrock LR, Ren Y, Reeve W, Yang J, Wu Z. A high-quality Bougainvillea genome provides new insights into evolutionary history and pigment biosynthetic pathways in the Caryophyllales. Horticulture Research 2023; 10:uhad124. [PMID: 37554346 PMCID: PMC10405137 DOI: 10.1093/hr/uhad124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 06/05/2023] [Indexed: 08/10/2023]
Abstract
Bougainvillea is a perennial ornamental shrub that is highly regarded in ornamental horticulture around the world. However, the absence of genome data limits our understanding of the pathways involved in bract coloration and breeding. Here, we report a chromosome-level assembly of the giga-genome of Bougainvillea × buttiana 'Mrs Butt', a cultivar thought to be the origin of many other Bougainvillea cultivars. The assembled genome is ~5 Gb with a scaffold N50 of 151 756 278 bp and contains 86 572 genes which have undergone recent whole-genome duplication. We confirmed that multiple rounds of whole-genome multiplication have occurred in the evolutionary history of the Caryophyllales, reconstructed the relationship in the Caryophyllales at whole genome level, and found discordance between species and gene trees as the result of complex introgression events. We investigated betalain and anthocyanin biosynthetic pathways and found instances of independent evolutionary innovations in the nine different Caryophyllales species. To explore the potential formation mechanism of diverse bract colors in Bougainvillea, we analyzed the genes involved in betalain and anthocyanin biosynthesis and found extremely low expression of ANS and DFR genes in all cultivars, which may limit anthocyanin biosynthesis. Our findings indicate that the expression pattern of the betalain biosynthetic pathway did not directly correlate with bract color, and a higher expression level in the betalain biosynthetic pathway is required for colored bracts. This improved understanding of the correlation between gene expression and bract color allows plant breeding outcomes to be predicted with greater certainty.
Affiliation(s)
- Lan Lan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- School of Medical, Molecular and Forensic Sciences, Murdoch University, 6150, Western Australia, 90 South Street, Murdoch, Australia
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Huiqi Zhao
- Sanya Institute, Hainan Academy of Agricultural Sciences, Sanya, 572025, China
- Institute of Tropical Horticulture Research, Hainan Academy of Agricultural Sciences, Haikou, 571100, China
- Suxia Xu
- Fujian Key Laboratory of Subtropical Plant Physiology & Biochemistry, Fujian Institute of Subtropical Botany, Xiamen, 361006, China
- Shenglong Kan
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Xiaoni Zhang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Weichao Liu
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Key Laboratory of Horticultural Plant Biology, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, 430070, China
- Xuezhu Liao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Luke R Tembrock
- Department of Agricultural Biology, Colorado State University, Fort Collins, CO, 80523, USA
- Yonglin Ren
- School of Medical, Molecular and Forensic Sciences, Murdoch University, 6150, Western Australia, 90 South Street, Murdoch, Australia
- Wayne Reeve
- School of Medical, Molecular and Forensic Sciences, Murdoch University, 6150, Western Australia, 90 South Street, Murdoch, Australia
- Jun Yang
- Sanya Institute, Hainan Academy of Agricultural Sciences, Sanya, 572025, China
- Institute of Tropical Horticulture Research, Hainan Academy of Agricultural Sciences, Haikou, 571100, China
- Zhiqiang Wu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
|