1
Garg V, Bohra A, Mascher M, Spannagl M, Xu X, Bevan MW, Bennetzen JL, Varshney RK. Unlocking plant genetics with telomere-to-telomere genome assemblies. Nat Genet 2024:10.1038/s41588-024-01830-7. [PMID: 39048791 DOI: 10.1038/s41588-024-01830-7] [Received: 02/18/2024] [Accepted: 06/12/2024] [Indexed: 07/27/2024]
Abstract
Contiguous genome sequence assemblies will help us to realize the full potential of crop translational genomics. Recent advances in sequencing technologies, especially long-read sequencing strategies, have made it possible to construct gapless telomere-to-telomere (T2T) assemblies, thus offering novel insights into genome organization and function. Plant genomes pose unique challenges, such as a continuum of ancient to recent polyploidy and abundant highly similar and long repetitive elements. Owing to progress in sequencing approaches, for most crop plants, chromosome-scale reference genome assemblies are available, but T2T assembly construction remains challenging. Here we describe methods for haplotype-resolved, gapless T2T assembly construction in plants, including various crop species. We outline the impact of T2T assemblies in elucidating the roles of repetitive elements in gene regulation, as well as in pangenomics, functional genomics, genome-assisted breeding and targeted genome manipulation. In conjunction with sequence-enriched germplasm repositories, T2T assemblies thus hold great promise for basic and applied plant sciences.
Affiliation(s)
- Vanika Garg
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- Abhishek Bohra
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- ICAR-Indian Institute of Pulses Research, Kanpur, India
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Seeland, Germany
- Manuel Spannagl
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- Plant Genome and Systems Biology, German Research Center for Environmental Health, Helmholtz Zentrum München, Neuherberg, Germany
- Xun Xu
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
- BGI-Shenzhen, Shenzhen, China
- Rajeev K Varshney
- WA State Agricultural Biotechnology Centre, Centre for Crop and Food Innovation, Food Futures Institute, Murdoch University, Murdoch, Western Australia, Australia
2
Goel M, Campoy JA, Krause K, Baus LC, Sahu A, Sun H, Walkemeier B, Marek M, Beaudry R, Ruiz D, Huettel B, Schneeberger K. The vast majority of somatic mutations in plants are layer-specific. Genome Biol 2024; 25:194. [PMID: 39049052 PMCID: PMC11267851 DOI: 10.1186/s13059-024-03337-0] [Received: 01/26/2024] [Accepted: 07/15/2024] [Indexed: 07/27/2024]
Abstract
BACKGROUND: Plant meristems are structured organs consisting of distinct layers of stem cells, which differentiate into new plant tissue. Mutations in meristematic layers can propagate into large sectors of the plant. However, the characteristics of meristematic mutations remain unclear, limiting our understanding of the genetic basis of somaclonal phenotypic variation. RESULTS: Here, we analyse the frequency and distribution of somatic mutations in an apricot tree. We separately sequence the epidermis (developing from meristem layer 1) and the flesh (developing from meristem layer 2) of several fruits sampled across the entire tree. We find that most somatic mutations (>90%) are specific to individual layers. Interestingly, layer 1 shows a higher mutation load than layer 2, implying different mutational dynamics between the layers. The distribution of somatic mutations follows the branching of the tree. This suggests that somatic mutations are propagated to developing branches through axillary meristems. In turn, this leads us to the unexpected observation that the genomes of layer 1 of distant branches are more similar to each other than to the genomes of layer 2 of the same branches. Finally, using single-cell RNA sequencing, we demonstrate that layer-specific mutations were only transcribed in the cells of the respective layers and can form the genetic basis of somaclonal phenotypic variation. CONCLUSIONS: Here, we analyse the frequency and distribution of somatic mutations with meristematic origin. Our observations on the layer specificity of somatic mutations outline how they are distributed, how they propagate, and how they can impact clonally propagated crops.
Affiliation(s)
- Manish Goel
- Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- José A Campoy
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Department of Pomology, Estación Experimental de Aula Dei (EEAD), CSIC, Saragossa, 50059, Spain
- Kristin Krause
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Present address: Illumina Solutions Center Berlin, Berlin, Germany
- Lisa C Baus
- Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Anshupa Sahu
- Institute for Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Bonn, Germany
- Institute for Genomic Statistics and Bioinformatics, University Hospital Bonn, Bonn, Germany
- Hequan Sun
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Present address: Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- Birgit Walkemeier
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Randy Beaudry
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- David Ruiz
- Department of Plant Breeding, CEBAS-CSIC, P.O. Box 164, Espinardo, Murcia, 30100, Spain
- Korbinian Schneeberger
- Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- CEPLAS (Cluster of Excellence on Plant Sciences), Heinrich-Heine University, Düsseldorf, Germany
3
Bredemeyer KR, vonHoldt BM, Foley NM, Childers IR, Brzeski KE, Murphy WJ. The value of hybrid genomes: Building two highly contiguous reference genome assemblies to advance Canis genomic studies. J Hered 2024; 115:480-486. [PMID: 38416051 DOI: 10.1093/jhered/esae013] [Received: 01/02/2024] [Revised: 02/21/2024] [Accepted: 02/27/2024] [Indexed: 02/29/2024]
Abstract
Previous studies of canid population and evolutionary genetics have relied on high-quality domestic dog reference genomes that have been produced primarily for biomedical and trait mapping studies in dog breeds. However, the absence of highly contiguous genomes from other Canis species like the gray wolf and coyote, which represent additional distinct demographic histories, may bias inferences regarding interspecific genetic diversity and phylogenetic relationships. Here, we present single-haplotype de novo genome assemblies for the gray wolf and coyote, generated by applying the trio-binning approach to long sequence reads from the genome of a female first-generation hybrid produced from a gray wolf and coyote mating. The assemblies were highly contiguous, with contig N50 sizes of 44.6 and 42.0 Mb for the wolf and coyote, respectively. Genome scaffolding and alignments between the two Canis assemblies and published dog reference genomes showed near-complete collinearity, with one exception: a coyote-specific fission of chromosome 13 and fusion of the proximal portion of that chromosome with chromosome 8, retaining the Canis-typical diploid chromosome number of 2n = 78. We evaluated mapping quality for previous RADseq data from 334 canids and found nearly identical mapping quality and patterns among canid species and regional populations regardless of the genome used for alignment (dog, coyote, or gray wolf). These novel wolf and coyote reference genome assemblies will be important resources for accurate inference of Canis demography, taxonomic evaluation, and conservation genetics.
Affiliation(s)
- Kevin R Bredemeyer
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, United States
- Bridgett M vonHoldt
- Department of Ecology & Evolutionary Biology, Princeton University, Princeton, NJ, United States
- Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
- Isabella R Childers
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, United States
- Kristin E Brzeski
- College of Forest Resources and Environment Science, Michigan Technological University, Houghton, MI, United States
- William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, United States
4
Nguyen AK, Blacksmith MS, Kidd JM. Duplications and Retrogenes Are Numerous and Widespread in Modern Canine Genomic Assemblies. Genome Biol Evol 2024; 16:evae142. [PMID: 38946312 PMCID: PMC11259980 DOI: 10.1093/gbe/evae142] [Received: 11/01/2023] [Revised: 05/08/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]
Abstract
Recent years have seen a dramatic increase in the number of canine genome assemblies available. Duplications are an important source of evolutionary novelty and are also prone to misassembly. We explored the duplication content of nine canine genome assemblies using both genome self-alignment and read-depth approaches. We find that 8.58% of the genome is duplicated in the canFam4 assembly, derived from the German Shepherd Dog Mischka, including 90.15% of unplaced contigs. Highlighting the continued difficulty in properly assembling duplications, less than half of read-depth and assembly alignment duplications overlap, but the mCanLor1.2 Greenland wolf assembly shows greater concordance. Further study shows the presence of multiple segments that have alignments to four or more duplicate copies. These high-recurrence duplications correspond to gene retrocopies. We identified 3,892 candidate retrocopies from 1,316 parental genes in the canFam4 assembly and find that ∼8.82% of duplicated base pairs involve a retrocopy, confirming this mechanism as a major driver of gene duplication in canines. Similar patterns are found across eight other recent canine genome assemblies, with metrics supporting a greater quality of the PacBio HiFi mCanLor1.2 assembly. Comparison between the wolf and other canine assemblies found that 92% of retrocopy insertions are shared between assemblies. By calculating the number of generations since genome divergence, we estimate that new retrocopy insertions appear, on average, in 1 out of 3,514 births. Our analyses illustrate the impact of retrogene formation on canine genomes and highlight the variable representation of duplicated sequences among recently completed canine assemblies.
Affiliation(s)
- Anthony K Nguyen
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Matthew S Blacksmith
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jeffrey M Kidd
- Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
5
Talenti A, Wilkinson T, Cook EA, Hemmink JD, Paxton E, Mutinda M, Ngulu SD, Jayaraman S, Bishop RP, Obara I, Hourlier T, Garcia Giron C, Martin FJ, Labuschagne M, Atimnedi P, Nanteza A, Keyyu JD, Mramba F, Caron A, Cornelis D, Chardonnet P, Fyumagwa R, Lembo T, Auty HK, Michaux J, Smitz N, Toye P, Robert C, Prendergast JGD, Morrison LJ. Continent-wide genomic analysis of the African buffalo (Syncerus caffer). Commun Biol 2024; 7:792. [PMID: 38951693 PMCID: PMC11217449 DOI: 10.1038/s42003-024-06481-2] [Received: 12/28/2022] [Accepted: 06/21/2024] [Indexed: 07/03/2024]
Abstract
The African buffalo (Syncerus caffer) is a wild bovid with a historical distribution across much of sub-Saharan Africa. Genomic analysis can provide insights into the evolutionary history of the species and the key selective pressures shaping populations, including assessment of population-level differentiation, population fragmentation, and population genetic structure. In this study we generated the highest-quality de novo genome assembly (2.65 Gb, scaffold N50 69.17 Mb) of African buffalo to date, and sequenced a further 195 genomes from across the species distribution. Principal component and admixture analyses provided little support for the currently described four subspecies. Estimating Effective Migration Surfaces analysis suggested that geographical barriers have played a significant role in shaping gene flow and the population structure. Estimated effective population sizes indicated a substantial drop occurring in all populations 5,000-10,000 years ago, coinciding with the increase in human populations. Finally, signatures of selection were enriched for key genes associated with the immune response, suggesting that infectious diseases exert a substantial selective pressure upon the African buffalo. These findings have important implications for understanding bovid evolution, buffalo conservation and population management.
Affiliation(s)
- Andrea Talenti
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
- Toby Wilkinson
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
- Elizabeth A Cook
- International Livestock Research Institute, P.O. Box 30709, Nairobi, 00100, Kenya
- Centre for Tropical Livestock Genetics and Health (CTLGH), ILRI Kenya, P.O. Box 30709, Nairobi, 00100, Kenya
- Johanneke D Hemmink
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
- International Livestock Research Institute, P.O. Box 30709, Nairobi, 00100, Kenya
- Centre for Tropical Livestock Genetics and Health (CTLGH), ILRI Kenya, P.O. Box 30709, Nairobi, 00100, Kenya
- Edith Paxton
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Matthew Mutinda
- Kenya Wildlife Service, P.O. Box 40241, Nairobi, 00100, Kenya
- Siddharth Jayaraman
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Richard P Bishop
- International Livestock Research Institute, P.O. Box 30709, Nairobi, 00100, Kenya
- Isaiah Obara
- Institute for Parasitology and Tropical Veterinary Medicine, Freie Universität Berlin, Robert-von-Ostertag-Str. 7-13, 14163, Berlin, Germany
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
- Anne Nanteza
- College of Veterinary Medicine, Animal Resources and Biosecurity, Makerere University, Kampala, Uganda
- Julius D Keyyu
- Tanzania Wildlife Research Institute, Box 661, Arusha, Tanzania
- Furaha Mramba
- Vector and Vector-Borne Diseases Institute, Tanga, Tanzania
- Alexandre Caron
- ASTRE, University of Montpellier (UMR), CIRAD, 34090, Montpellier, France
- CIRAD, UMR ASTRE, RP-PCP, Maputo, 01009, Mozambique
- Faculdade Veterinaria, Universidade Eduardo Mondlane, Maputo, Mozambique
- Daniel Cornelis
- CIRAD, Forêts et Sociétés, 34398, Montpellier, France
- Forêts et Sociétés, University of Montpellier, CIRAD, 34090, Montpellier, France
- Robert Fyumagwa
- Tanzania Wildlife Research Institute, Box 661, Arusha, Tanzania
- Tiziana Lembo
- School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Harriet K Auty
- School of Biodiversity, One Health and Veterinary Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
- Johan Michaux
- Laboratoire de Génétique de la Conservation, Institut de Botanique (Bat. 22), Université de Liège (Sart Tilman), Chemin de la Vallée 4, B4000, Liège, Belgium
- Nathalie Smitz
- Royal Museum for Central Africa (BopCo), Leuvensesteenweg 13, 3080, Tervuren, Belgium
- Philip Toye
- International Livestock Research Institute, P.O. Box 30709, Nairobi, 00100, Kenya
- Centre for Tropical Livestock Genetics and Health (CTLGH), ILRI Kenya, P.O. Box 30709, Nairobi, 00100, Kenya
- Christelle Robert
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
- Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Crewe Road South, Edinburgh, EH4 2XU, United Kingdom
- James G D Prendergast
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
- Liam J Morrison
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, United Kingdom
- Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Roslin, EH25 9RG, United Kingdom
6
Murphy WJ, Harris AJ. Toward telomere-to-telomere cat genomes for precision medicine and conservation biology. Genome Res 2024; 34:655-664. [PMID: 38849156 PMCID: PMC11216403 DOI: 10.1101/gr.278546.123] [Indexed: 06/09/2024]
Abstract
Genomic data from species of the cat family Felidae promise to stimulate veterinary and human medical advances, and clarify the coherence of genome organization. We describe how interspecies hybrids have been instrumental in the genetic analysis of cats, from the first genetic maps to propelling cat genomes toward the T2T standard set by the human genome project. Genotype-to-phenotype mapping in cat models has revealed dozens of health-related genetic variants, the molecular basis for mammalian pigmentation and patterning, and species-specific adaptations. Improved genomic surveillance of natural and captive populations across the cat family tree will increase our understanding of the genetic architecture of traits and of population dynamics, and will guide a future of genome-enabled biodiversity conservation.
Affiliation(s)
- William J Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77843-4458, USA
- Department of Biology, Texas A&M University, College Station, Texas 77843-4458, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, Texas 77843-4458, USA
- Andrew J Harris
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77843-4458, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, Texas 77843-4458, USA
7
Fu Y, Aganezov S, Mahmoud M, Beaulaurier J, Juul S, Treangen TJ, Sedlazeck FJ. MethPhaser: methylation-based long-read haplotype phasing of human genomes. Nat Commun 2024; 15:5327. [PMID: 38909018 PMCID: PMC11193733 DOI: 10.1038/s41467-024-49588-0] [Received: 09/12/2023] [Accepted: 06/11/2024] [Indexed: 06/24/2024]
Abstract
The assignment of variants across haplotypes, phasing, is crucial for predicting the consequences, interaction, and inheritance of mutations and is a key step in improving our understanding of phenotype and disease. However, phasing is limited by read length and stretches of homozygosity along the genome. To overcome this limitation, we designed MethPhaser, a method that utilizes methylation signals from Oxford Nanopore Technologies (ONT) sequencing to extend Single Nucleotide Variation (SNV)-based phasing. We demonstrate that haplotype-specific methylation is widespread in human genomes and that the advent of long-read technologies has enabled direct reporting of methylation signals. For ONT R9 and R10 cell line data, we increase the phase length N50 by 78%-151% at a phasing accuracy of 83.4-98.7%. To assess the impact of tissue purity and random methylation signals due to inactivation, we also applied MethPhaser to blood samples from four patients, still showing improvements over SNV-only phasing. MethPhaser further improves phasing across HLA and multiple other medically relevant genes, improving our understanding of how mutations interact across multiple phenotypes. The concept of MethPhaser can also be extended to non-human diploid genomes. MethPhaser is available at https://github.com/treangenlab/methphaser.
Affiliation(s)
- Yilei Fu
- Department of Computer Science, Rice University, Houston, TX, USA
- Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Sissel Juul
- Oxford Nanopore Technologies Inc, New York, NY, USA
- Todd J Treangen
- Department of Computer Science, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Fritz J Sedlazeck
- Department of Computer Science, Rice University, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
8
Henglin M, Ghareghani M, Harvey W, Porubsky D, Koren S, Eichler EE, Ebert P, Marschall T. Phasing Diploid Genome Assembly Graphs with Single-Cell Strand Sequencing. bioRxiv 2024:2024.02.15.580432. [PMID: 38529499 PMCID: PMC10962706 DOI: 10.1101/2024.02.15.580432] [Indexed: 03/27/2024]
Abstract
Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.
Affiliation(s)
- Mir Henglin
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Germany
- Maryam Ghareghani
- Department of Mathematics and Computer Science, Freie Universität Berlin, Germany
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- William Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Peter Ebert
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Germany
- Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Germany
9
Niu Y, Fan X, Yang Y, Li J, Lian J, Wang L, Zhang Y, Tang Y, Tang Z. Haplotype-resolved assembly of a pig genome using single-sperm sequencing. Commun Biol 2024; 7:738. [PMID: 38890535 PMCID: PMC11189477 DOI: 10.1038/s42003-024-06397-x] [Received: 10/07/2023] [Accepted: 05/29/2024] [Indexed: 06/20/2024]
Abstract
Single gamete cell sequencing together with long-read sequencing can reliably produce chromosome-level phased genomes. In this study, we employed PacBio HiFi and Hi-C sequencing on a male Landrace pig, coupled with single-sperm sequencing of 102 of its sperm cells. A haplotype assembly method was developed based on long-read sequencing and sperm-phased markers. The chromosome-level phased assembly showed higher phasing accuracy than methods that rely only on HiFi reads. The use of single-sperm sequencing data enabled the construction of a genetic map, successfully mapping the sperm motility trait to a specific region on chromosome 1 (105.40-110.70 Mb). Furthermore, with the assistance of Y chromosome-bearing sperm data, 26.16 Mb of Y chromosome sequence was assembled. We report a reliable approach for assembling chromosome-level phased genomes and reveal the potential of sperm populations in basic biology research and sperm phenotype research.
Affiliation(s)
- Yongchao Niu
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Foshan, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Agriculture Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Xinhao Fan
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Foshan, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Agriculture Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- GuangXi Engineering Centre for Resource Development of Bama Xiang Pig, Bama, China
- Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yalan Yang
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Foshan, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Agriculture Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Jiang Li
- Biozeron Shenzhen, Inc., Shenzhen, China
- Liu Wang
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Foshan, China
- Yongjin Zhang
- GuangXi Engineering Centre for Resource Development of Bama Xiang Pig, Bama, China
- Yijie Tang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Agriculture Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Zhonglin Tang
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences, Foshan, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Agriculture Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- GuangXi Engineering Centre for Resource Development of Bama Xiang Pig, Bama, China
- Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| |
|
10
|
Li Q, Qiao X, Li L, Gu C, Yin H, Qi K, Xie Z, Yang S, Zhao Q, Wang Z, Yang Y, Pan J, Li H, Wang J, Wang C, Rieseberg LH, Zhang S, Tao S. Haplotype-resolved T2T genome assemblies and pangenome graph of pear reveal diverse patterns of allele-specific expression and the genomic basis of fruit quality traits. PLANT COMMUNICATIONS 2024:101000. [PMID: 38859586 DOI: 10.1016/j.xplc.2024.101000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 05/15/2024] [Accepted: 06/07/2024] [Indexed: 06/12/2024]
Abstract
Hybrid crops often exhibit increased yield and greater resilience, yet the genomic mechanism(s) underlying hybrid vigor or heterosis remain unclear, hindering our ability to predict the expression of phenotypic traits in hybrid breeding. Here, we generated haplotype-resolved T2T genome assemblies of two pear hybrid varieties, 'Yuluxiang' (YLX) and 'Hongxiangsu' (HXS), which share the same maternal parent but differ in their paternal parents. We then used these assemblies to explore the genome-scale landscape of allele-specific expression (ASE) and create a pangenome graph for pear. ASE was observed for close to 6000 genes in both hybrid cultivars. A subset of ASE genes related to aspects of fruit quality such as sugars, organic acids, and cuticular wax were identified, suggesting their important contributions to heterosis. Specifically, Ma1, a gene regulating fruit acidity, is absent in the paternal haplotypes of HXS and YLX. A pangenome graph was built based on our assemblies and seven published pear genomes. Resequencing data for 139 cultivated pear genotypes (including 97 genotypes sequenced here) were subsequently aligned to the pangenome graph, revealing numerous structural variant hotspots and selective sweeps during pear diversification. As predicted, the Ma1 allele was found to be absent in varieties with low organic acid content, and this association was functionally validated by Ma1 overexpression in pear fruit and calli. Overall, these results reveal the contributions of ASE to fruit-quality heterosis and provide a robust pangenome reference for high-resolution allele discovery and association mapping.
Affiliation(s)
- Qionghou Li
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Xin Qiao
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Lanqing Li
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Chao Gu
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Hao Yin
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Kaijie Qi
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Zhihua Xie
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Sheng Yang
- Pomology Institute, Shanxi Agricultural University, Taigu, Shanxi 030801, China
| | - Qifeng Zhao
- Pomology Institute, Shanxi Agricultural University, Taigu, Shanxi 030801, China
| | - Zewen Wang
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Yuhang Yang
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Jiahui Pan
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Hongxiang Li
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Jie Wang
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Chao Wang
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Loren H Rieseberg
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
| | - Shaoling Zhang
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
| | - Shutian Tao
- National Key Laboratory of Crop Genetics and Germplasm Enhancement and Utilization, Sanya Institute of Nanjing Agricultural University, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China.
| |
|
11
|
Li J, Lin Y, Li D, He M, Kui H, Bai J, Chen Z, Gou Y, Zhang J, Wang T, Tang Q, Kong F, Jin L, Li M. Building Haplotype-Resolved 3D Genome Maps of Chicken Skeletal Muscle. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2305706. [PMID: 38582509 PMCID: PMC11200017 DOI: 10.1002/advs.202305706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 03/07/2024] [Indexed: 04/08/2024]
Abstract
Haplotype-resolved 3D chromatin architecture related to allelic differences in avian skeletal muscle development has not been addressed so far, although chicken husbandry for meat consumption has been a prevalent feature of cultures on every continent for thousands of years. Here, high-resolution Hi-C diploid maps (1.2-kb maximum resolution) are generated for skeletal muscle tissues in chicken across three developmental stages (embryonic day 15 to day 30 post-hatching). The sequence features that govern the spatial arrangement of chromosomes and characterize homolog pairing in the nucleus are identified. Multi-scale characterization of chromatin reorganization between stages, from myogenesis in the fetus to myofiber hypertrophy after hatching, shows concordant changes in transcriptional regulation by relevant signaling pathways. Further interrogation of parent-of-origin-specific chromatin conformation supports the view that genomic imprinting is absent in birds. This study also reveals that promoter-enhancer interaction (PEI) differences between broiler and layer haplotypes in skeletal muscle development-related genes are related to genetic variation between breeds; however, only a minority of breed-specific variations likely contribute to phenotypic divergence in skeletal muscle, potentially via allelic PEI rewiring. Beyond defining the haplotype-specific 3D chromatin architecture in chicken, this study provides a rich resource for investigating allelic regulatory divergence among chicken breeds.
Affiliation(s)
- Jing Li
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Yu Lin
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Diyan Li
- School of Pharmacy, Chengdu University, Chengdu 610106, China
| | - Mengnan He
- Wildlife Conservation Research Department, Chengdu Research Base of Giant Panda Breeding, Chengdu 610057, China
| | - Hua Kui
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Jingyi Bai
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Ziyu Chen
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Yuwei Gou
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Jiaman Zhang
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Tao Wang
- School of Pharmacy, Chengdu University, Chengdu 610106, China
| | - Qianzi Tang
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Fanli Kong
- College of Life Science, Sichuan Agricultural University, Ya'an 625014, China
| | - Long Jin
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| | - Mingzhou Li
- State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
| |
|
12
|
Laforest M, Martin SL, Bisaillon K, Soufiane B, Meloche S, Tardif FJ, Page E. The ancestral karyotype of the Heliantheae Alliance, herbicide resistance, and human allergens: Insights from the genomes of common and giant ragweed. THE PLANT GENOME 2024; 17:e20442. [PMID: 38481294 DOI: 10.1002/tpg2.20442] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/23/2023] [Revised: 01/23/2024] [Accepted: 02/04/2024] [Indexed: 07/02/2024]
Abstract
Ambrosia artemisiifolia and Ambrosia trifida (Asteraceae) are important pest species and the two greatest sources of aeroallergens globally. Here, we took advantage of a hybrid to simplify genome assembly and present chromosome-level assemblies for both species. These assemblies show high levels of completeness with Benchmarking Universal Single-Copy Ortholog (BUSCO) scores of 94.5% for A. artemisiifolia and 96.1% for A. trifida and long terminal repeat (LTR) Assembly Index values of 26.6 and 23.6, respectively. The genomes were annotated using RNA data, identifying 41,642 genes in A. artemisiifolia and 50,203 in A. trifida. More than half of each genome is composed of repetitive elements, with 62% in A. artemisiifolia and 69% in A. trifida. Single copies of herbicide resistance-associated genes PPX2L, HPPD, and ALS were found, while two copies of the EPSPS gene were identified; this latter observation may reveal a possible mechanism of resistance to the herbicide glyphosate. Ten of the 12 main allergenicity genes were also localized, some forming clusters with several copies, especially in A. artemisiifolia. The evolution of genome structure has differed between these two species. The genome of A. trifida has undergone greater rearrangement, possibly the result of chromoplexy. In contrast, the genome of A. artemisiifolia retains a structure that makes the allotetraploidization of the most recent common ancestor of the Heliantheae Alliance the clearest feature of its genome. When compared to other Heliantheae Alliance species, this allowed us to reconstruct the common ancestor's karyotype, a key step for furthering our understanding of the evolution and diversification of this economically and allergenically important group.
Affiliation(s)
- Martin Laforest
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, Quebec, Canada
| | - Sara L Martin
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
| | - Katherine Bisaillon
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, Quebec, Canada
| | - Brahim Soufiane
- Saint-Jean-sur-Richelieu Research and Development Centre, Agriculture and Agri-Food Canada, Saint-Jean-sur-Richelieu, Quebec, Canada
| | - Sydney Meloche
- Harrow Research and Development Centre, Agriculture and Agri-Food Canada, Harrow, Ontario, Canada
| | - François J Tardif
- Department of Plant Agriculture, University of Guelph, Guelph, Ontario, Canada
| | - Eric Page
- Harrow Research and Development Centre, Agriculture and Agri-Food Canada, Harrow, Ontario, Canada
| |
|
13
|
Jiang S, Zou M, Zhang C, Ma W, Xia C, Li Z, Zhao L, Liu Q, Yu F, Huang D, Xia Z. A high-quality haplotype genome of Michelia alba DC reveals differences in methylation patterns and flower characteristics. MOLECULAR HORTICULTURE 2024; 4:23. [PMID: 38807235 PMCID: PMC11134676 DOI: 10.1186/s43897-024-00098-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 04/19/2024] [Indexed: 05/30/2024]
Abstract
Michelia alba DC is a highly valuable ornamental plant of the Magnoliaceae family. This evergreen tropical tree commonly grows in Southeast Asia and is adored for its delightful fragrance. Our study assembled the M. alba haplotype genomes MC and MM using Nanopore ultralong reads, PacBio HiFi long reads and parental second-generation sequencing data. Moreover, the first methylation map of Magnoliaceae was constructed based on methylation sites called from the Nanopore data. Metabolomic datasets were generated from the flowers of three different species to assess variations in pigment and volatile compound accumulation. Finally, transcriptome data were generated to link genomic, methylation, and morphological patterns to reveal the reasons underlying the differences between M. alba and its parental lines in petal color, flower shape, and fragrance. We found that the AP1 and AP2 genes are crucial in M. alba petal formation, while the 4CL, PAL, and C4H genes control petal color. The data generated in this study serve as a foundation for future physiological and biochemical research on M. alba, facilitate the targeted improvement of M. alba varieties, and offer a theoretical basis for molecular research on Michelia L.
Affiliation(s)
- Sirong Jiang
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | - Meiling Zou
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | | | - Wanfeng Ma
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | - Chengcai Xia
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | - Zixuan Li
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | | | - Qi Liu
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | - Fen Yu
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China
- College of Tropical Crops, Hainan University, Haikou, China
| | - Dongyi Huang
- College of Tropical Crops, Hainan University, Haikou, China.
| | - Zhiqiang Xia
- Sanya Nanfan Research Institute of Hainan University, Hainan Yazhou Bay Seed Laboratory, Sanya, China.
- College of Tropical Crops, Hainan University, Haikou, China.
| |
|
14
|
Fodor E, Okendo J, Szabó N, Szabó K, Czimer D, Tarján-Rácz A, Szeverényi I, Low BW, Liew JH, Koren S, Rhie A, Orbán L, Miklósi Á, Varga M, Burgess SM. The reference genome of Macropodus opercularis (the paradise fish). Sci Data 2024; 11:540. [PMID: 38796485 PMCID: PMC11127978 DOI: 10.1038/s41597-024-03277-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 04/18/2024] [Indexed: 05/28/2024]
Abstract
Amongst fishes, zebrafish (Danio rerio) has gained popularity as a model system over most other species, and while its value as a model is well documented, its usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as to study behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species, has a highly complex behavioral repertoire and has been the subject of many ethological investigations but lacks genomic resources. Here we report the reference genome assembly of M. opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of 483,077,705 base pairs (~483 Mb) on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein-coding genes and assigned ~90% of them to orthogroups.
Affiliation(s)
- Erika Fodor
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Javan Okendo
- Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Nóra Szabó
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Kata Szabó
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Dávid Czimer
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Anita Tarján-Rácz
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Ildikó Szeverényi
- Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
| | - Bi Wei Low
- Science Unit, Lingnan University, Hong Kong, China
| | | | - Sergey Koren
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - Arang Rhie
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
| | - László Orbán
- Frontline Fish Genomics Research Group, Department of Applied Fish Biology, Institute of Aquaculture and Environmental Safety, Hungarian University of Agriculture and Life Sciences, Georgikon Campus, Keszthely, Hungary
| | - Ádám Miklósi
- Department of Ethology, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Máté Varga
- Department of Genetics, ELTE Eötvös Loránd University, Budapest, Hungary.
| | - Shawn M Burgess
- Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.
| |
|
15
|
Wade KJ, Suseno R, Kizer K, Williams J, Boquett J, Caillier S, Pollock NR, Renschen A, Santaniello A, Oksenberg JR, Norman PJ, Augusto DG, Hollenbach JA. MHConstructor: A high-throughput, haplotype-informed solution to the MHC assembly challenge. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.20.595060. [PMID: 38826378 PMCID: PMC11142050 DOI: 10.1101/2024.05.20.595060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short-read de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole-genome sequencing and target-capture short-read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short-read data. MHConstructor facilitates widespread access to high-quality, alignment-free MHC sequence analysis.
Affiliation(s)
- Kristen J. Wade
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Rayo Suseno
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Kerry Kizer
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Jacqueline Williams
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Juliano Boquett
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Stacy Caillier
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Nicholas R. Pollock
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
| | - Adam Renschen
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Adam Santaniello
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Jorge R. Oksenberg
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Paul J. Norman
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
| | - Danillo G. Augusto
- Department of Biological Sciences, University of North Carolina Charlotte, Charlotte, NC, United States
- Programa de Pós-Graduação em Genética, Universidade Federal do Paraná, Curitiba, Brazil
| | - Jill A. Hollenbach
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, United States
| |
|
16
|
Chen Y, Huang JH, Sun Y, Zhang Y, Li Y, Xu X. Haplotype-resolved assembly of diploid and polyploid genomes using quantum computing. CELL REPORTS METHODS 2024; 4:100754. [PMID: 38614089 PMCID: PMC11133727 DOI: 10.1016/j.crmeth.2024.100754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 01/03/2024] [Accepted: 03/20/2024] [Indexed: 04/15/2024]
Abstract
Precision medicine's emphasis on individual genetic variants highlights the importance of haplotype-resolved assembly, a computational challenge in bioinformatics given its combinatorial nature. While classical algorithms have made strides in addressing this issue, the potential of quantum computing remains largely untapped. Here, we present the vehicle routing problem (VRP) assembler: an approach that transforms this task into a vehicle routing problem, an optimization formulation solvable on a quantum computer. We demonstrate its potential and feasibility through a proof of concept on short synthetic diploid and triploid genomes using a D-Wave quantum annealer. To tackle larger-scale assembly problems, we integrate the VRP assembler with Google's OR-Tools, achieving a haplotype-resolved local assembly across the human major histocompatibility complex (MHC) region. Our results show encouraging performance compared to Hifiasm with phasing accuracy approaching the theoretical limit, underscoring the promising future of quantum computing in bioinformatics.
Affiliation(s)
- Yibo Chen
- BGI Research, Shenzhen 518083, China
| | | | - Yuhui Sun
- BGI Research, Shenzhen 518083, China
| | - Yong Zhang
- BGI Research, Wuhan 430047, China; Guangdong Bigdata Engineering Technology Research Center for Life Sciences, BGI Research, Shenzhen 518083, China.
| | - Yuxiang Li
- BGI Research, Wuhan 430047, China; Guangdong Bigdata Engineering Technology Research Center for Life Sciences, BGI Research, Shenzhen 518083, China.
| | - Xun Xu
- BGI Research, Shenzhen 518083, China; BGI Research, Wuhan 430047, China.
| |
|
17
|
Wang J, Xu Y, Peng Y, Wang Y, Kang Z, Zhao J. A fully haplotype-resolved and nearly gap-free genome assembly of wheat stripe rust fungus. Sci Data 2024; 11:508. [PMID: 38755209 PMCID: PMC11099153 DOI: 10.1038/s41597-024-03361-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 05/10/2024] [Indexed: 05/18/2024]
Abstract
Stripe rust fungus Puccinia striiformis f. sp. tritici (Pst) is a destructive pathogen of wheat worldwide. Pst has a macrocyclic-heteroecious lifecycle, in which one-celled urediniospores are dikaryotic, each nucleus containing one haploid genome. We successfully generated the first fully haplotype-resolved and nearly gap-free chromosome-scale genome assembly of Pst by combining PacBio HiFi sequencing and trio-binning strategy. The genome size of the two haploid assemblies was 75.59 Mb and 75.91 Mb with contig N50 of 4.17 Mb and 4.60 Mb, and both had 18 pseudochromosomes. The high consensus quality values of 55.57 and 59.02 for both haplotypes confirmed the correctness of the assembly. Of the total 18 chromosomes, 15 and 16 were gapless while there were only five and two gaps for the remaining chromosomes of the two haplotypes, respectively. In total, 15,046 and 15,050 protein-coding genes were predicted for the two haplotypes, and the complete BUSCO scores achieved 97.7% and 97.9%, respectively. The genome will lay the foundation for further research on genetic variations and the evolution of rust fungi.
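The contig N50 and consensus quality values (QV) quoted in this abstract are standard assembly summary statistics. As a minimal sketch, assuming the QVs are Phred-scaled (the usual convention, though the abstract does not state how they were computed), the functions below convert a QV to a per-base error probability and compute an N50 from a list of sequence lengths. The function names and the toy contig lengths are illustrative only and do not come from the paper.

```python
from typing import Iterable


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-negative lengths


if __name__ == "__main__":
    # QVs reported in the abstract for the two haplotypes.
    for qv in (55.57, 59.02):
        print(f"QV {qv}: about {qv_to_error_rate(qv):.2e} errors per base")
    # Hypothetical contig lengths, used only to exercise n50().
    print("toy N50:", n50([50, 40, 30, 20, 10]))
```

Under the Phred assumption, both reported QVs correspond to error rates on the order of one per several hundred thousand bases or better.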
Affiliation(s)
- Jierong Wang
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, Shaanxi, 712100, China
- College of Life Science, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Yiwen Xu
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Yuxi Peng
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Yiping Wang
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China
| | - Zhensheng Kang
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China.
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, Shaanxi, 712100, China.
| | - Jing Zhao
- College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, 712100, China.
- State Key Laboratory of Crop Stress Biology for Arid Areas, Northwest A&F University, Yangling, Shaanxi, 712100, China.
| |
|
18
|
Calderón L, Carbonell-Bejerano P, Muñoz C, Bree L, Sola C, Bergamin D, Tulle W, Gomez-Talquenca S, Lanz C, Royo C, Ibáñez J, Martinez-Zapater JM, Weigel D, Lijavetzky D. Diploid genome assembly of the Malbec grapevine cultivar enables haplotype-aware analysis of transcriptomic differences underlying clonal phenotypic variation. HORTICULTURE RESEARCH 2024; 11:uhae080. [PMID: 38766532 PMCID: PMC11101320 DOI: 10.1093/hr/uhae080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 03/08/2024] [Indexed: 05/22/2024]
Abstract
To preserve their varietal attributes, established grapevine cultivars (Vitis vinifera L. ssp. vinifera) must be clonally propagated, due to their highly heterozygous genomes. Malbec is a cultivar of French origin appreciated for producing high-quality wines and is the offspring of cultivars Prunelard and Magdeleine Noire des Charentes. Here, we have built a diploid genome assembly of Malbec, after trio binning of PacBio long reads into the two haploid complements inherited from either parent. After haplotype-aware deduplication and corrections, complete assemblies for the two haplophases were obtained with a very low haplotype switch-error rate (<0.025). The haplophase alignment identified > 25% of polymorphic regions. Gene annotation including RNA-seq transcriptome assembly and ab initio prediction evidence resulted in similar gene model numbers for both haplophases. The annotated diploid assembly was exploited in the transcriptomic comparison of four clonal accessions of Malbec that exhibited variation in berry composition traits. Analysis of the ripening pericarp transcriptome using either haplophase as a reference yielded similar results, although some differences were observed. In particular, among the differentially expressed genes identified only with the Magdeleine-inherited haplotype as reference, we observed an over-representation of hypothetically hemizygous genes. The higher berry anthocyanin content of clonal accession 595 was associated with increased abscisic acid responses, possibly leading to the observed overexpression of phenylpropanoid metabolism genes and deregulation of genes associated with abiotic stress response. Overall, the results highlight the importance of producing diploid assemblies to fully represent the genomic diversity of highly heterozygous woody crop cultivars and unveil the molecular bases of clonal phenotypic variation.
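The abstract reports a haplotype switch-error rate without saying how it was measured or in what units. One common definition counts, across consecutive heterozygous sites, how often the phased haplotype changes from agreeing with a trusted haplotype to disagreeing with it (or the reverse), divided by the number of adjacent site pairs. The sketch below illustrates that definition only; it is not the authors' procedure, and the function name and the 0/1 string encoding of alleles are assumptions made for the example.

```python
def switch_error_rate(truth: str, phased: str) -> float:
    """Fraction of adjacent heterozygous-site pairs where the phase flips.

    Both arguments encode one haplotype as a string of '0'/'1' alleles at
    the same ordered heterozygous sites (a hypothetical encoding).
    """
    if len(truth) != len(phased):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        raise ValueError("need at least two sites to observe a switch")
    # True where the phased allele matches the trusted allele at that site.
    agree = [t == p for t, p in zip(truth, phased)]
    # A switch is a change in agreement state between neighbouring sites.
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    return switches / (len(agree) - 1)


if __name__ == "__main__":
    # Toy data: the phased haplotype swaps to the other parent for three sites.
    print(switch_error_rate("00000000", "00011100"))
```

A haplotype that is inverted along its whole length scores zero under this definition, because only changes of phase between neighbouring sites are counted.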
Affiliation(s)
- Luciano Calderón
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Pablo Carbonell-Bejerano
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Claudio Muñoz
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Facultad de Ciencias Agrarias (UNCuyo), Cátedra Fitopatología, Chacras de Coria 5505, Mendoza, Argentina
- Laura Bree
- Vivero Mercier Argentina, Perdriel 5500, Mendoza, Argentina
- Cristobal Sola
- Vivero Mercier Argentina, Perdriel 5500, Mendoza, Argentina
- Walter Tulle
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Sebastian Gomez-Talquenca
- Plant Virology Laboratory, Instituto Nacional de Tecnología Agropecuaria, Luján de Cuyo 5534, Mendoza, Argentina
- Christa Lanz
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Carolina Royo
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Javier Ibáñez
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- José Miguel Martinez-Zapater
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Diego Lijavetzky
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
19
Grasberger H, Dumitrescu AM, Liao XH, Swanson EG, Weiss RE, Srichomkwun P, Pappa T, Chen J, Yoshimura T, Hoffmann P, França MM, Tagett R, Onigata K, Costagliola S, Ranchalis J, Vollger MR, Stergachis AB, Chong JX, Bamshad MJ, Smits G, Vassart G, Refetoff S. STR mutations on chromosome 15q cause thyrotropin resistance by activating a primate-specific enhancer of MIR7-2/MIR1179. Nat Genet 2024; 56:877-888. [PMID: 38714869 DOI: 10.1038/s41588-024-01717-7] [Received: 07/30/2023] [Accepted: 03/14/2024] [Indexed: 05/22/2024]
Abstract
Thyrotropin (TSH) is the master regulator of thyroid gland growth and function. Resistance to TSH (RTSH) describes conditions with reduced sensitivity to TSH. Dominantly inherited RTSH has been linked to a locus on chromosome 15q, but its genetic basis has remained elusive. Here we show that non-coding mutations in a (TTTG)4 short tandem repeat (STR) underlie dominantly inherited RTSH in all 82 affected participants from 12 unrelated families. The STR is contained in a primate-specific Alu retrotransposon with thyroid-specific cis-regulatory chromatin features. Fiber-seq and RNA-seq studies revealed that the mutant STR activates a thyroid-specific enhancer cluster, leading to haplotype-specific upregulation of the bicistronic MIR7-2/MIR1179 locus 35 kb downstream and overexpression of its microRNA products in the participants' thyrocytes. An imbalance in signaling pathways targeted by these micro-RNAs provides a working model for this cause of RTSH. This finding broadens our current knowledge of genetic defects altering pituitary-thyroid feedback regulation.
Affiliation(s)
- Helmut Grasberger
- Department of Internal Medicine, Medical School, University of Michigan, Ann Arbor, MI, USA
- Alexandra M Dumitrescu
- Department of Medicine, The University of Chicago, Chicago, IL, USA
- Committee on Molecular Metabolism and Nutrition, The University of Chicago, Chicago, IL, USA
- Xiao-Hui Liao
- Department of Medicine, The University of Chicago, Chicago, IL, USA
- Elliott G Swanson
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Roy E Weiss
- Department of Medicine, University of Miami Miller School of Medicine, Miami, FL, USA
- Theodora Pappa
- Department of Medicine, The University of Chicago, Chicago, IL, USA
- Junfeng Chen
- Institute of Transformative Bio-Molecules (WPI-ITbM) and Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
- Takashi Yoshimura
- Institute of Transformative Bio-Molecules (WPI-ITbM) and Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
- Phillip Hoffmann
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
- Rebecca Tagett
- Michigan Medicine BRCF Bioinformatics Core, University of Michigan, Ann Arbor, MI, USA
- Sabine Costagliola
- Institut de Recherche Interdisciplinaire en Biologie Humaine et Moléculaire (IRIBHM), Université Libre de Bruxelles, Brussels, Belgium
- Jane Ranchalis
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Mitchell R Vollger
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Andrew B Stergachis
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman-Baty Institute for Precision Medicine, Seattle, WA, USA
- Jessica X Chong
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Brotman-Baty Institute for Precision Medicine, Seattle, WA, USA
- Michael J Bamshad
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman-Baty Institute for Precision Medicine, Seattle, WA, USA
- Guillaume Smits
- Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
- Center of Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, and Department of Genetics, Hôpital Universitaire des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Gilbert Vassart
- Institut de Recherche Interdisciplinaire en Biologie Humaine et Moléculaire (IRIBHM), Université Libre de Bruxelles, Brussels, Belgium
- Samuel Refetoff
- Department of Medicine, The University of Chicago, Chicago, IL, USA.
- Committee on Genetics, The University of Chicago, Chicago, IL, USA.
- Department of Pediatrics, The University of Chicago, Chicago, IL, USA.
20
Wennmann JT, Lim FS, Senger S, Gani M, Jehle JA, Keilwagen J. Haplotype determination of the Bombyx mori nucleopolyhedrovirus by Nanopore sequencing and linkage of single nucleotide variants. J Gen Virol 2024; 105. [PMID: 38767624 DOI: 10.1099/jgv.0.001983] [Indexed: 05/22/2024]
Abstract
Naturally occurring isolates of baculoviruses, such as the Bombyx mori nucleopolyhedrovirus (BmNPV), usually consist of numerous genetically different haplotypes. Deciphering the different haplotypes of such isolates is hampered by the large size of the dsDNA genome, as well as the short read length of next generation sequencing (NGS) techniques that are widely applied for baculovirus isolate characterization. In this study, we addressed this challenge by combining the accuracy of NGS to determine single nucleotide variants (SNVs) as genetic markers with the long read length of Nanopore sequencing technique. This hybrid approach allowed the comprehensive analysis of genetically homogeneous and heterogeneous isolates of BmNPV. Specifically, this allowed the identification of two putative major haplotypes in the heterogeneous isolate BmNPV-Ja by SNV position linkage. SNV positions, which were determined based on NGS data, were linked by the long Nanopore reads in a Position Weight Matrix. Using a modified Expectation-Maximization algorithm, the Nanopore reads were assigned according to the occurrence of variable SNV positions by machine learning. The cohorts of reads were de novo assembled, which led to the identification of BmNPV haplotypes. The method demonstrated the strength of the combined approach of short- and long-read sequencing techniques to decipher the genetic diversity of baculovirus isolates.
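The read-assignment step described in this abstract can be illustrated with a minimal sketch. It is not the authors' modified Expectation-Maximization algorithm: it is a plain two-component Bernoulli-mixture EM, and the function name `em_two_haplotypes`, the 0/1 allele coding, the seeding from the first read, and the pseudocount are all assumptions made for illustration. The per-haplotype allele-frequency table `p` plays the role of a simplified position weight matrix.

```python
import math


def em_two_haplotypes(reads, n_snvs, n_iter=30, pseudo=0.5):
    """Assign reads to two haplotypes with a Bernoulli-mixture EM (illustrative only).

    reads: list of dicts mapping SNV index -> allele (0 = reference, 1 = alternative);
    each read lists only the SNV positions it covers. The first read must cover
    at least one SNV, because it is used to break the symmetry of the start.
    Returns (labels, p), where p[k][j] is the estimated frequency of allele 1
    at SNV j in haplotype k.
    """
    # Hypothetical initialisation: haplotype 0 resembles the first read,
    # haplotype 1 is its mirror image; uncovered SNVs start at 0.5.
    p = [[0.5] * n_snvs for _ in range(2)]
    for j, allele in reads[0].items():
        p[0][j] = 0.9 if allele == 1 else 0.1
        p[1][j] = 1.0 - p[0][j]
    pi = [0.5, 0.5]  # mixing weights of the two haplotypes
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability that each read stems from each haplotype.
        resp = []
        for read in reads:
            logw = []
            for k in range(2):
                ll = math.log(pi[k])
                for j, allele in read.items():
                    ll += math.log(p[k][j] if allele == 1 else 1.0 - p[k][j])
                logw.append(ll)
            top = max(logw)
            w = [math.exp(x - top) for x in logw]
            total = sum(w)
            resp.append([x / total for x in w])

        # M-step: re-estimate mixing weights and per-SNV allele frequencies.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = max(nk / len(reads), 1e-12)  # floor avoids log(0)
            for j in range(n_snvs):
                alt = sum(r[k] for r, rd in zip(resp, reads) if rd.get(j) == 1)
                cov = sum(r[k] for r, rd in zip(resp, reads) if j in rd)
                p[k][j] = (alt + pseudo) / (cov + 2 * pseudo)

    # Hard assignment: the haplotype with the larger posterior probability.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return labels, p
```

In this sketch each read is reduced to the alleles it carries at the SNV positions called from short-read data, so overlapping long reads link neighbouring SNVs and pull reads from the same haplotype into the same cohort. The resulting cohorts would then be assembled separately.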
Affiliation(s)
- Jörg T Wennmann
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Biological Control, Schwabenheimer Str. 101, 69221 Dossenheim, Germany
- Fang-Shiang Lim
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Biological Control, Schwabenheimer Str. 101, 69221 Dossenheim, Germany
- Sergei Senger
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Biological Control, Schwabenheimer Str. 101, 69221 Dossenheim, Germany
- Mudasir Gani
- Division of Entomology, Faculty of Agriculture, Sher-e-Kashmir University of Agricultural Sciences & Technology, Kashmir 193 201, J&K, India
- Johannes A Jehle
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Biological Control, Schwabenheimer Str. 101, 69221 Dossenheim, Germany
- Jens Keilwagen
- Julius Kühn Institute (JKI) - Federal Research Centre for Cultivated Plants, Institute for Biosafety in Plant Biotechnology, Ernst-Baur-Str. 27, 06484 Quedlinburg, Germany
21
Shi TL, Jia KH, Bao YT, Nie S, Tian XC, Yan XM, Chen ZY, Li ZC, Zhao SW, Ma HY, Zhao Y, Li X, Zhang RG, Guo J, Zhao W, El-Kassaby YA, Müller N, Van de Peer Y, Wang XR, Street NR, Porth I, An X, Mao JF. High-quality genome assembly enables prediction of allele-specific gene expression in hybrid poplar. Plant Physiology 2024; 195:652-670. [PMID: 38412470 PMCID: PMC11060683 DOI: 10.1093/plphys/kiae078] [Received: 07/20/2023] [Revised: 01/08/2024] [Accepted: 01/09/2024] [Indexed: 02/29/2024]
Abstract
Poplar (Populus) is a well-established model system for tree genomics and molecular breeding, and hybrid poplar is widely used in forest plantations. However, distinguishing its diploid homologous chromosomes is difficult, complicating advanced functional studies on specific alleles. In this study, we applied a trio-binning design and PacBio high-fidelity long-read sequencing to obtain haplotype-phased telomere-to-telomere genome assemblies for the 2 parents of the well-studied F1 hybrid "84K" (Populus alba × Populus tremula var. glandulosa). Almost all chromosomes, including the telomeres and centromeres, were completely assembled for each haplotype subgenome apart from 2 small gaps on one chromosome. By incorporating information from these haplotype assemblies and extensive RNA-seq data, we analyzed gene expression patterns between the 2 subgenomes and alleles. Transcription bias at the subgenome level was not uncovered, but extensive-expression differences were detected between alleles. We developed machine-learning (ML) models to predict allele-specific expression (ASE) with high accuracy and identified underlying genome features most highly influencing ASE. One of our models with 15 predictor variables achieved 77% accuracy on the training set and 74% accuracy on the testing set. ML models identified gene body CHG methylation, sequence divergence, and transposon occupancy both upstream and downstream of alleles as important factors for ASE. Our haplotype-phased genome assemblies and ML strategy highlight an avenue for functional studies in Populus and provide additional tools for studying ASE and heterosis in hybrids.
Affiliation(s)
- Tian-Le Shi
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Kai-Hua Jia
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Key Laboratory of Crop Genetic Improvement & Ecology and Physiology, Institute of Crop Germplasm Resources, Shandong Academy of Agricultural Sciences, Ji’nan 250100, China
- Yu-Tao Bao
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Shuai Nie
- Rice Research Institute, Guangdong Academy of Agricultural Sciences & Key Laboratory of Genetics and Breeding of High Quality Rice in Southern China (Co-construction by Ministry and Province), Ministry of Agriculture and Rural Affairs & Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou 510640, China
- Xue-Chan Tian
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Xue-Mei Yan
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Zhao-Yang Chen
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Zhi-Chao Li
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Shi-Wei Zhao
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Hai-Yao Ma
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Ye Zhao
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Xiang Li
- School of Agriculture, Ningxia University, Yinchuan 750021, China
- Ren-Gang Zhang
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, Yunnan, China
- Jing Guo
- College of Forestry, Shandong Agricultural University, Tai’an 271000, China
- Wei Zhao
- Umeå Plant Science Centre, Department of Ecology and Environmental Science, Umeå University, SE-901 87 Umeå, Sweden
- Yousry Aly El-Kassaby
- Department of Forest and Conservation Sciences, Faculty of Forestry, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Niels Müller
- Thünen-Institute of Forest Genetics, 22927 Grosshansdorf, Germany
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
- VIB Center for Plant Systems Biology, 9052 Ghent, Belgium
- Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0028, South Africa
- College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing 210095, China
- Xiao-Ru Wang
- Umeå Plant Science Centre, Department of Ecology and Environmental Science, Umeå University, SE-901 87 Umeå, Sweden
- Nathaniel Robert Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, SE-901 87 Umeå, Sweden
- Ilga Porth
- Département des Sciences du Bois et de la Forêt, Faculté de Foresterie, de Géographie et Géomatique, Université Laval, Québec, QC G1V 0A6, Canada
- Xinmin An
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Jian-Feng Mao
- State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, SE-901 87 Umeå, Sweden
22
Lorig-Roach R, Meredith M, Monlong J, Jain M, Olsen HE, McNulty B, Porubsky D, Montague TG, Lucas JK, Condon C, Eizenga JM, Juul S, McKenzie SK, Simmonds SE, Park J, Asri M, Koren S, Eichler EE, Axel R, Martin B, Carnevali P, Miga KH, Paten B. Phased nanopore assembly with Shasta and modular graph phasing with GFAse. Genome Res 2024; 34:454-468. [PMID: 38627094 PMCID: PMC11067879 DOI: 10.1101/gr.278268.123] [Received: 07/19/2023] [Accepted: 03/19/2024] [Indexed: 04/30/2024]
Abstract
Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.
Affiliation(s)
- Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Melissa Meredith
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Jean Monlong
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Miten Jain
- Department of Bioengineering, Department of Physics, Northeastern University, Boston, Massachusetts 02120, USA
- Hugh E Olsen
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Brandy McNulty
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Tessa G Montague
- The Mortimer B. Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, New York 10027, USA
- Howard Hughes Medical Institute, Columbia University, New York, New York 10032, USA
- Julian K Lucas
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Chris Condon
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Jordan M Eizenga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Sissel Juul
- Oxford Nanopore Technologies Incorporated, New York, New York 10013, USA
- Sean K McKenzie
- Oxford Nanopore Technologies Incorporated, New York, New York 10013, USA
- Sara E Simmonds
- Chan Zuckerberg Initiative Foundation, Redwood City, California 94063, USA
- Jimin Park
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Richard Axel
- The Mortimer B. Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, New York 10027, USA
- Howard Hughes Medical Institute, Columbia University, New York, New York 10032, USA
- Bruce Martin
- Chan Zuckerberg Initiative Foundation, Redwood City, California 94063, USA
- Paolo Carnevali
- Chan Zuckerberg Initiative Foundation, Redwood City, California 94063, USA
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, California 95060, USA
23
Wang H, Chen M, Wei X, Xia R, Pei D, Huang X, Han B. Computational tools for plant genomics and breeding. Science China Life Sciences 2024:10.1007/s11427-024-2578-6. [PMID: 38676814 DOI: 10.1007/s11427-024-2578-6] [Received: 02/05/2024] [Accepted: 03/25/2024] [Indexed: 04/29/2024]
Abstract
Plant genomics and crop breeding are at the intersection of biotechnology and information technology. Driven by a combination of high-throughput sequencing, molecular biology and data science, great advances have been made in omics technologies at every step along the central dogma, especially in genome assembling, genome annotation, epigenomic profiling, and transcriptome profiling. These advances further revolutionized three directions of development. One is genetic dissection of complex traits in crops, along with genomic prediction and selection. The second is comparative genomics and evolution, which open up new opportunities to depict the evolutionary constraints of biological sequences for deleterious variant discovery. The third direction is the development of deep learning approaches for the rational design of biological sequences, especially proteins, for synthetic biology. All three directions of development serve as the foundation for a new era of crop breeding where agronomic traits are enhanced by genome design.
Affiliation(s)
- Hai Wang
- State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100193, China.
- Sanya Institute of China Agricultural University, Sanya, 572025, China.
- Hainan Yazhou Bay Seed Laboratory, Sanya, 572025, China.
- Mengjiao Chen
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xin Wei
- Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Rui Xia
- College of Horticulture, South China Agricultural University, Guangzhou, 510640, China
- Dong Pei
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xuehui Huang
- Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Bin Han
- National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200233, China
24
Li H, Durbin R. Genome assembly in the telomere-to-telomere era. Nat Rev Genet 2024:10.1038/s41576-024-00718-w. [PMID: 38649458 DOI: 10.1038/s41576-024-00718-w] [Accepted: 02/27/2024] [Indexed: 04/25/2024]
Abstract
Genome sequences largely determine the biology and encode the history of an organism, and de novo assembly - the process of reconstructing the genome sequence of an organism from sequencing reads - has been a central problem in bioinformatics for four decades. Until recently, genomes were typically assembled into fragments of a few megabases at best, but now technological advances in long-read sequencing enable the near-complete assembly of each chromosome - also known as telomere-to-telomere assembly - for many organisms. Here, we review recent progress on assembly algorithms and protocols, with a focus on how to derive near-telomere-to-telomere assemblies. We also discuss the additional developments that will be required to resolve remaining assembly gaps and to assemble non-diploid genomes.
Affiliation(s)
- Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Richard Durbin
- Department of Genetics, Cambridge University, Cambridge, UK.
25
Kalleberg J, Rissman J, Schnabel RD. Overcoming Limitations to Deep Learning in Domesticated Animals with TrioTrain. bioRxiv 2024:2024.04.15.589602. [PMID: 38659907 PMCID: PMC11042298 DOI: 10.1101/2024.04.15.589602] [Indexed: 04/26/2024]
Abstract
Variant calling across diverse species remains challenging as most bioinformatics tools default to assumptions based on human genomes. DeepVariant (DV) excels without joint genotyping while offering fewer implementation barriers. However, the growing appeal of a "universal" algorithm has magnified the unknown impacts when used with non-human genomes. Here, we use bovine genomes to assess the limits of human-genome-trained models in other species. We introduce the first multi-species DV model that achieves a lower Mendelian Inheritance Error (MIE) rate during single-sample genotyping. Our novel approach, TrioTrain, automates extending DV for species without Genome In A Bottle (GIAB) resources and uses region shuffling to mitigate barriers for SLURM-based clusters. To offset imperfect truth labels for animal genomes, we remove Mendelian discordant variants before training, where models are tuned to genotype the offspring correctly. With TrioTrain, we use cattle, yak, and bison trios to build 30 model iterations across five phases. We observe remarkable performance across phases when testing the GIAB human trios with a mean SNP F1 score >0.990. In HG002, our phase 4 bovine model identifies more variants at a lower MIE rate than DeepTrio. In bovine F1-hybrid genomes, our model substantially reduces inheritance errors with a mean MIE rate of 0.03 percent. Although constrained by imperfect labels, we find that multi-species, trio-based training produces a robust variant calling model. Our research demonstrates that exclusively training with human genomes restricts the application of deep-learning approaches for comparative genomics.
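The Mendelian Inheritance Error (MIE) rate used as the quality metric in this abstract can be sketched as follows. This is a minimal illustration, not TrioTrain's implementation: the function names, the tuple representation of genotypes, and the restriction to diploid autosomal sites with all three genotypes called are assumptions made here.

```python
def is_mendelian_consistent(child, mother, father):
    """Return True if the child's two alleles can be explained by
    one allele inherited from the mother and one from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def mie_rate(trio_genotypes):
    """Fraction of fully genotyped sites that violate Mendelian inheritance.

    trio_genotypes: iterable of (child, mother, father) tuples, where each
    genotype is a pair of allele indices such as (0, 1), or None if the
    genotype is missing. Sites with any missing genotype are skipped.
    """
    errors = 0
    total = 0
    for child, mother, father in trio_genotypes:
        if child is None or mother is None or father is None:
            continue
        total += 1
        if not is_mendelian_consistent(child, mother, father):
            errors += 1
    return errors / total if total else 0.0
```

For example, a heterozygous child (0, 1) with two homozygous-reference parents (0, 0) counts as an error, whereas the same child with parents (0, 0) and (1, 1) does not. Multiplying the returned fraction by 100 gives a percentage comparable to the figure quoted in the abstract.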
Affiliation(s)
- Jenna Kalleberg
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
- Jacob Rissman
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
- Robert D Schnabel
  - University of Missouri, Division of Animal Sciences, Columbia, MO, 65201 USA
  - University of Missouri, Genetics Area Program, Columbia, MO, 65201 USA
|
26
|
Du X, Sun Y, Fu T, Gao T, Zhang T. Research Progress and Applications of Bovine Genome in the Tribe Bovini. Genes (Basel) 2024; 15:509. [PMID: 38674443 PMCID: PMC11050176 DOI: 10.3390/genes15040509] [Received: 03/22/2024] [Revised: 04/16/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Various bovine species have been domesticated and bred for thousands of years, and they provide adequate animal-derived products, including meat, milk, and leather, to meet human requirements. Despite the review studies on economic traits in cattle, the genetic basis of traits has only been partially explained by phenotype and pedigree breeding methods, due to the complexity of genomic regulation during animal development and growth. With the advent of next-generation sequencing technology, genomics projects, such as the 1000 Bull Genomes Project, Functional Annotation of Animal Genomes project, and Bovine Pangenome Consortium, have advanced bovine genomic research. These large-scale genomics projects gave us a comprehensive concept, technology, and public resources. In this review, we summarize the genomics research progress of the main bovine species during the past decade, including cattle (Bos taurus), yak (Bos grunniens), water buffalo (Bubalus bubalis), zebu (Bos indicus), and gayal (Bos frontalis). We mainly discuss the development of genome sequencing and functional annotation, focusing on how genomic analysis reveals genetic variation and its impact on phenotypes in several bovine species.
Affiliation(s)
- Xingjie Du
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan International Joint Laboratory of Nutrition Regulation and Ecological Raising of Domestic Animal, College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
- Yu Sun
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan International Joint Laboratory of Nutrition Regulation and Ecological Raising of Domestic Animal, College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
- Tong Fu
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan International Joint Laboratory of Nutrition Regulation and Ecological Raising of Domestic Animal, College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
- Tengyun Gao
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan International Joint Laboratory of Nutrition Regulation and Ecological Raising of Domestic Animal, College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
- Tianliu Zhang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan International Joint Laboratory of Nutrition Regulation and Ecological Raising of Domestic Animal, College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
|
27
|
Lawson DJ, Howard-McCombe J, Beaumont M, Senn H. How admixed captive breeding populations could be rescued using local ancestry information. Mol Ecol 2024:e17349. [PMID: 38634332 DOI: 10.1111/mec.17349] [Received: 09/14/2023] [Revised: 12/21/2023] [Accepted: 02/26/2024] [Indexed: 04/19/2024]
Abstract
This paper asks the question: can genomic information be used to recover a species that is already on the pathway to extinction due to genetic swamping from a related and more numerous population? We show that a breeding strategy in a captive breeding program can use whole genome sequencing to identify and remove segments of DNA introgressed through hybridisation. The proposed policy uses a generalized measure of kinship or heterozygosity accounting for local ancestry, that is, whether a specific genetic location was inherited from the target of conservation. We then show that optimizing these measures would minimize undesired ancestry while also controlling kinship and/or heterozygosity, in a simulated breeding population. The process is applied to real data representing the hybridized Scottish wildcat breeding population, with the result that it should be possible to breed out domestic cat ancestry. The ability to reverse introgression is a powerful tool brought about through the combination of sequencing with computational advances in ancestry estimation. Since it works best when applied early in the process, important decisions need to be made about which genetically distinct populations should benefit from it and which should be left to reform into a single population.
Affiliation(s)
- Daniel J Lawson
  - Institute of Statistical Sciences, School of Mathematics, University of Bristol, Bristol, UK
- Jo Howard-McCombe
  - RZSS WildGenes Laboratory, Conservation Department, Royal Zoological Society of Scotland, Edinburgh, UK
- Mark Beaumont
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Helen Senn
  - RZSS WildGenes Laboratory, Conservation Department, Royal Zoological Society of Scotland, Edinburgh, UK
|
28
|
Zhou Q, Ji F, Lin D, Liu X, Zhu Z, Ruan J. KSNP: a fast de Bruijn graph-based haplotyping tool approaching data-in time cost. Nat Commun 2024; 15:3126. [PMID: 38605047 PMCID: PMC11009271 DOI: 10.1038/s41467-024-47562-4] [Received: 06/20/2022] [Accepted: 04/04/2024] [Indexed: 04/13/2024] Open
Abstract
Long reads that cover more variants per read raise opportunities for accurate haplotype construction, whereas the genotype errors of single nucleotide polymorphisms pose great computational challenges for haplotyping tools. Here we introduce KSNP, an efficient haplotype construction tool based on the de Bruijn graph (DBG). KSNP leverages the ability of DBG in handling high-throughput erroneous reads to tackle the challenges. Compared to other notable tools in this field, KSNP achieves at least 5-fold speedup while producing comparable haplotype results. The time required for assembling human haplotypes is reduced to nearly the data-in time.
Affiliation(s)
- Qian Zhou
  - PengCheng Laboratory, Shenzhen, China
- Fahu Ji
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Dongxiao Lin
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Xianming Liu
  - PengCheng Laboratory, Shenzhen, China
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Zexuan Zhu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
- Jue Ruan
  - Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
|
29
|
Nie F, Ni P, Huang N, Zhang J, Wang Z, Xiao C, Luo F, Wang J. De novo diploid genome assembly using long noisy reads. Nat Commun 2024; 15:2964. [PMID: 38580638 PMCID: PMC10997618 DOI: 10.1038/s41467-024-47349-7] [Received: 01/04/2024] [Accepted: 03/25/2024] [Indexed: 04/07/2024] Open
Abstract
The high sequencing error rate has impeded the application of long noisy reads for diploid genome assembly. Most existing assemblers failed to generate high-quality phased assemblies using long noisy reads. Here, we present PECAT, a Phased Error Correction and Assembly Tool, for reconstructing diploid genomes from long noisy reads. We design a haplotype-aware error correction method that can retain heterozygote alleles while correcting sequencing errors. We combine a corrected read SNP caller and a raw read SNP caller to further improve the identification of inconsistent overlaps in the string graph. We use a grouping method to assign reads to different haplotype groups. PECAT efficiently assembles diploid genomes using Nanopore R9, PacBio CLR or Nanopore R10 reads only. PECAT generates more contiguous haplotype-specific contigs compared to other assemblers. Especially, PECAT achieves nearly haplotype-resolved assembly on B. taurus (Bison×Simmental) using Nanopore R9 reads and phase block NG50 with 59.4/58.0 Mb for HG002 using Nanopore R10 reads.
Affiliation(s)
- Fan Nie
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - National Center for Applied Mathematics in Hunan and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan, 411105, China
- Peng Ni
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Neng Huang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Jun Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Zhenyu Wang
  - Institute of Nanfan & Seed Industry, Guangdong Academy of Sciences, Guangdong, 510316, China
- Chuanle Xiao
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, #7 Jinsui Road, Tianhe District, Guangzhou, China
- Feng Luo
  - School of Computing, Clemson University, Clemson, SC, 29634-0974, USA
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Xiangjiang Laboratory, Changsha, 410205, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
|
30
|
Smith T, Olagunju T, Rosen B, Neibergs H, Becker G, Davenport K, Elsik C, Hadfield T, Koren S, Kuhn K, Rhie A, Shira K, Skibiel A, Stegemiller M, Thorne J, Villamediana P, Cockett N, Murdoch B. The first complete T2T Assemblies of Cattle and Sheep Y-Chromosomes uncover remarkable divergence in structure and gene content. Research Square 2024:rs.3.rs-4033388. [PMID: 38712074 PMCID: PMC11071540 DOI: 10.21203/rs.3.rs-4033388/v1] [Indexed: 05/08/2024]
Abstract
Reference genomes of cattle and sheep have lacked contiguous assemblies of the sex-determining Y chromosome. We assembled complete and gapless telomere-to-telomere (T2T) Y chromosomes for these species. The pseudo-autosomal regions were similar in length, but the total chromosome size was substantially different, with the cattle Y more than twice the length of the sheep Y. The length disparity was accounted for by an expanded ampliconic region in cattle. The genic amplification in cattle contrasts with pseudogenization in sheep, suggesting opposite evolutionary mechanisms since their divergence 18 MYA. The centromeres also differed dramatically despite the close relationship between these species at the overall genome sequence level. These Y chromosomes have been added to the current reference assemblies in GenBank, opening new opportunities for the study of evolution and variation while supporting efforts to improve sustainability in these important livestock species that generally use sire-driven genetic improvement strategies.
Affiliation(s)
- Timothy Smith
  - USDA, ARS, U.S. Meat Animal Research Center (USMARC)
- Sergey Koren
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
31
|
Creating diploid assemblies from Nanopore and Illumina reads with hypo-assembler. Nat Methods 2024; 21:560-561. [PMID: 38459387 DOI: 10.1038/s41592-023-02142-0] [Indexed: 03/10/2024]
|
32
|
Koren S, Bao Z, Guarracino A, Ou S, Goodwin S, Jenike KM, Lucas J, McNulty B, Park J, Rautiainen M, Rhie A, Roelofs D, Schneiders H, Vrijenhoek I, Nijbroek K, Ware D, Schatz MC, Garrison E, Huang S, McCombie WR, Miga KH, Wittenberg AH, Phillippy AM. Gapless assembly of complete human and plant chromosomes using only nanopore sequencing. bioRxiv 2024:2024.03.15.585294. [PMID: 38529488 PMCID: PMC10962732 DOI: 10.1101/2024.03.15.585294] [Indexed: 03/27/2024]
Abstract
The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.
Affiliation(s)
- Sergey Koren
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Zhigui Bao
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Baden-Württemberg, Germany
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
  - Human Technopole, Milan, Italy
- Shujun Ou
  - Ohio State University, Columbus, OH, USA
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Katharine M. Jenike
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Julian Lucas
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Brandy McNulty
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Jimin Park
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Mikko Rautiainen
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Arang Rhie
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Dick Roelofs
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Ilse Vrijenhoek
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Koen Nijbroek
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Michael C. Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Sanwen Huang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan, China
- Karen H. Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Adam M. Phillippy
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
33
|
Muthusamy PV, Vakayil Mani R, Kumari S, Kaur M, Bhaskar B, Raghavan Pillai R, Sajeev Kumar T, Anilkumar TV, Singh NS. Hybrid de novo and haplotype-resolved genome assembly of Vechur cattle - elucidating genetic variation. Front Genet 2024; 15:1338224. [PMID: 38510276 PMCID: PMC10952100 DOI: 10.3389/fgene.2024.1338224] [Received: 11/14/2023] [Accepted: 01/29/2024] [Indexed: 03/22/2024] Open
Abstract
Cattle contribute to the nutritional needs and economy of a place. The performance and fitness of cattle depend on the response and adaptation to local climatic conditions. Genomic and genetic studies are important for advancing cattle breeding, and availability of relevant reference genomes is essential. In the present study, the genome of a Vechur calf was sequenced on both short-read Illumina and long-read Nanopore sequencing platforms. The hybrid de novo assembly approach was deployed to obtain an average contig length of 1.97 Mbp and an N50 of 4.94 Mbp. By using a short-read genome sequence of the corresponding sire and dam, a haplotype-resolved genome was also assembled. In comparison to the taurine reference genome, we found 28,982 autosomal structural variants and 16,926,990 SNVs, with 883,544 SNVs homozygous in the trio samples. Many of these SNPs have been reported to be associated with various QTLs including growth, milk yield, and milk fat content, which are crucial determinants of cattle production. Furthermore, population genotype data analysis indicated that the present sample belongs to an Indian cattle breed forming a unique cluster of Bos indicus. Subsequent FST analysis revealed differentiation of the Vechur cattle genome at multiple loci, especially those regions related to whole body growth and cell division, especially IGF1, HMGA2, RRM2, and CD68 loci, suggesting a possible role of these genes in its small stature and better disease resistance capabilities in comparison with the local crossbreeds. This provides an opportunity to select and engineer cattle breeds optimized for local conditions.
Affiliation(s)
- Poorvishaa V. Muthusamy
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala, India
- Shivani Kumari
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala, India
- Manpreet Kaur
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala, India
- Balu Bhaskar
  - Kerala Livestock Development Board, Thiruvananthapuram, Kerala, India
- Thapasimuthu Vijayamma Anilkumar
  - School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala, India
  - Division of Experimental Pathology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Thiruvananthapuram, India
|
34
|
Filipović I, Marshall JM, Rašić G. Finding divergent sequences of homomorphic sex chromosomes via diploidized nanopore-based assembly from a single male. bioRxiv 2024:2024.02.29.582759. [PMID: 38464271 PMCID: PMC10925256 DOI: 10.1101/2024.02.29.582759] [Indexed: 03/12/2024]
Abstract
Although homomorphic sex chromosomes can have non-recombining regions with elevated sequence divergence between their complements, such divergence signals can be difficult to detect bioinformatically. If found in genomes of, e.g., insect pests, these sequences could be targeted by engineered genetic sexing and control systems. Here, we report an approach that can leverage long-read nanopore sequencing of a single XY male to identify divergent regions of homomorphic sex chromosomes. Long-read data are used for de novo genome assembly that is diploidized in a way that maximizes sex-specific differences between its haploid complements. We show that the correct assembly phasing is supported by the mapping of nanopore reads from the male's haploid Y-bearing sperm cells. The approach revealed a highly divergent region (HDR) near the centromere of the homomorphic sex chromosome of Aedes aegypti, the most important arboviral vector, for which there is great interest in creating new genetic control tools. HDR is located ~5 Mb downstream of the known male-determining locus on chromosome 1 and is significantly enriched for ovary-biased genes. While recombination in HDR ceased relatively recently (~1.4 MYA), HDR gametologs have divergent exons and introns of protein-coding genes, and most lncRNA genes became X-specific. Megabases of previously invisible sex-linked sequences provide new putative targets for engineering the genetic systems to control this deadly mosquito. Broadly, our approach expands the toolbox for studying the cryptic structure of sex chromosomes.
Affiliation(s)
- Igor Filipović
  - Mosquito Genomics, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Australia
  - The University of Queensland, School of Biological Sciences, St Lucia, QLD, Australia
- John M Marshall
  - Divisions of Biostatistics and Epidemiology, School of Public Health, University of California, Berkeley, CA, USA
  - Innovative Genomics Institute, University of California, Berkeley, CA, USA
- Gordana Rašić
  - Mosquito Genomics, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston QLD 4006, Australia
|
35
|
Porubsky D, Eichler EE. A 25-year odyssey of genomic technology advances and structural variant discovery. Cell 2024; 187:1024-1037. [PMID: 38290514 DOI: 10.1016/j.cell.2024.01.002] [Received: 10/07/2023] [Revised: 12/20/2023] [Accepted: 01/02/2024] [Indexed: 02/01/2024]
Abstract
This perspective focuses on advances in genome technology over the last 25 years and their impact on germline variant discovery within the field of human genetics. The field has witnessed tremendous technological advances from microarrays to short-read sequencing and now long-read sequencing. Each technology has provided genome-wide access to different classes of human genetic variation. We are now on the verge of comprehensive variant detection of all forms of variation for the first time with a single assay. We predict that this transition will further transform our understanding of human health and biology and, more importantly, provide novel insights into the dynamic mutational processes shaping our genomes.
Affiliation(s)
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
36
|
Huang J, Chen J, Shi M, Zheng J, Chen M, Wu L, Zhu H, Zheng Y, Wu Q, Wu F. Genome assembly provides insights into the genome evolution of Baccaurea ramiflora Lour. Sci Rep 2024; 14:4867. [PMID: 38418841 PMCID: PMC10901894 DOI: 10.1038/s41598-024-55498-4] [Received: 06/22/2023] [Accepted: 02/24/2024] [Indexed: 03/02/2024] Open
Abstract
Baccaurea ramiflora Lour., an evergreen tree of the Baccaurea genus of the Phyllanthaceae family, is primarily distributed in South Asia, Southeast Asia, and southern China, including southern Yunnan Province. It is a wild or semi-cultivated tree species with ornamental, edible, and medicinal value, exhibiting significant development potential. In this study, we present the whole-genome sequencing of B. ramiflora, employing a combination of PacBio SMRT and Illumina HiSeq 2500 sequencing techniques. The assembled genome size was 975.8 Mb, with a contig N50 of 509.33 kb and the longest contig measuring 7.74 Mb. The genome comprises approximately 73.47% highly repetitive sequences, of which 52.1% are long terminal repeat-retrotransposon sequences. A total of 29,172 protein-coding genes were predicted, of which 25,980 (89.06%) have been annotated. Additionally, 3452 non-coding RNAs were identified. Comparative genomic analysis revealed a close relationship between B. ramiflora and the Euphorbiaceae family, with both being sister groups that diverged approximately 59.9 million years ago. During the evolutionary process, B. ramiflora exhibited positive selection in 278 candidate genes. Synonymous substitution rate and collinearity analysis demonstrated that B. ramiflora underwent a single ancient genome-wide triploidization event, without recent genome-wide duplication events. This high-quality B. ramiflora genome provides a valuable resource for basic research and tree improvement programs focusing on the Phyllanthaceae family.
Affiliation(s)
- Jianjian Huang
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Jie Chen
  - College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang, 524088, Guangdong, China
- Min Shi
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Jiaqi Zheng
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Ming Chen
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Linjun Wu
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Hui Zhu
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Yuzhong Zheng
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Qinghan Wu
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
- Fengnian Wu
  - School of Life Sciences and Food Engineering, Hanshan Normal University, Chaozhou, 521041, Guangdong, China
|
37
|
Hénault M, Marsit S, Charron G, Landry CR. The genomic landscape of transposable elements in yeast hybrids is shaped by structural variation and genotype-specific modulation of transposition rate. eLife 2024; 12:RP89277. [PMID: 38411604 PMCID: PMC10911583 DOI: 10.7554/elife.89277] [Indexed: 02/28/2024] Open
Abstract
Transposable elements (TEs) are major contributors to structural genomic variation by creating interspersed duplications of themselves. In return, structural variants (SVs) can affect the genomic distribution of TE copies and shape their load. One long-standing hypothesis states that hybridization could trigger TE mobilization and thus increase TE load in hybrids. We previously tested this hypothesis (Hénault et al., 2020) by performing a large-scale evolution experiment by mutation accumulation (MA) on multiple hybrid genotypes within and between wild populations of the yeasts Saccharomyces paradoxus and Saccharomyces cerevisiae. Using aggregate measures of TE load with short-read sequencing, we found no evidence for TE load increase in hybrid MA lines. Here, we resolve the genomes of the hybrid MA lines with long-read phasing and assembly to precisely characterize the role of SVs in shaping the TE landscape. Highly contiguous phased assemblies of 127 MA lines revealed that SV types like polyploidy, aneuploidy, and loss of heterozygosity have large impacts on the TE load. We characterized 18 de novo TE insertions, indicating that transposition only has a minor role in shaping the TE landscape in MA lines. Because the scarcity of TE mobilization in MA lines provided insufficient resolution to confidently dissect transposition rate variation in hybrids, we adapted an in vivo assay to measure transposition rates in various S. paradoxus hybrid backgrounds. We found that transposition rates are not increased by hybridization, but are modulated by many genotype-specific factors including initial TE load, TE sequence variants, and mitochondrial DNA inheritance. Our results show the multiple scales at which TE load is shaped in hybrid genomes, being highly impacted by SV dynamics and finely modulated by genotype-specific variation in transposition rates.
Affiliation(s)
- Mathieu Hénault
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Département de biochimie, microbiologie et bioinformatique, Université Laval, Québec, Canada
- Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université Laval, Québec, Canada
- Université Laval Big Data Research Center (BDRC_UL), Québec, Canada
| | - Souhir Marsit
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Département de biochimie, microbiologie et bioinformatique, Université Laval, Québec, Canada
- Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université Laval, Québec, Canada
- Université Laval Big Data Research Center (BDRC_UL), Québec, Canada
- Département de biologie, Université Laval, Québec, Canada
| | - Guillaume Charron
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université Laval, Québec, Canada
- Université Laval Big Data Research Center (BDRC_UL), Québec, Canada
- Département de biologie, Université Laval, Québec, Canada
| | - Christian R Landry
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Département de biochimie, microbiologie et bioinformatique, Université Laval, Québec, Canada
- Quebec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université Laval, Québec, Canada
- Université Laval Big Data Research Center (BDRC_UL), Québec, Canada
- Département de biologie, Université Laval, Québec, Canada
| |
|
38
|
Xie WZ, Zheng YY, He W, Bi F, Li Y, Dou T, Zhou R, Guo YX, Deng G, Zhang W, Yuan MH, Sanz-Jimenez P, Zhu XT, Xu XD, Zhou ZW, Zhou ZW, Feng JW, Liu S, Li C, Yang Q, Hu C, Gao H, Dong T, Dang J, Guo Q, Cai W, Zhang J, Yi G, Song JM, Sheng O, Chen LL. Two haplotype-resolved genome assemblies for AAB allotriploid bananas provide insights into banana subgenome asymmetric evolution and Fusarium wilt control. PLANT COMMUNICATIONS 2024; 5:100766. [PMID: 37974402 PMCID: PMC10873913 DOI: 10.1016/j.xplc.2023.100766] [Received: 05/21/2023] [Revised: 11/06/2023] [Accepted: 11/13/2023] [Indexed: 11/19/2023]
Abstract
Bananas (Musa spp.) are one of the world's most important fruit crops and play a vital role in food security for many developing countries. Most banana cultivars are triploids derived from inter- and intraspecific hybridizations between the wild diploid ancestor species Musa acuminata (AA) and M. balbisiana (BB). We report two haplotype-resolved genome assemblies of the representative AAB-cultivated types, Plantain and Silk, and precisely characterize ancestral contributions by examining ancestry mosaics across the genome. Widespread asymmetric evolution is observed in their subgenomes, which can be linked to frequent homologous exchange events. We reveal the genetic makeup of triploid banana cultivars and verify that subgenome B is a rich source of disease resistance genes. Only 58.5% and 59.4% of Plantain and Silk genes, respectively, are present in all three haplotypes, with >50% of genes being differentially expressed alleles in different subgenomes. We observed that the number of upregulated genes in Plantain is significantly higher than that in Silk at one week post-inoculation with Fusarium wilt tropical race 4 (Foc TR4), which confirms that Plantain can initiate defense responses faster than Silk. Additionally, we compared genomic and transcriptomic differences among the genes related to carotenoid synthesis and starch metabolism between Plantain and Silk. Our study provides resources for better understanding the genomic architecture of cultivated bananas and has important implications for Musa genetics and breeding.
Affiliation(s)
- Wen-Zhao Xie
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China; College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Yu-Yu Zheng
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Weidi He
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Fangcheng Bi
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Yaoyao Li
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Tongxin Dou
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Run Zhou
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Yi-Xiong Guo
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Guiming Deng
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Wenhui Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Min-Hui Yuan
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
| | - Pablo Sanz-Jimenez
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Xi-Tong Zhu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
| | - Xin-Dong Xu
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
| | - Zu-Wen Zhou
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
| | - Zhi-Wei Zhou
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Jia-Wu Feng
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Siwen Liu
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Chunyu Li
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Qiaosong Yang
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Chunhua Hu
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Huijun Gao
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Tao Dong
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China
| | - Jiangbo Dang
- College of Horticulture and Landscape Architecture, Southwest University, Chongqing 400715, China
| | - Qigao Guo
- College of Horticulture and Landscape Architecture, Southwest University, Chongqing 400715, China
| | - Wenguo Cai
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
| | - Jianwei Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Ganjun Yi
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China.
| | - Jia-Ming Song
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China.
| | - Ou Sheng
- Institute of Fruit Tree Research, Guangdong Academy of Agricultural Sciences, Key Laboratory of South Subtropical Fruit Biology and Genetic Resource Utilization, Ministry of Agriculture and Rural Affairs, Guangdong Provincial Key Laboratory of Tropical and Subtropical Fruit Tree Research, Guangzhou 510640, China.
| | - Ling-Ling Chen
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China.
| |
|
39
|
Barela Hudgell MA, Momtaz F, Jafri A, Alekseyev MA, Smith LC. Local Genomic Instability of the SpTransformer Gene Family in the Purple Sea Urchin Inferred from BAC Insert Deletions. Genes (Basel) 2024; 15:222. [PMID: 38397211 PMCID: PMC10887614 DOI: 10.3390/genes15020222] [Received: 01/09/2024] [Revised: 02/02/2024] [Accepted: 02/05/2024] [Indexed: 02/25/2024] Open
Abstract
The SpTransformer (SpTrf) gene family in the purple sea urchin, Strongylocentrotus purpuratus, encodes immune response proteins. The genes are clustered, surrounded by short tandem repeats, and some are present in genomic segmental duplications. The genes share regions of sequence and include repeats in the coding exon. This complex structure is consistent with putative local genomic instability. Instability of the SpTrf gene cluster was tested by 10 days of growth of Escherichia coli harboring bacterial artificial chromosome (BAC) clones of sea urchin genomic DNA with inserts containing SpTrf genes. After the growth period, the BAC DNA inserts were analyzed for size and SpTrf gene content. Clones with multiple SpTrf genes showed a variety of deletions, including loss of one, most, or all genes from the cluster. Alternatively, a BAC insert with a single SpTrf gene was stable. BAC insert instability is consistent with variations in the gene family composition among sea urchins, the types of SpTrf genes in the family, and a reduction in the gene copy number in single coelomocytes. Based on the sequence variability among SpTrf genes within and among sea urchins, local genomic instability of the family may be important for driving sequence diversity in this gene family that would be of benefit to sea urchins in their arms race with marine microbes.
Affiliation(s)
- Megan A. Barela Hudgell
- Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
| | - Farhana Momtaz
- Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
| | - Abiha Jafri
- Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
| | - Max A. Alekseyev
- Department of Mathematics and the Computational Biology Institute, George Washington University, Washington, DC 20052, USA
| | - L. Courtney Smith
- Department of Biological Sciences, George Washington University, Washington, DC 20052, USA
| |
|
40
|
Ryazansky SS, Chen C, Potters M, Naumenko AN, Lukyanchikova V, Masri RA, Brusentsov II, Karagodin DA, Yurchenko AA, Dos Anjos VL, Haba Y, Rose NH, Hoffman J, Guo R, Menna T, Kelley M, Ferrill E, Schultz KE, Qi Y, Sharma A, Deschamps S, Llaca V, Mao C, Murphy TD, Baricheva EM, Emrich S, Fritz ML, Benoit JB, Sharakhov IV, McBride CS, Tu Z, Sharakhova MV. The chromosome-scale genome assembly for the West Nile vector Culex quinquefasciatus uncovers patterns of genome evolution in mosquitoes. BMC Biol 2024; 22:16. [PMID: 38273363 PMCID: PMC10809549 DOI: 10.1186/s12915-024-01825-0] [Received: 08/25/2023] [Accepted: 01/11/2024] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND Understanding genome organization and evolution is important for species involved in transmission of human diseases, such as mosquitoes. Anophelinae and Culicinae subfamilies of mosquitoes show striking differences in genome sizes, sex chromosome arrangements, behavior, and ability to transmit pathogens. However, the genomic basis of these differences is not fully understood. METHODS In this study, we used a combination of advanced genome technologies such as Oxford Nanopore Technology sequencing, Hi-C scaffolding, Bionano, and cytogenetic mapping to develop an improved chromosome-scale genome assembly for the West Nile vector Culex quinquefasciatus. RESULTS We then used this assembly to annotate odorant receptors, odorant binding proteins, and transposable elements. A genomic region containing male-specific sequences on chromosome 1 and a polymorphic inversion on chromosome 3 were identified in the Cx. quinquefasciatus genome. In addition, the genome of Cx. quinquefasciatus was compared with the genomes of other mosquitoes such as the malaria vectors An. coluzzii and An. albimanus, and the arbovirus vector Ae. aegypti. Our work confirms significant expansion of the two chemosensory gene families in Cx. quinquefasciatus, as well as a significant increase and relocation of the transposable elements in both Cx. quinquefasciatus and Ae. aegypti relative to the Anophelines. Phylogenetic analysis clarifies the divergence time between the mosquito species. Our study provides new insights into chromosomal evolution in mosquitoes and finds that the X chromosome of Anophelinae and the sex-determining chromosome 1 of Culicinae have a significantly higher rate of evolution than autosomes. CONCLUSION The improved Cx. quinquefasciatus genome assembly uncovered new details of mosquito genome evolution and has the potential to speed up the development of novel vector control strategies.
Affiliation(s)
- Sergei S Ryazansky
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Department of Molecular Genetics of Cell, NRC "Kurchatov Institute", Moscow, Russia
| | - Chujia Chen
- Genetics, Bioinformatics, Computational Biology Program, Virginia Polytechnic and State University, Blacksburg, VA, USA
| | - Mark Potters
- Department of Biochemistry, Virginia Polytechnic and State University, Blacksburg, USA
| | - Anastasia N Naumenko
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Department of Entomology, University of Maryland, College Park, MD, USA
| | - Varvara Lukyanchikova
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Group of Genomic Mechanisms of Development, Institute of Cytology and Genetics, Novosibirsk, Russia
- Laboratory of Structural and Functional Genomics, Novosibirsk State University, Novosibirsk, Russia
| | - Reem A Masri
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
| | - Ilya I Brusentsov
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Laboratory of Cell Differentiation Mechanisms, Institute of Cytology and Genetics, Novosibirsk, Russia
| | - Dmitriy A Karagodin
- Laboratory of Cell Differentiation Mechanisms, Institute of Cytology and Genetics, Novosibirsk, Russia
| | - Andrey A Yurchenko
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
| | - Vitor L Dos Anjos
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
| | - Yuki Haba
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
| | - Noah H Rose
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
| | - Jinna Hoffman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Rong Guo
- Department of Entomology, University of Maryland, College Park, MD, USA
| | - Theresa Menna
- Department of Entomology, University of Maryland, College Park, MD, USA
| | - Melissa Kelley
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Emily Ferrill
- County of San Diego Vector Control Program, San Diego, CA, USA
| | - Karen E Schultz
- Mosquito and Vector Management District of Santa Barbara County, Santa Barbara, CA, USA
| | - Yumin Qi
- Department of Biochemistry, Virginia Polytechnic and State University, Blacksburg, USA
| | - Atashi Sharma
- Department of Biochemistry, Virginia Polytechnic and State University, Blacksburg, USA
| | | | | | - Chunhong Mao
- Biocomplexity Institute & Initiative University of Virginia, Charlottesville, VA, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Elina M Baricheva
- Laboratory of Cell Differentiation Mechanisms, Institute of Cytology and Genetics, Novosibirsk, Russia
| | - Scott Emrich
- Department of Electrical Engineering & Computer Science, the University of Tennessee, Knoxville, TN, USA
| | - Megan L Fritz
- Department of Entomology, University of Maryland, College Park, MD, USA
| | - Joshua B Benoit
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Igor V Sharakhov
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Fralin Life Sciences Institute, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Department of Genetics and Cell Biology, Tomsk State University, Tomsk, Russia
| | - Carolyn S McBride
- Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
| | - Zhijian Tu
- Genetics, Bioinformatics, Computational Biology Program, Virginia Polytechnic and State University, Blacksburg, VA, USA
- Department of Biochemistry, Virginia Polytechnic and State University, Blacksburg, USA
- Fralin Life Sciences Institute, Virginia Polytechnic and State University, Blacksburg, VA, USA
| | - Maria V Sharakhova
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA, USA.
- Laboratory of Cell Differentiation Mechanisms, Institute of Cytology and Genetics, Novosibirsk, Russia.
- Fralin Life Sciences Institute, Virginia Polytechnic and State University, Blacksburg, VA, USA.
| |
|
41
|
Serra Mari R, Schrinner S, Finkers R, Ziegler FMR, Arens P, Schmidt MHW, Usadel B, Klau GW, Marschall T. Haplotype-resolved assembly of a tetraploid potato genome using long reads and low-depth offspring data. Genome Biol 2024; 25:26. [PMID: 38243222 PMCID: PMC10797741 DOI: 10.1186/s13059-023-03160-z] [Received: 08/31/2022] [Accepted: 12/27/2023] [Indexed: 01/21/2024] Open
Abstract
Potato is one of the world's major staple crops, and like many important crop plants, it has a polyploid genome. Polyploid haplotype assembly poses a major computational challenge. We introduce a novel strategy for the assembly of polyploid genomes and present an assembly of the autotetraploid potato cultivar Altus. Our method uses low-depth sequencing data from an offspring population to achieve chromosomal clustering and haplotype phasing on the assembly graph. Our approach generates high-quality assemblies of individual chromosomes with haplotype-specific sequence resolution of whole chromosome arms and can be applied in common breeding scenarios where collections of offspring are available.
Affiliation(s)
- Rebecca Serra Mari
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Sven Schrinner
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Algorithmic Bioinformatics, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Richard Finkers
- Gennovation B.V., Agro Business Park 10, 6708, PW, Wageningen, The Netherlands
- Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
| | - Freya Maria Rosemarie Ziegler
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Forschungszentrum Jülich, Institute of Bio and Geosciences, Bioinformatics (IBG-4), Jülich, Germany
- Bioeconomy Science Center, c/o Forschungszentrum Jülich, Jülich, Germany
- Biological Data Science, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Paul Arens
- Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
| | - Maximilian H-W Schmidt
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Forschungszentrum Jülich, Institute of Bio and Geosciences, Bioinformatics (IBG-4), Jülich, Germany
| | - Björn Usadel
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
- Forschungszentrum Jülich, Institute of Bio and Geosciences, Bioinformatics (IBG-4), Jülich, Germany.
- Bioeconomy Science Center, c/o Forschungszentrum Jülich, Jülich, Germany.
- Biological Data Science, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
| | - Gunnar W Klau
- Algorithmic Bioinformatics, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
- Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
| | - Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
- Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
| |
|
42
|
Wang XB, Lu HW, Liu QY, Li AL, Zhou HL, Zhang Y, Zhu TQ, Ruan J. An effective strategy for assembling the sex-limited chromosome. Gigascience 2024; 13:giae015. [PMID: 38626722 PMCID: PMC11020242 DOI: 10.1093/gigascience/giae015] [Received: 07/31/2023] [Revised: 01/17/2024] [Accepted: 03/15/2024] [Indexed: 04/18/2024] Open
Abstract
BACKGROUND Most currently available reference genomes lack the sequence map of sex-limited (such as Y and W) chromosomes, which results in incomplete assemblies that hinder further research on sex chromosomes. Recent advancements in long-read sequencing and population sequencing have provided the opportunity to assemble sex-limited chromosomes without the traditional complicated experimental efforts. FINDINGS We introduce the first computational method, Sorting long Reads of Y or other sex-limited chromosome (SRY), which achieves improved assembly results compared to flow sorting. Specifically, SRY outperforms in the heterochromatic region and demonstrates comparable performance in other regions. Furthermore, SRY enhances the capabilities of the hybrid assembly software, resulting in improved continuity and accuracy. CONCLUSIONS Our method enables true complete genome assembly and facilitates downstream research of sex-limited chromosomes.
Affiliation(s)
- Xiao-Bo Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- The Shennong Laboratory/Institute of Crop Molecular Breeding, Henan Academy of Agricultural Sciences, Zhengzhou 450002, China
| | - Hong-Wei Lu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
| | - Qing-You Liu
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
| | - A-Lun Li
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
| | - Hong-Ling Zhou
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
| | - Yong Zhang
- Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Tian-Qi Zhu
- National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
| |
|
43
|
Li X, Yu S, Cheng Z, Chang X, Yun Y, Jiang M, Chen X, Wen X, Li H, Zhu W, Xu S, Xu Y, Wang X, Zhang C, Wu Q, Hu J, Lin Z, Aury JM, Van de Peer Y, Wang Z, Zhou X, Wang J, Lü P, Zhang L. Origin and evolution of the triploid cultivated banana genome. Nat Genet 2024; 56:136-142. [PMID: 38082204 DOI: 10.1038/s41588-023-01589-3] [Received: 11/14/2022] [Accepted: 10/23/2023] [Indexed: 01/14/2024]
Abstract
Most fresh bananas belong to the Cavendish and Gros Michel subgroups. Here, we report chromosome-scale genome assemblies of Cavendish (1.48 Gb) and Gros Michel (1.33 Gb), defining three subgenomes, Ban, Dh and Ze, with Musa acuminata ssp. banksii, malaccensis and zebrina as their major ancestral contributors, respectively. The insertion of repeat sequences in the Fusarium oxysporum f. sp. cubense (Foc) tropical race 4 RGA2 (resistance gene analog 2) promoter was identified in most diploid and triploid bananas. We found that the receptor-like protein (RLP) locus, including Foc race 1-resistant genes, is absent in the Gros Michel Ze subgenome. We identified two NAP (NAC-like, activated by apetala3/pistillata) transcription factor homologs specifically and highly expressed in fruit that directly bind to the promoters of many fruit ripening genes and may be key regulators of fruit ripening. Our genome data should facilitate the breeding and super-domestication of bananas.
Affiliation(s)
- Xiuxiu Li
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Sheng Yu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Zhihao Cheng
- Haikou Experimental Station, National Key Laboratory for Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
| | - Xiaojun Chang
- Laboratory of Medicinal Plant Biotechnology, School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, China
| | - Yingzi Yun
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Mengwei Jiang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xuequn Chen
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xiaohui Wen
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| | - Hua Li
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Wenjun Zhu
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Shiyao Xu
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Yanbing Xu
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xianjun Wang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Chen Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China
- Fuzhou Institute of Oceanography, Minjiang University, Fuzhou, China
| | - Qiong Wu
- Haikou Experimental Station, National Key Laboratory for Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
| | - Jin Hu
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| | - Zhenguo Lin
- Department of Biology, Saint Louis University, St. Louis, MO, USA
| | - Jean-Marc Aury
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
| | - Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University and VIB Center for Plant Systems Biology, Ghent, Belgium.
- Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa.
- College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China.
| | - Zonghua Wang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China.
- Fuzhou Institute of Oceanography, Minjiang University, Fuzhou, China.
| | - Xiaofan Zhou
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangdong Laboratory for Lingnan Modern Agriculture, Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou, China.
| | | | - Peitao Lü
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Haixia Institute of Science and Technology, College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, China.
| | - Liangsheng Zhang
- Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China.
- Hainan Institute of Zhejiang University, Sanya, China.
| |
|
44
|
LoTempio J, Delot E, Vilain E. Benchmarking long-read genome sequence alignment tools for human genomics applications. PeerJ 2023; 11:e16515. [PMID: 38130927 PMCID: PMC10734412 DOI: 10.7717/peerj.16515] [Received: 07/25/2023] [Accepted: 11/02/2023] [Indexed: 12/23/2023]
Abstract
Background: The utility of long-read genome sequencing platforms has been shown in many fields including whole genome assembly, metagenomics, and amplicon sequencing. Less clear is the applicability of long reads to reference-guided human genomics, which is the foundation of genomic medicine. Here, we benchmark available platform-agnostic alignment tools on datasets from nanopore and single-molecule real-time platforms to understand their suitability in producing a genome representation. Results: For this study, we leveraged publicly available data from sample NA12878 generated on Oxford Nanopore and sample NA24385 on Pacific Biosciences platforms. We employed state-of-the-art sequence alignment tools including GraphMap2, long-read aligner (LRA), Minimap2, CoNvex Gap-cost alignMents for Long Reads (NGMLR), and Winnowmap2. Minimap2 and Winnowmap2 were computationally lightweight enough for use at scale, while GraphMap2 was not. NGMLR took a long time and required many resources, but produced alignments each time. LRA was fast, but only worked on Pacific Biosciences data. The tools disagreed widely on which reads to leave unaligned, affecting the end genome coverage and the number of discoverable breakpoints. No alignment tool independently resolved all large structural variants (1,001-100,000 base pairs) present in the Database of Genome Variants (DGV) for sample NA12878 or the truthset for NA24385. Conclusions: These results suggest a combined approach is needed for LRS alignments for human genomics. Specifically, leveraging alignments from three tools will be more effective in generating a complete picture of genomic variability. It should be best practice to use an analysis pipeline that generates alignments with both Minimap2 and Winnowmap2 as they are lightweight and yield different views of the genome. Depending on the question at hand, the data available, and the time constraints, NGMLR and LRA are good options for a third tool.
If computational resources and time are not a factor for a given case or experiment, NGMLR will provide another view, and another chance to resolve a case. LRA, while fast, did not work on the nanopore data for our cluster, but PacBio results were promising in that those computations completed faster than Minimap2. Due to its significant burden on computational resources and slow run time, GraphMap2 is not an ideal tool for exploration of a whole human genome generated on a long-read sequencing platform.
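The observation that aligners disagree on which reads to leave unaligned can be illustrated with a small set comparison. The following is a minimal sketch, not the authors' pipeline: it assumes the unaligned read IDs have already been extracted per tool, and the function name `unaligned_summary` and the read IDs are hypothetical.

```python
def unaligned_summary(unaligned_by_tool):
    """Compare sets of unaligned read IDs across alignment tools.

    unaligned_by_tool maps a tool name to an iterable of read IDs that
    the tool left unaligned (assumed to be extracted beforehand).
    Returns (left unaligned by every tool, left unaligned by at least
    one tool, reads left unaligned only by each individual tool).
    """
    sets = {tool: set(ids) for tool, ids in unaligned_by_tool.items()}
    if not sets:
        return set(), set(), {}
    all_tools = set.intersection(*sets.values())
    any_tool = set.union(*sets.values())
    only = {}
    for tool, ids in sets.items():
        # Reads that every other tool managed to align.
        others = set().union(*(s for t, s in sets.items() if t != tool))
        only[tool] = ids - others
    return all_tools, any_tool, only


# Hypothetical read IDs, for illustration only.
example = {
    "Minimap2": {"read1", "read2"},
    "Winnowmap2": {"read2", "read3"},
    "NGMLR": {"read2"},
}
all_tools, any_tool, only = unaligned_summary(example)
```

Reads in `only[tool]` were aligned by every other tool, which is where a combined, multi-aligner approach of the kind the abstract recommends would recover coverage.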
Affiliation(s)
- Jonathan LoTempio
- Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
- International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
| | - Emmanuele Delot
- Center for Genetic Medicine Research, Children’s National Hospital, Washington, DC, United States of America
- Department of Genomics and Precision Medicine, George Washington University, Washington, DC, United States of America
| | - Eric Vilain
- Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
- International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
| |
|
45
|
Jia P, Dong L, Yang X, Wang B, Bush SJ, Wang T, Lin J, Wang S, Zhao X, Xu T, Che Y, Dang N, Ren L, Zhang Y, Wang X, Liang F, Wang Y, Ruan J, Xia H, Zheng Y, Shi L, Lv Y, Wang J, Ye K. Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet. Genome Biol 2023; 24:277. [PMID: 38049885 PMCID: PMC10694985 DOI: 10.1186/s13059-023-03116-3] [Received: 08/25/2023] [Accepted: 11/21/2023] [Indexed: 12/06/2023]
Abstract
BACKGROUND Recent state-of-the-art sequencing technologies enable the investigation of challenging regions in the human genome and expand the scope of variant benchmarking datasets. Herein, we sequence a Chinese Quartet, comprising two monozygotic twin daughters and their biological parents, using four short and long sequencing platforms (Illumina, BGI, PacBio, and Oxford Nanopore Technology). RESULTS The long reads from the monozygotic twin daughters are phased into paternal and maternal haplotypes using the parent-child genetic map and for each haplotype. We also use long reads to generate haplotype-resolved whole-genome assemblies with completeness and continuity exceeding that of GRCh38. Using this Quartet, we comprehensively catalogue the human variant landscape, generating a dataset of 3,962,453 SNVs, 886,648 indels (< 50 bp), 9,726 large deletions (≥ 50 bp), 15,600 large insertions (≥ 50 bp), 40 inversions, 31 complex structural variants, and 68 de novo mutations which are shared between the monozygotic twin daughters. Variants underrepresented in previous benchmarks owing to their complexity (including those located at long repeat regions, complex structural variants, and de novo mutations) are systematically examined in this study. CONCLUSIONS In summary, this study provides high-quality haplotype-resolved assemblies and a comprehensive set of benchmarking resources for two Chinese monozygotic twin samples which, relative to existing benchmarks, offers expanded genomic coverage and insight into complex variant categories.
Affiliation(s)
- Peng Jia
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Lianhua Dong
- National Institute of Metrology, Beijing, 100029, China
| | - Xiaofei Yang
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
| | - Bo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Tingjie Wang
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Jiadong Lin
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Songbo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Xixi Zhao
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Tun Xu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Yizhuo Che
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Ningxin Dang
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Yujing Zhang
- National Institute of Metrology, Beijing, 100029, China
| | - Xia Wang
- National Institute of Metrology, Beijing, 100029, China
| | - Fan Liang
- GrandOmics Biosciences, Beijing, 100089, China
| | - Yang Wang
- GrandOmics Biosciences, Beijing, 100089, China
| | - Jue Ruan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
| | - Han Xia
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
| | - Yi Lv
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
| | - Jing Wang
- National Institute of Metrology, Beijing, 100029, China.
| | - Kai Ye
- National Local Joint Engineering Research Center for Precision Surgery & Regenerative Medicine, Center for Mathematical Medical, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, 710049, China.
- Genome Institute, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, 710061, China.
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China.
- Faculty of Science, Leiden University, Leiden, 2311EZ, The Netherlands.
| |
|
46
|
Woolley SA, Salavati M, Clark EL. Recent advances in the genomic resources for sheep. Mamm Genome 2023; 34:545-558. [PMID: 37752302 PMCID: PMC10627984 DOI: 10.1007/s00335-023-10018-z] [Received: 04/11/2023] [Accepted: 08/30/2023] [Indexed: 09/28/2023]
Abstract
Sheep (Ovis aries) provide a vital source of protein and fibre to human populations. In coming decades, as the pressures associated with rapidly changing climates increase, breeding sheep sustainably as well as producing enough protein to feed a growing human population will pose a considerable challenge for sheep production across the globe. High quality reference genomes and other genomic resources can help to meet these challenges by: (1) informing breeding programmes by adding a priori information about the genome, (2) providing tools such as pangenomes for characterising and conserving global genetic diversity, and (3) improving our understanding of fundamental biology using the power of genomic information to link cell, tissue and whole animal scale knowledge. In this review we describe recent advances in the genomic resources available for sheep, discuss how these might help to meet future challenges for sheep production, and provide some insight into what the future might hold.
Affiliation(s)
- Shernae A Woolley
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
| | - Mazdak Salavati
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK
- Scotland's Rural College, Parkgate, Barony Campus, Dumfries, DG1 3NE, UK
| | - Emily L Clark
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, UK.
| |
|
47
|
Zou C, Sapkota S, Figueroa-Balderas R, Glaubitz J, Cantu D, Kingham BF, Sun Q, Cadle-Davidson L. A multitiered haplotype strategy to enhance phased assembly and fine mapping of a disease resistance locus. PLANT PHYSIOLOGY 2023; 193:2321-2336. [PMID: 37706526 DOI: 10.1093/plphys/kiad494] [Received: 04/03/2023] [Revised: 07/10/2023] [Accepted: 08/17/2023] [Indexed: 09/15/2023]
Abstract
Fine mapping of quantitative trait loci (QTL) to dissect the genetic basis of traits of interest is essential to modern breeding practice. Here, we employed a multitiered haplotypic marker system to increase fine mapping accuracy through construction of a chromosome-level, haplotype-resolved parental genome, accurate detection of recombination sites, and allele-specific characterization of the transcriptome. In the first tier of this system, we applied the preexisting panel of 2,000 rhAmpSeq core genome markers that is transferable across the entire Vitis genus and provides a genomic resolution of 200 kb to 1 Mb. The second tier consisted of high-density haplotypic markers generated from Illumina skim sequencing data for samples enriched for relevant recombinations, increasing the potential resolution to hundreds of base pairs. We used this approach to dissect a novel Resistance to Plasmopara viticola-33 (RPV33) locus conferring resistance to grapevine downy mildew, narrowing the candidate region to only 0.46 Mb. In the third tier, we used allele-specific RNA-seq analysis to identify a cluster of 3 putative disease resistance RPP13-like protein 2 genes located tandemly in a nonsyntenic insertion as candidates for the disease resistance trait. In addition, combining the rhAmpSeq core genome haplotype markers and skim sequencing-derived high-density haplotype markers enabled chromosomal-level scaffolding and phasing of the grape Vitis × doaniana 'PI 588149' assembly, initially built solely from Pacific Biosciences (PacBio) high-fidelity (HiFi) reads, leading to the correction of 16 large-scale phasing errors. Our mapping strategy integrates high-density, phased genetic information with individual reference genomes to pinpoint the genetic basis of QTLs and will likely be widely adopted in highly heterozygous species.
Affiliation(s)
- Cheng Zou
- BRC Bioinformatics Facility, Institute of Biotechnology, Cornell University, Ithaca, NY, 14853, USA
| | - Surya Sapkota
- School of Integrative Plant Science, Cornell AgriTech, Cornell University, Geneva, NY 14456, USA
- Grape Genetics Research Unit, USDA-ARS, Geneva, NY 14456, USA
| | - Rosa Figueroa-Balderas
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Jeff Glaubitz
- BRC Bioinformatics Facility, Institute of Biotechnology, Cornell University, Ithaca, NY, 14853, USA
| | - Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Brewster F Kingham
- DNA Sequencing & Genotyping Center, Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
| | - Qi Sun
- BRC Bioinformatics Facility, Institute of Biotechnology, Cornell University, Ithaca, NY, 14853, USA
| | - Lance Cadle-Davidson
- School of Integrative Plant Science, Cornell AgriTech, Cornell University, Geneva, NY 14456, USA
- Grape Genetics Research Unit, USDA-ARS, Geneva, NY 14456, USA
| |
|
48
|
Jevit MJ, Castaneda C, Paria N, Das PJ, Miller D, Antczak DF, Kalbfleisch TS, Davis BW, Raudsepp T. Trio-binning of a hinny refines the comparative organization of the horse and donkey X chromosomes and reveals novel species-specific features. Sci Rep 2023; 13:20180. [PMID: 37978222 PMCID: PMC10656420 DOI: 10.1038/s41598-023-47583-x] [Received: 07/24/2023] [Accepted: 11/14/2023] [Indexed: 11/19/2023]
Abstract
We generated single haplotype assemblies from a hinny hybrid which significantly improved the gapless contiguity for horse and donkey autosomal genomes and the X chromosomes. We added over 15 Mb of missing sequence to both X chromosomes, 60 Mb to donkey autosomes and corrected numerous errors in donkey and some in horse reference genomes. We resolved functionally important X-linked repeats: the DXZ4 macrosatellite and ampliconic Equine Testis Specific Transcript Y7 (ETSTY7). We pinpointed the location of the pseudoautosomal boundaries (PAB) and determined the size of the horse (1.8 Mb) and donkey (1.88 Mb) pseudoautosomal regions (PARs). We discovered distinct differences in horse and donkey PABs: a testis-expressed gene, XKR3Y, spans horse PAB with exons 1-2 located in Y and exon 3 in the X-Y PAR, whereas the donkey XKR3Y is Y-specific. DXZ4 had a similar ~8 kb monomer in both species with 10 copies in horse and 20 in donkey. We assigned hundreds of copies of ETSTY7, a sequence horizontally transferred from Parascaris and massively amplified in equids, to horse and donkey X chromosomes and three autosomes. The findings and products contribute to molecular studies of equid biology and advance research on X-linked conditions, sex chromosome regulation and evolution in equids.
Affiliation(s)
- Matthew J Jevit
- School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
| | - Caitlin Castaneda
- School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA
| | - Nandina Paria
- Texas Scottish Rite Hospital for Children, Dallas, TX, 75219, USA
| | - Pranab J Das
- ICAR-National Research Centre on Pig, Rani, Guwahati, Assam, 781131, India
| | - Donald Miller
- Baker Institute for Animal Health, Cornell University, Ithaca, NY, 14853, USA
| | - Douglas F Antczak
- Baker Institute for Animal Health, Cornell University, Ithaca, NY, 14853, USA
| | - Theodore S Kalbfleisch
- Maxwell H. Gluck Equine Research Center, University of Kentucky, Lexington, KY, 40546, USA
| | - Brian W Davis
- School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA.
| | - Terje Raudsepp
- School of Veterinary Medicine, Texas A&M University, College Station, TX, 77843, USA.
| |
|
49
|
Delorean EE, Youngblood RC, Simpson SA, Schoonmaker AN, Scheffler BE, Rutter WB, Hulse-Kemp AM. Representing true plant genomes: haplotype-resolved hybrid pepper genome with trio-binning. FRONTIERS IN PLANT SCIENCE 2023; 14:1184112. [PMID: 38034563 PMCID: PMC10687446 DOI: 10.3389/fpls.2023.1184112] [Received: 03/11/2023] [Accepted: 10/17/2023] [Indexed: 12/02/2023]
Abstract
As sequencing costs decrease and availability of high-fidelity long-read sequencing increases, generating experiment-specific de novo genome assemblies becomes feasible. In many crop species, obtaining the genome of a hybrid or heterozygous individual is necessary for systems that do not tolerate inbreeding or for investigating important biological questions, such as hybrid vigor. However, most genome assembly methods that have been used in plants result in a single merged sequence representation that is not a biologically accurate representation of either haplotype within a diploid individual. The resulting genome assembly is often fragmented and exhibits a mosaic of the two haplotypes, referred to as haplotype-switching. Important haplotype-level information, such as causal mutations and structural variation, is therefore lost, causing difficulties in interpreting downstream analyses. To overcome this challenge, we have applied a method developed for animal genome assembly called trio-binning to an intraspecific hybrid of chili pepper (Capsicum annuum L. cv. HDA149 x Capsicum annuum L. cv. HDA330). We tested all currently available software tools for performing trio-binning, combined with multiple scaffolding technologies, including Bionano, to determine the optimal method of producing the best haplotype-resolved assembly. Ultimately, we produced highly contiguous, biologically true haplotype-resolved genome assemblies for each parent, with scaffold N50s of 266.0 Mb and 281.3 Mb, with 99.6% and 99.8% positioned into chromosomes, respectively. The assemblies captured 3.10 Gb and 3.12 Gb of the estimated 3.5 Gb chili pepper genome size. These assemblies represent the complete genome structure of the intraspecific hybrid, as well as the two parental genomes, and show measurable improvements over the currently available reference genomes. Our manuscript provides a valuable guide on how to apply trio-binning to other plant genomes.
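The core idea of trio-binning, assigning each offspring read to a parental haplotype before assembly, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the software evaluated in the study: the function names, the value of `k`, and the sequences are hypothetical, reverse complements are ignored, and real tools also normalise hit counts by the size of each parental k-mer set.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (k-mers seen only in the mother, k-mers seen only in the father)."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign an offspring read to the parent whose specific k-mers it shares most."""
    ks = kmers(read, k)
    m = len(ks & mat_only)
    p = len(ks & pat_only)
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Hypothetical toy sequences differing at a single base.
K = 4
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], K)
labels = [bin_read(r, mat_only, pat_only, K) for r in ["GTACGT", "GTTCGT", "ACGT"]]
```

Each bin of reads would then be assembled separately, which is how the approach avoids the haplotype-switching described in the abstract; unassigned reads are typically shared between both bins.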
Affiliation(s)
- Emily E. Delorean
- Genomics and Bioinformatics Research Unit, USDA-ARS, Raleigh, NC, United States
- Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
| | - Ramey C. Youngblood
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Starkville, MS, United States
| | - Sheron A. Simpson
- Genomics and Bioinformatics Research Unit, United States Department of Agriculture - Agriculture Research Service (USDA-ARS), Stoneville, MS, United States
| | - Ashley N. Schoonmaker
- Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
| | - Brian E. Scheffler
- Genomics and Bioinformatics Research Unit, United States Department of Agriculture - Agriculture Research Service (USDA-ARS), Stoneville, MS, United States
| | - William B. Rutter
- US Vegetable Laboratory, United States Department of Agriculture - Agriculture Research Service (USDA-ARS), Charleston, SC, United States
| | - Amanda M. Hulse-Kemp
- Genomics and Bioinformatics Research Unit, USDA-ARS, Raleigh, NC, United States
- Crop and Soil Sciences Department, North Carolina State University, Raleigh, NC, United States
| |
|
50
|
Bredemeyer KR, Hillier L, Harris AJ, Hughes GM, Foley NM, Lawless C, Carroll RA, Storer JM, Batzer MA, Rice ES, Davis BW, Raudsepp T, O'Brien SJ, Lyons LA, Warren WC, Murphy WJ. Single-haplotype comparative genomics provides insights into lineage-specific structural variation during cat evolution. Nat Genet 2023; 55:1953-1963. [PMID: 37919451 PMCID: PMC10845050 DOI: 10.1038/s41588-023-01548-y] [Received: 06/23/2023] [Accepted: 09/20/2023] [Indexed: 11/04/2023]
Abstract
The role of structurally dynamic genomic regions in speciation is poorly understood due to challenges inherent in diploid genome assembly. Here we reconstructed the evolutionary dynamics of structural variation in five cat species by phasing the genomes of three interspecies F1 hybrids to generate near-gapless single-haplotype assemblies. We discerned that cat genomes have a paucity of segmental duplications relative to great apes, explaining their remarkable karyotypic stability. X chromosomes were hotspots of structural variation, including enrichment with inversions in a large recombination desert with characteristics of a supergene. The X-linked macrosatellite DXZ4 evolves more rapidly than 99.5% of the genome, clarifying its role in felid hybrid incompatibility. Resolved sensory gene repertoires revealed functional copy number changes associated with ecomorphological adaptations, sociality and domestication. This study highlights the value of gapless genomes to reveal structural mechanisms underpinning karyotypic evolution, reproductive isolation and ecological niche adaptation.
Affiliation(s)
- Kevin R Bredemeyer
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
| | - LaDeana Hillier
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Andrew J Harris
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
| | - Graham M Hughes
- School of Biology & Environmental Sciences, University College Dublin, Dublin, Ireland
| | - Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
| | - Colleen Lawless
- School of Biology & Environmental Sciences, University College Dublin, Dublin, Ireland
| | - Rachel A Carroll
- Department of Animal Sciences, University of Missouri, Columbia, MO, USA
| | | | - Mark A Batzer
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Edward S Rice
- Department of Animal Sciences, University of Missouri, Columbia, MO, USA
| | - Brian W Davis
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
| | - Terje Raudsepp
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA
| | - Stephen J O'Brien
- Guy Harvey Oceanographic Center, Nova Southeastern University, Fort Lauderdale, FL, USA
| | - Leslie A Lyons
- Department of Veterinary Medicine & Surgery, University of Missouri, Columbia, MO, USA
| | - Wesley C Warren
- Department of Animal Sciences, University of Missouri, Columbia, MO, USA.
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA.
- Interdisciplinary Program in Genetics & Genomics, Texas A&M University, College Station, TX, USA.
| |
|