1
Trujillo-Montenegro JH, Rodríguez Cubillos MJ, Loaiza CD, Quintero M, Espitia-Navarro HF, Salazar Villareal FA, Viveros Valens CA, González Barrios AF, De Vega J, Duitama J, Riascos JJ. Unraveling the Genome of a High Yielding Colombian Sugarcane Hybrid. Frontiers in Plant Science 2021; 12:694859. [PMID: 34484261 PMCID: PMC8414525 DOI: 10.3389/fpls.2021.694859] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/13/2021] [Accepted: 06/07/2021] [Indexed: 05/04/2023]
Abstract
Recent developments in High Throughput Sequencing (HTS) technologies and bioinformatics, including improved read lengths and genome assemblers, allow the reconstruction of complex genomes with unprecedented quality and contiguity. Sugarcane has one of the most complicated genomes among grasses, with a haploid length of 1 Gbp and ploidy levels between 8 and 12. In this work, we present a genome assembly of the Colombian sugarcane hybrid CC 01-1940. Three types of sequencing technologies were combined for this assembly: PacBio long reads, Illumina paired short reads, and Hi-C reads. We achieved a median contig length of 34.94 Mbp and a total genome assembly of 903.2 Mbp. We annotated a total of 63,724 protein coding genes and performed a reconstruction and comparative analysis of the sucrose metabolism pathway. Nucleotide evolution measurements between orthologs from closely related species suggest that divergence between Saccharum officinarum and Saccharum spontaneum occurred <2 million years ago. Synteny analysis between CC 01-1940 and the S. spontaneum genome confirms the presence of translocation events between the species and a random contribution throughout the entire genome in current sugarcane hybrids. Analysis of RNA-Seq data from leaf and root tissue of contrasting sugarcane genotypes subjected to water stress treatments revealed 17,490 differentially expressed genes, of which 3,633 correspond to genes expressed exclusively in tolerant genotypes. We expect the resources presented here to serve as a source of information to improve the selection of new varieties in sugarcane breeding programs.
Affiliation(s)
- Jhon Henry Trujillo-Montenegro
  - Centro de Investigación de la Caña de Azúcar de Colombia (CENICAÑA), Cali, Colombia
  - Research Group in Bioinformatics, Department of Computer Science, Faculty of Engineering, Universidad Del Valle, Cali, Colombia
- María Juliana Rodríguez Cubillos
  - Grupo de Diseño de Productos y Procesos, Department of Chemical and Food Engineering, Faculty of Engineering, Universidad de los Andes, Bogotá, Colombia
- Manuel Quintero
  - Centro de Investigación de la Caña de Azúcar de Colombia (CENICAÑA), Cali, Colombia
- Andrés Fernando González Barrios
  - Grupo de Diseño de Productos y Procesos, Department of Chemical and Food Engineering, Faculty of Engineering, Universidad de los Andes, Bogotá, Colombia
- José De Vega
  - Earlham Institute, Norwich Research Park, Norwich, United Kingdom
- Jorge Duitama
  - Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia
- John J. Riascos
  - Centro de Investigación de la Caña de Azúcar de Colombia (CENICAÑA), Cali, Colombia
2
Miculan M, Nelissen H, Ben Hassen M, Marroni F, Inzé D, Pè ME, Dell’Acqua M. A forward genetics approach integrating genome-wide association study and expression quantitative trait locus mapping to dissect leaf development in maize (Zea mays). The Plant Journal: for Cell and Molecular Biology 2021; 107:1056-1071. [PMID: 34087008 PMCID: PMC8519057 DOI: 10.1111/tpj.15364] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/06/2020] [Accepted: 05/31/2021] [Indexed: 05/13/2023]
Abstract
The characterization of the genetic basis of maize (Zea mays) leaf development may support breeding efforts to obtain plants with higher vigor and productivity. In this study, a mapping panel of 197 biparental and multiparental maize recombinant inbred lines (RILs) was analyzed for multiple leaf traits at the seedling stage. RNA sequencing was used to estimate the transcription levels of 29 573 gene models in RILs and to derive 373 769 single nucleotide polymorphisms (SNPs), and a forward genetics approach combining these data was used to pinpoint candidate genes involved in leaf development. First, leaf traits were correlated with gene expression levels to identify transcript-trait correlations. Then, leaf traits were associated with SNPs in a genome-wide association (GWA) study. An expression quantitative trait locus mapping approach was followed to associate SNPs with gene expression levels, prioritizing candidate genes identified based on transcript-trait correlations and GWAs. Finally, a network analysis was conducted to cluster all transcripts in 38 co-expression modules. By integrating forward genetics approaches, we identified 25 candidate genes highly enriched for specific functional categories, providing evidence supporting the role of vacuolar proton pumps, cell wall effectors, and vesicular traffic controllers in leaf growth. These results tackle the complexity of leaf trait determination and may support precision breeding in maize.
Affiliation(s)
- Mara Miculan
  - Institute of Life Sciences, Scuola Superiore Sant’Anna, Pisa 56127, Italy
- Hilde Nelissen
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent 9052, Belgium
  - Center for Plant Systems Biology, VIB, Ghent 9052, Belgium
- Manel Ben Hassen
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent 9052, Belgium
  - Center for Plant Systems Biology, VIB, Ghent 9052, Belgium
- Fabio Marroni
  - IGA Technology Services, Udine 33100, Italy
  - Department of Agricultural, Food, Environmental and Animal Sciences (DI4A), University of Udine, Udine 33100, Italy
- Dirk Inzé
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent 9052, Belgium
  - Center for Plant Systems Biology, VIB, Ghent 9052, Belgium
- Mario Enrico Pè
  - Institute of Life Sciences, Scuola Superiore Sant’Anna, Pisa 56127, Italy
3
Renny-Byfield S, Baumgarten A. Repetitive DNA content in the maize genome is uncoupled from population stratification at SNP loci. BMC Genomics 2020; 21:98. [PMID: 32000670 PMCID: PMC6993463 DOI: 10.1186/s12864-020-6517-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/25/2019] [Accepted: 01/20/2020] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Repetitive DNA is a major component of plant genomes and is thought to be a driver of evolutionary novelty. Describing variation in repeat content among individuals and between populations is key to elucidating the evolutionary significance of repetitive DNA. However, the cost of producing reference genomes has limited large-scale intraspecific comparisons to a handful of model organisms where multiple reference genomes are available. RESULTS We examine repeat content variation in the genomes of 94 elite inbred maize lines using graph-based repeat clustering, a reference-free and rapid assay of repeat content. We examine population structure using genome-wide repeat profiles, and demonstrate that the stiff-stalk and non-stiff-stalk heterotic populations are homogeneous with regard to global repeat content. In contrast, and similar to previously reported results, the same individuals show clear differentiation and aggregate into two populations when population structure is examined using genome-wide SNPs. Additionally, we develop a novel k-mer-based technique to examine the chromosomal distribution of repeat clusters in silico and show a cluster-dependent association with gene density. CONCLUSION Our results indicate that global repeat content variation in the heterotic populations of maize has not diverged, and is uncoupled from population stratification at SNP loci. We show that repeat families exhibit divergent patterns with regard to chromosomal distribution: some repeat clusters accumulate in regions of high gene density, whereas others aggregate in regions of low gene density.
5
Moll KM, Zhou P, Ramaraj T, Fajardo D, Devitt NP, Sadowsky MJ, Stupar RM, Tiffin P, Miller JR, Young ND, Silverstein KAT, Mudge J. Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula. BMC Genomics 2017; 18:578. [PMID: 28778149 PMCID: PMC5545040 DOI: 10.1186/s12864-017-3971-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 05/04/2017] [Accepted: 07/31/2017] [Indexed: 12/16/2022] Open
Abstract
Background Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Results Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108), using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Conclusions Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.
Affiliation(s)
- Karen M Moll
  - National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
  - Montana State University, Center for Biofilm Engineering, Bozeman, MT, 59717, USA
- Peng Zhou
  - Department of Plant Biology, University of Minnesota, Saint Paul, MN, USA
- Thiruvarangan Ramaraj
  - National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
- Diego Fajardo
  - National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
- Nicholas P Devitt
  - National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
- Michael J Sadowsky
  - Department of Soil, Water & Climate, Plant and Microbial Biology and BioTechnology Institute, University of Minnesota, St. Paul, MN, USA
- Robert M Stupar
  - Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN, USA
- Peter Tiffin
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, USA
- Nevin D Young
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN, USA
- Joann Mudge
  - National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM, 87505, USA
6
Minio A, Lin J, Gaut BS, Cantu D. How Single Molecule Real-Time Sequencing and Haplotype Phasing Have Enabled Reference-Grade Diploid Genome Assembly of Wine Grapes. Frontiers in Plant Science 2017; 8:826. [PMID: 28567052 PMCID: PMC5434136 DOI: 10.3389/fpls.2017.00826] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 04/04/2017] [Accepted: 05/02/2017] [Indexed: 05/23/2023]
Affiliation(s)
- Andrea Minio
  - Department of Viticulture and Enology, University of California, Davis, Davis, CA, United States
- Jerry Lin
  - Department of Viticulture and Enology, University of California, Davis, Davis, CA, United States
- Brandon S. Gaut
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, Irvine, CA, United States
- Dario Cantu
  - Department of Viticulture and Enology, University of California, Davis, Davis, CA, United States
  - *Correspondence: Dario Cantu