1
Perumal S, Koh CS, Jin L, Buchwaldt M, Higgins EE, Zheng C, Sankoff D, Robinson SJ, Kagale S, Navabi ZK, Tang L, Horner KN, He Z, Bancroft I, Chalhoub B, Sharpe AG, Parkin IAP. A high-contiguity Brassica nigra genome localizes active centromeres and defines the ancestral Brassica genome. Nature Plants 2020; 6:929-941. [PMID: 32782408 PMCID: PMC7419231 DOI: 10.1038/s41477-020-0735-y] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 02/03/2020] [Accepted: 06/28/2020] [Indexed: 05/19/2023]
Abstract
It is only recently, with the advent of long-read sequencing technologies, that we are beginning to uncover previously uncharted regions of complex and inherently recursive plant genomes. To comprehensively study and exploit the genome of the neglected oilseed Brassica nigra, we generated two high-quality nanopore de novo genome assemblies. The N50 contig lengths for the two assemblies were 17.1 Mb (12 contigs), one of the best among 324 sequenced plant genomes, and 0.29 Mb (424 contigs), respectively, reflecting recent improvements in the technology. Comparison with a de novo short-read assembly corroborated genome integrity and quantified sequence-related error rates (0.2%). The contiguity and coverage allowed unprecedented access to low-complexity regions of the genome. Pericentromeric regions and coincidence of hypomethylation enabled localization of active centromeres and identified centromere-associated ALE family retro-elements that appear to have proliferated through relatively recent nested transposition events (<1 Ma). Genomic distances calculated based on synteny relationships were used to define a post-triplication Brassica-specific ancestral genome, and to calculate the extensive rearrangements that define the evolutionary distance separating B. nigra from its diploid relatives.
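For readers unfamiliar with the contiguity statistic quoted above, the following is a minimal Python sketch of how an N50 contig length is conventionally computed. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the paper or its assembly pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly (lengths in Mb), for illustration only.
toy_contigs = [40, 30, 20, 5, 5]
print(n50(toy_contigs))
```

In the toy assembly the two longest contigs (40 and 30) already cover more than half of the 100 Mb total, so the N50 is the length of the second contig. A larger N50 for the same total size means fewer, longer contigs.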
Affiliation(s)
- Sampath Perumal
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Chu Shin Koh
- Global Institute for Food Security, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Lingling Jin
- Department of Computing Science, Thompson Rivers University, Kamloops, British Columbia, Canada
- Miles Buchwaldt
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Erin E Higgins
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Chunfang Zheng
- Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada
- David Sankoff
- Department of Mathematics and Statistics, University of Ottawa, Ottawa, Ontario, Canada
- Sateesh Kagale
- National Research Council Canada, Saskatoon, Saskatchewan, Canada
- Zahra-Katy Navabi
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Global Institute for Food Security, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Lily Tang
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Kyla N Horner
- Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
- Zhesi He
- Department of Biology, University of York, York, UK
- Ian Bancroft
- Department of Biology, University of York, York, UK
- Boulos Chalhoub
- Institute of Crop Science, Zhejiang University, Hangzhou, China
- Andrew G Sharpe
- Global Institute for Food Security, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
2
Ouma WZ, Mejia-Guerra MK, Yilmaz A, Pareja-Tobes P, Li W, Doseff AI, Grotewold E. Important biological information uncovered in previously unaligned reads from chromatin immunoprecipitation experiments (ChIP-Seq). Sci Rep 2015; 5:8635. [PMID: 25727450 PMCID: PMC4345404 DOI: 10.1038/srep08635] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/12/2014] [Accepted: 01/12/2015] [Indexed: 12/04/2022] [Open Access]
Abstract
Establishing the architecture of gene regulatory networks (GRNs) relies on chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) methods that provide genome-wide transcription factor binding sites (TFBSs). ChIP-Seq furnishes millions of short reads that, after alignment, describe the genome-wide binding sites of a particular TF. However, in all organisms investigated an average of 40% of reads fail to align to the corresponding genome, with some datasets having as much as 80% of reads failing to align. We describe here the provenance of previously unaligned reads in ChIP-Seq experiments from animals and plants. We show that a substantial portion corresponds to sequences of bacterial and metazoan origin, irrespective of the ChIP-Seq chromatin source. Unforeseen was the finding that 30%–40% of unaligned reads were actually alignable. To validate these observations, we investigated the characteristics of the previously unaligned reads corresponding to TAL1, a human TF involved in lineage specification of hemopoietic cells. We show that, while unmapped ChIP-Seq read datasets contain foreign DNA sequences, additional TFBSs can be identified from the previously unaligned ChIP-Seq reads. Our results indicate that the re-evaluation of previously unaligned reads from ChIP-Seq experiments will significantly contribute to TF target identification and determination of emerging properties of GRNs.
Affiliation(s)
- Wilberforce Zachary Ouma
- [1] Molecular, Cellular and Developmental Biology (MCDB) Graduate Program, The Ohio State University, Columbus, OH, USA [2] Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, OH, USA
- Maria Katherine Mejia-Guerra
- [1] Molecular, Cellular and Developmental Biology (MCDB) Graduate Program, The Ohio State University, Columbus, OH, USA [2] Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, OH, USA
- Alper Yilmaz
- Department of Bioengineering, Yildiz Technical University, Istanbul, Turkey
- Pablo Pareja-Tobes
- Oh no sequences! Research group, Era7 Information Technologies SLU, Granada, Spain
- Wei Li
- [1] Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA [2] Department of Physiology and Cell Biology, Heart and Lung Research Institute, The Ohio State University, Columbus, OH, USA
- Andrea I Doseff
- [1] Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA [2] Department of Physiology and Cell Biology, Heart and Lung Research Institute, The Ohio State University, Columbus, OH, USA
- Erich Grotewold
- [1] Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, OH, USA [2] Department of Molecular Genetics, The Ohio State University, Columbus, OH, USA
3
Choi HI, Waminal NE, Park HM, Kim NH, Choi BS, Park M, Choi D, Lim YP, Kwon SJ, Park BS, Kim HH, Yang TJ. Major repeat components covering one-third of the ginseng (Panax ginseng C.A. Meyer) genome and evidence for allotetraploidy. The Plant Journal: For Cell and Molecular Biology 2014; 77:906-16. [PMID: 24456463 DOI: 10.1111/tpj.12441] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 12/11/2013] [Revised: 01/07/2014] [Accepted: 01/13/2014] [Indexed: 05/12/2023]
Abstract
Ginseng (Panax ginseng) is a famous medicinal herb, but the composition and structure of its genome are largely unknown. Here we characterized the major repeat components and inspected their distribution in the ginseng genome. By analyzing three repeat-rich bacterial artificial chromosome (BAC) sequences from ginseng, we identified complex insertion patterns of 34 long terminal repeat retrotransposons (LTR-RTs) and 11 LTR-RT derivatives accounting for more than 80% of the BAC sequences. The LTR-RTs were classified into three Ty3/gypsy (PgDel, PgTat and PgAthila) and two Ty1/copia (PgTork and PgOryco) families. Mapping of 30-Gbp Illumina whole-genome shotgun reads to the BAC sequences revealed that these five LTR-RT families occupy at least 34% of the ginseng genome. The Ty3/gypsy families were predominant, comprising 74 and 33% of the BAC sequences and the genome, respectively. In particular, the PgDel family accounted for 29% of the genome and presumably played a major role in enlarging the ginseng genome. Fluorescence in situ hybridization (FISH) revealed that the PgDel1 elements are distributed throughout the chromosomes along dispersed heterochromatic regions except for ribosomal DNA blocks. The intensity of the PgDel2 FISH signals was biased toward 24 out of 48 chromosomes. Unique gene probes showed two pairs of signals with different locations, one pair in subtelomeric regions on PgDel2-rich chromosomes and the other in interstitial regions on PgDel2-poor chromosomes, demonstrating allotetraploidy in ginseng. Our findings promote understanding of the evolution of the ginseng genome and of that of related species in the Araliaceae.
Affiliation(s)
- Hong-Il Choi
- Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute for Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Korea
4
Zhao D, Jiang N. Nested insertions and accumulation of indels are negatively correlated with abundance of mutator-like transposable elements in maize and rice. PLoS One 2014; 9:e87069. [PMID: 24475224 PMCID: PMC3903597 DOI: 10.1371/journal.pone.0087069] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/26/2013] [Accepted: 12/23/2013] [Indexed: 11/29/2022] [Open Access]
Abstract
Mutator-like transposable elements (MULEs) are widespread in plants and were first discovered in maize where there are a total of 12,900 MULEs. In comparison, rice, with a much smaller genome, harbors over 30,000 MULEs. Since maize and rice are close relatives, the differential amplification of MULEs raised an inquiry into the underlying mechanism. We hypothesize this is partly attributed to the differential copy number of autonomous MULEs with the potential to generate the transposase that is required for transposition. To this end, we mined the two genomes and detected 530 and 476 MULEs containing transposase sequences (candidate coding-MULEs) in maize and rice, respectively. Over 1/3 of the candidate coding-MULEs harbor nested insertions and the ratios are similar in the two genomes. Among the maize elements with nested insertions, 24% have insertions in coding regions and over half of them harbor two or more insertions. In contrast, only 12% of the rice elements have insertions in coding regions and 19% have multiple insertions, suggesting that nested insertions in maize are more disruptive. This is because most nested insertions in maize are from LTR retrotransposons, which are large in size and are prevalent in the maize genome. Our results suggest that the amplification of retrotransposons may limit the amplification of DNA transposons but not vice versa. In addition, more indels are detected among maize elements than rice elements whereas defects caused by point mutations are comparable between the two species. Taken together, more disruptive nested insertions combined with higher frequency of indels resulted in few (6%) coding-MULEs that may encode functional transposases in maize. In contrast, 35% of the coding-MULEs in rice retain putative intact transposase. This is in addition to the higher expression frequency of rice coding-MULEs, which may explain the higher occurrence of MULEs in rice than that in maize.
Affiliation(s)
- Dongyan Zhao
- Department of Horticulture, Michigan State University, East Lansing, Michigan, United States of America
- Ning Jiang
- Department of Horticulture, Michigan State University, East Lansing, Michigan, United States of America
5
Raats D, Frenkel Z, Krugman T, Dodek I, Sela H, Simková H, Magni F, Cattonaro F, Vautrin S, Bergès H, Wicker T, Keller B, Leroy P, Philippe R, Paux E, Doležel J, Feuillet C, Korol A, Fahima T. The physical map of wheat chromosome 1BS provides insights into its gene space organization and evolution. Genome Biol 2013; 14:R138. [PMID: 24359668 PMCID: PMC4053865 DOI: 10.1186/gb-2013-14-12-r138] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 07/16/2013] [Accepted: 12/20/2013] [Indexed: 11/16/2022] [Open Access]
Abstract
Background: The wheat genome sequence is an essential tool for advanced genomic research and improvements. The generation of a high-quality wheat genome sequence is challenging due to its complex 17 Gb polyploid genome. To overcome these difficulties, sequencing through the construction of BAC-based physical maps of individual chromosomes is employed by the wheat genomics community. Here, we present the construction of the first comprehensive physical map of chromosome 1BS, and illustrate its unique gene space organization and evolution.
Results: Fingerprinted BAC clones were assembled into 57 long scaffolds, anchored and ordered with 2,438 markers, covering 83% of chromosome 1BS. The BAC-based chromosome 1BS physical map and gene order of the orthologous regions of model grass species were consistent, providing strong support for the reliability of the chromosome 1BS assembly. The gene space for chromosome 1BS spans the entire length of the chromosome arm, with 76% of the genes organized in small gene islands, accompanied by a two-fold increase in gene density from the centromere to the telomere.
Conclusions: This study provides new evidence on common and chromosome-specific features in the organization and evolution of the wheat genome, including a non-uniform distribution of gene density along the centromere-telomere axis, abundance of non-syntenic genes, the degree of colinearity with other grass genomes and a non-uniform size expansion along the centromere-telomere axis compared with other model cereal genomes. The high-quality physical map constructed in this study provides a solid basis for the assembly of a reference sequence of chromosome 1BS and for breeding applications.
6
Gottlieb A, Müller HG, Massa AN, Wanjugi H, Deal KR, You FM, Xu X, Gu YQ, Luo MC, Anderson OD, Chan AP, Rabinowicz P, Devos KM, Dvorak J. Insular organization of gene space in grass genomes. PLoS One 2013; 8:e54101. [PMID: 23326580 PMCID: PMC3543359 DOI: 10.1371/journal.pone.0054101] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 08/07/2012] [Accepted: 12/06/2012] [Indexed: 01/28/2023] [Open Access]
Abstract
Wheat and maize genes were hypothesized to be clustered into islands but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process and thus are not clustered are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands “gene insulae” to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb, in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion create gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes including the maize genome suggests functional constraints on gene distribution in the grass genomes.
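The contrast between a homogeneous Poisson model and clustered gene locations can be illustrated with a simple diagnostic. The sketch below is not the density-function comparison used in the paper; it swaps in the variance-to-mean ratio (index of dispersion) of gene counts in fixed windows, which is expected to be close to 1 under a homogeneous Poisson process and larger when genes cluster. All names, the simulated coordinates, and parameters such as island size and window count are hypothetical.

```python
import random
from statistics import mean, pvariance


def dispersion_index(positions, genome_length, n_windows):
    """Variance-to-mean ratio of gene counts in equal-sized windows."""
    window = genome_length / n_windows
    counts = [0] * n_windows
    for pos in positions:
        # Clamp so a position at or past the end falls in the last window.
        idx = min(int(pos // window), n_windows - 1)
        counts[idx] += 1
    return pvariance(counts) / mean(counts)


rng = random.Random(0)   # fixed seed so the sketch is reproducible
GENOME = 2_000_000       # hypothetical chromosome length (bp)
WINDOWS = 200            # gives 10-kb windows

# Hypothetical "uniform" genome: 2,000 genes placed independently at random.
uniform_genes = [rng.uniform(0, GENOME) for _ in range(2000)]

# Hypothetical "clustered" genome: 500 islands of 4 genes within 2 kb.
clustered_genes = []
for _ in range(500):
    island_start = rng.uniform(0, GENOME)
    clustered_genes.extend(island_start + rng.uniform(0, 2000) for _ in range(4))

uniform_ratio = dispersion_index(uniform_genes, GENOME, WINDOWS)
clustered_ratio = dispersion_index(clustered_genes, GENOME, WINDOWS)
print(f"uniform: {uniform_ratio:.2f}, clustered: {clustered_ratio:.2f}")
```

Both simulated genomes have the same number of genes and the same mean count per window, so any difference in the ratio reflects how the genes are arranged rather than how many there are. The index is sensitive to window size, which is one reason a density-based comparison such as the one described in the abstract may be preferable on real data.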
Affiliation(s)
- Andrea Gottlieb
- Department of Statistics, University of California Davis, Davis, California, United States of America
- Hans-Georg Müller
- Department of Statistics, University of California Davis, Davis, California, United States of America
- Alicia N. Massa
- Institute of Plant Breeding, Genetics and Genomics (Department of Crop and Soil Sciences), Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Humphrey Wanjugi
- USDA/ARS Western Research Center, Albany, California, United States of America
- Karin R. Deal
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Frank M. You
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Xiangyang Xu
- Institute of Plant Breeding, Genetics and Genomics (Department of Crop and Soil Sciences), Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Yong Q. Gu
- USDA/ARS Western Research Center, Albany, California, United States of America
- Ming-Cheng Luo
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Olin D. Anderson
- USDA/ARS Western Research Center, Albany, California, United States of America
- Agnes P. Chan
- The J. Craig Venter Institute, Rockville, Maryland, United States of America
- Pablo Rabinowicz
- Institute for Genome Sciences, and Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Katrien M. Devos
- Institute of Plant Breeding, Genetics and Genomics (Department of Crop and Soil Sciences), Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Jan Dvorak
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
7
Kronmiller BA, Wise RP. TEnest 2.0: computational annotation and visualization of nested transposable elements. Methods Mol Biol 2013; 1057:305-319. [PMID: 23918438 DOI: 10.1007/978-1-62703-568-2_22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/02/2023]
Abstract
Grass genomes harbor a diverse and complex content of repeated sequences. Most of these repeats occur as abundant transposable elements (TEs), which present unique challenges to sequence, assemble, and annotate genomes. Multiple copies of Long Terminal Repeat (LTR) retrotransposons can hinder sequence assembly and also cause problems with gene annotation. TEs can also contain protein-encoding genes, the ancient remnants of which can mislead gene identification software if not correctly masked. Hence, accurate assembly is crucial for gene annotation. We present TEnest v2.0. TEnest computationally annotates and chronologically displays nested transposable elements. It uses organism-specific TE databases as a reference for reconstructing degraded TEs to their ancestral state and annotates repeats by iterative sequence alignment. The output is a graphical display of the chronological nesting structure together with the coordinate positions of all TE insertions. Linux command-line and Web versions of the TEnest software are available at www.wiselab.org and www.plantgdb.org/tool/, respectively.
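The idea of reading relative insertion order from nesting can be sketched in a few lines. This is not TEnest's algorithm, which reconstructs fragmented elements by iterative alignment; it is a simplified illustration that assumes TE coordinates are already annotated and that any two elements are either disjoint or strictly nested. The function name and the example coordinates are hypothetical.

```python
def nesting_depths(elements):
    """Map each element name to its nesting depth.

    elements: iterable of (name, start, end) tuples with start < end.
    Depth 0 means the element lies directly in the host sequence;
    depth k means it sits inside k enclosing elements, so it must
    have inserted after each of them.
    """
    # Sort by start; for equal starts, put the longer (outer) element first.
    ordered = sorted(elements, key=lambda e: (e[1], -e[2]))
    open_stack = []  # enclosing elements that have not yet ended
    depths = {}
    for name, start, end in ordered:
        # Discard elements that end before this one begins.
        while open_stack and open_stack[-1][2] < start:
            open_stack.pop()
        depths[name] = len(open_stack)
        open_stack.append((name, start, end))
    return depths


# Hypothetical annotation: C inserted into B, which inserted into A.
example = [("A", 0, 1000), ("B", 200, 600), ("C", 300, 400),
           ("E", 700, 800), ("D", 1200, 1500)]
print(nesting_depths(example))
```

Sorting the depths in ascending order gives one possible chronological reading of a nest, oldest first. Elements at the same depth in different nests cannot be ordered relative to each other from coordinates alone.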
Affiliation(s)
- Brent A Kronmiller
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA
8
Choulet F, Wicker T, Rustenholz C, Paux E, Salse J, Leroy P, Schlub S, Le Paslier MC, Magdelenat G, Gonthier C, Couloux A, Budak H, Breen J, Pumphrey M, Liu S, Kong X, Jia J, Gut M, Brunel D, Anderson JA, Gill BS, Appels R, Keller B, Feuillet C. Megabase level sequencing reveals contrasted organization and evolution patterns of the wheat gene and transposable element spaces. The Plant Cell 2010; 22:1686-701. [PMID: 20581307 PMCID: PMC2910976 DOI: 10.1105/tpc.110.074187] [Citation(s) in RCA: 199] [Impact Index Per Article: 13.3] [Received: 01/21/2010] [Revised: 05/26/2010] [Accepted: 06/08/2010] [Indexed: 05/18/2023]
Abstract
To improve our understanding of the organization and evolution of the wheat (Triticum aestivum) genome, we sequenced and annotated 13 Mb-sized contigs (18.2 Mb in total) originating from different regions of its largest chromosome, 3B (1 Gb), and produced a 2x chromosome survey by shotgun Illumina/Solexa sequencing. All regions carried genes irrespective of their chromosomal location. However, gene distribution was not random, with 75% of them clustered into small islands containing three genes on average. A twofold increase of gene density was observed toward the telomeres, likely due to high tandem and interchromosomal duplication events. A total of 3222 transposable elements were identified, including 800 new families. Most of them are complete but showed a highly nested structure spread over distances as large as 200 kb. A succession of amplification waves involving different transposable element families led to contrasted sequence compositions between the proximal and distal regions. Finally, with an estimate of 50,000 genes per diploid genome, our data suggest that wheat may have a higher gene number than other cereals. Indeed, comparisons with rice (Oryza sativa) and Brachypodium revealed that a high number of additional noncollinear genes are interspersed within a highly conserved ancestral grass gene backbone, supporting the idea of an accelerated evolution in the Triticeae lineages.
Affiliation(s)
- Frédéric Choulet
- Institut National de la Recherche Agronomique, Université Blaise Pascal, Unité Mixte de Recherche 1095 Genetics Diversity and Ecophysiology of Cereals, F-63100 Clermont-Ferrand, France.
9
Zhou S, Wei F, Nguyen J, Bechner M, Potamousis K, Goldstein S, Pape L, Mehan MR, Churas C, Pasternak S, Forrest DK, Wise R, Ware D, Wing RA, Waterman MS, Livny M, Schwartz DC. A single molecule scaffold for the maize genome. PLoS Genet 2009; 5:e1000711. [PMID: 19936062 PMCID: PMC2774507 DOI: 10.1371/journal.pgen.1000711] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Received: 07/01/2009] [Accepted: 10/05/2009] [Indexed: 11/18/2022] [Open Access]
Abstract
About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.
The maize genome contains abundant repeats interspersed by low-copy, gene-coding sequences that make it a challenge to sequence; consequently, current BAC sequence assemblies average 11 contigs per clone. The iMap deals with such complexity by the judicious integration of IBM genetic and B73 physical maps, but the B73 genome structure could differ from the IBM population because of genetic recombination and subsequent rearrangements. Accordingly, we report a genome-wide, high-resolution optical map of maize B73 genome that was constructed from the direct analysis of genomic DNA molecules without using genetic markers. The integration of optical and iMap resources with comparisons to FPC maps enabled a uniquely comprehensive and scalable assessment of a given BAC's sequence assembly, its placement within a FPC contig, and the location of this FPC contig within a chromosome-wide pseudomolecule. As such, the overall utility of the maize optical map for the validation of sequence assemblies has been significant and demonstrates the inherent advantages of single molecule platforms. Construction of the maize optical map represents the first physical map of a eukaryotic genome larger than 400 Mb that was created de novo from individual genomic DNA molecules.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Fusheng Wei
- Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- John Nguyen
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Mike Bechner
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Konstantinos Potamousis
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Steve Goldstein
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Louise Pape
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Michael R. Mehan
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Chris Churas
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Shiran Pasternak
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Dan K. Forrest
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Roger Wise
- Corn Insects and Crop Genetics Research, United States Department of Agriculture–Agricultural Research Service and Department of Plant Pathology, Iowa State University, Ames, Iowa, United States of America
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Plant, Soil, and Nutrition Research, United States Department of Agriculture–Agricultural Research Service, Ithaca, New York, United States of America
- Rod A. Wing
- Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Michael S. Waterman
- Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
- Miron Livny
- Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America