1
Economical Productivity of Maize Genotypes under Different Herbicides Application in Two Contrasting Climatic Conditions. SUSTAINABILITY 2022. [DOI: 10.3390/su14095629] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Maize ranks first in worldwide production and is an important source of human food and animal feed. Its production can be affected by management practices and climatic conditions. The objective of this study was to estimate the stability of yield and hundred-grain weight of six maize genotypes during two growing seasons at two locations, subjected to four treatments: T1, control without herbicide; T2, active substance nicosulfuron (commercial preparation Motivell); T3, active substance rimsulfuron (Tarot); and T4, active substance foramsulfuron (Equip). The additive main effects and multiplicative interaction (AMMI) model and the genotype plus genotype × environment interaction (GGE) biplot were used to estimate genotype by environment interaction (GEI). The results showed that the effects of genotype (G), year (Y), locality (L), treatment (T) and all interactions on hundred-grain weight were significant. The share of genotypes in the total phenotypic variance was 64.70%, while the share in total interaction was 26.88%. In the G × T interaction, the share of IPCA1 was 50.6% and the share of IPCA2 was 44.74%, which together comprised 94.80% of the interaction. The high share of IPCA1 in the total interaction points to the significance of genotype in total variation and interaction, while the high share of IPCA2 indicates a significant treatment effect. Genotype L-6 had the same hundred-grain weight (37.96 g) in both years of testing, while genotype L-1 showed the largest difference between years (4.46 g). This clearly indicates the influence of genotype, but also of stress caused by sulfonylureas and environmental factors. The maize genotypes with the highest hundred-grain weight, L-5 and L-6, also had the highest grain yields (4665 kg ha−1 and 4445 kg ha−1).
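For readers unfamiliar with the AMMI model named in this abstract, its usual textbook form is sketched below. This is the general formulation, not an equation taken from the paper, and all symbols are introduced here for illustration: Y_ij is the mean of genotype i in environment (or treatment) j, μ the grand mean, g_i and e_j the additive main effects, λ_k the singular value of interaction principal component axis k (IPCA k), α_ik and γ_jk the genotype and environment scores on that axis, and ρ_ij the residual.

```latex
Y_{ij} = \mu + g_i + e_j + \sum_{k=1}^{n} \lambda_k \, \alpha_{ik} \, \gamma_{jk} + \rho_{ij},
\qquad
\text{share of IPCA}\,k = \frac{\lambda_k^{2}}{\sum_{m} \lambda_m^{2}}
```

Under this formulation, the percentage attributed to an axis such as IPCA1 or IPCA2 is its share of the interaction sum of squares.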
2
Mukherjee K, Rossi M, Salmela L, Boucher C. Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph. Algorithms Mol Biol 2021; 16:6. [PMID: 34034751 PMCID: PMC8147420 DOI: 10.1186/s13015-021-00182-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2021] [Accepted: 04/13/2021] [Indexed: 11/10/2022]
Abstract
Genome-wide optical maps are high-resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single-molecule optical maps, which are called Rmaps. Unfortunately, there are very few choices for assembling Rmap data. There exists only one publicly available non-proprietary method for assembly and one proprietary software that is available via an executable. Furthermore, the publicly available method, by Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006), follows the overlap-layout-consensus (OLC) paradigm and is therefore unable to scale to relatively large genomes. The algorithm behind the proprietary method, Bionano Genomics' Solve, is largely unknown. In this paper, we extend the definition of bi-labels in the paired de Bruijn graph to the context of optical mapping data, and present the first de Bruijn graph based method for Rmap assembly. We implement our approach, which we refer to as RMAPPER, and compare its performance against the assembler of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) and Solve by Bionano Genomics on data from three genomes: E. coli, human, and climbing perch fish (Anabas testudineus). Our method was able to successfully run on all three genomes. The method of Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006) only successfully ran on E. coli. Moreover, on the human genome RMAPPER was at least 130 times faster than Bionano Solve, used five times less memory and produced the highest genome fraction with zero mis-assemblies. Our software, RMAPPER, is written in C++ and is publicly available under the GNU General Public License at https://github.com/kingufl/Rmapper .
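To make the idea of a de Bruijn graph over Rmap data more concrete, the following Python sketch builds a plain de Bruijn graph from quantized fragment sizes. It is a deliberately simplified illustration and not RMAPPER's bi-labelled construction: the function names, the binning by `bin_size`, the value of `k`, and the toy fragment sizes are all assumptions made here.

```python
from collections import defaultdict


def quantize(rmap, bin_size=1.0):
    """Map each fragment size (in kbp) to an integer bin so noisy sizes can match."""
    return tuple(round(size / bin_size) for size in rmap)


def build_de_bruijn(rmaps, k=3, bin_size=1.0):
    """Nodes are (k-1)-mers of quantized fragment sizes; each k-mer adds one edge."""
    graph = defaultdict(set)
    for rmap in rmaps:
        bins = quantize(rmap, bin_size)
        for i in range(len(bins) - k + 1):
            kmer = bins[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two hypothetical, overlapping Rmaps given as ordered fragment sizes in kbp.
rmaps = [
    [10.2, 5.1, 7.9, 3.0],
    [4.9, 8.1, 3.2, 12.0],
]
graph = build_de_bruijn(rmaps, k=3, bin_size=1.0)
for node, successors in sorted(graph.items()):
    print(node, "->", sorted(successors))
```

Because both toy Rmaps quantize to the shared k-mer (5, 8, 3), they meet at the node (5, 8), which is how overlaps appear in a de Bruijn graph without pairwise alignment.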
3
Raeisi Dehkordi S, Luebeck J, Bafna V. FaNDOM: Fast nested distance-based seeding of optical maps. PATTERNS (NEW YORK, N.Y.) 2021; 2:100248. [PMID: 34027500 PMCID: PMC8134938 DOI: 10.1016/j.patter.2021.100248] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/20/2021] [Revised: 03/08/2021] [Accepted: 04/01/2021] [Indexed: 12/25/2022]
Abstract
Optical mapping (OM) provides single-molecule readouts of fluorescently labeled sequence motifs on long fragments of DNA, resolved to nucleotide-level coordinates. With the advent of microfluidic technologies for analysis of DNA molecules, it is possible to inexpensively generate long OM data (>150 kbp) at high coverage. In addition to scaffolding for de novo assembly, OM data can be aligned to a reference genome for identification of genomic structural variants. We introduce FaNDOM (Fast Nested Distance Seeding of Optical Maps), an optical map alignment tool that greatly reduces the search space of the alignment process. On four benchmark human datasets, FaNDOM was significantly (4-14×) faster than competing tools while maintaining comparable sensitivity and specificity. We used FaNDOM to map variants in three cancer cell lines and identified many biologically interesting structural variants, including deletions, duplications, gene fusions and gene-disrupting rearrangements. FaNDOM is publicly available at https://github.com/jluebeck/FaNDOM.
Affiliation(s)
- Siavash Raeisi Dehkordi
  - Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Jens Luebeck
  - Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Bioinformatics & Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
- Vineet Bafna
  - Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA 92093, USA
4
Takahashi S, Oshige M, Katsura S. DNA Manipulation and Single-Molecule Imaging. Molecules 2021; 26:1050. [PMID: 33671359 PMCID: PMC7922115 DOI: 10.3390/molecules26041050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2020] [Revised: 02/12/2021] [Accepted: 02/14/2021] [Indexed: 11/22/2022]
Abstract
DNA replication, repair, and recombination in the cell play a significant role in the regulation of the inheritance, maintenance, and transfer of genetic information. To elucidate the biomolecular mechanism in the cell, some molecular models of DNA replication, repair, and recombination have been proposed. These biological studies have been conducted using bulk assays, such as gel electrophoresis. Because in bulk assays, several millions of biomolecules are subjected to analysis, the results of the biological analysis only reveal the average behavior of a large number of biomolecules. Therefore, revealing the elementary biological processes of a protein acting on DNA (e.g., the binding of protein to DNA, DNA synthesis, the pause of DNA synthesis, and the release of protein from DNA) is difficult. Single-molecule imaging allows the analysis of the dynamic behaviors of individual biomolecules that are hidden during bulk experiments. Thus, the methods for single-molecule imaging have provided new insights into almost all of the aspects of the elementary processes of DNA replication, repair, and recombination. However, in an aqueous solution, DNA molecules are in a randomly coiled state. Thus, the manipulation of the physical form of the single DNA molecules is important. In this review, we provide an overview of the unique studies on DNA manipulation and single-molecule imaging to analyze the dynamic interaction between DNA and protein.
Affiliation(s)
- Shunsuke Takahashi
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University, Hatoyama-cho, Hiki-gun, Saitama 350-0394, Japan
- Masahiko Oshige
  - Department of Environmental Engineering Science, Graduate School of Science and Technology, Gunma University, Kiryu, Gunma 376-8515, Japan
  - Gunma University Center for Food Science and Wellness (GUCFW), Maebashi, Gunma 371-8510, Japan
- Shinji Katsura
  - Department of Environmental Engineering Science, Graduate School of Science and Technology, Gunma University, Kiryu, Gunma 376-8515, Japan
  - Gunma University Center for Food Science and Wellness (GUCFW), Maebashi, Gunma 371-8510, Japan
5
Abid HZ, Young E, McCaffrey J, Raseley K, Varapula D, Wang HY, Piazza D, Mell J, Xiao M. Customized optical mapping by CRISPR-Cas9 mediated DNA labeling with multiple sgRNAs. Nucleic Acids Res 2021; 49:e8. [PMID: 33231685 PMCID: PMC7826249 DOI: 10.1093/nar/gkaa1088] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/02/2020] [Revised: 10/16/2020] [Accepted: 10/27/2020] [Indexed: 01/01/2023]
Abstract
Whole-genome mapping technologies have been developed as a complementary tool to provide scaffolds for genome assembly and structural variation analysis (1,2). We recently introduced a novel DNA labeling strategy based on a CRISPR-Cas9 genome editing system, which can target any 20 bp sequence. The labeling strategy is especially useful in targeting repetitive sequences and sequences not accessible to other labeling methods. In this report, we present customized mapping strategies that extend the applications of CRISPR-Cas9 DNA labeling. In the first strategy, we design CRISPR-Cas9 labeling to interrogate and differentiate single-allele differences in NGG protospacer adjacent motifs (PAM sequences). Combined with sequence motif labeling, we can pinpoint single-base differences in highly conserved sequences. In the second strategy, we design mapping patterns across a genome by selecting sets of specific single-guide RNAs (sgRNAs) for labeling multiple loci of a genomic region or a whole genome. By developing and optimizing a single-tube synthesis of multiple sgRNAs, we demonstrate the utility of CRISPR-Cas9 mapping with 162 sgRNAs targeting the 2 Mb Haemophilus influenzae chromosome. These CRISPR-Cas9 mapping approaches could be particularly useful for applications in defining long-distance haplotypes and pinpointing the breakpoints in large structural variants in complex genomes and microbial mixtures.
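The targeting rule mentioned in this abstract, a 20 bp sequence followed by an NGG PAM, can be sketched in a few lines of Python. This is an illustrative scan of the forward strand only, not the authors' sgRNA design procedure; the function name and the toy sequences are assumptions introduced here.

```python
def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Only the given (forward) strand is scanned; a fuller tool would also
    scan the reverse complement.
    """
    seq = seq.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-base PAM.
    for start in range(len(seq) - protospacer_len - 2):
        pam = seq[start + protospacer_len:start + protospacer_len + 3]
        if pam[1:] == "GG":
            hits.append((start, seq[start:start + protospacer_len], pam))
    return hits


# Hypothetical alleles differing by one base inside the PAM (TGG vs. TGA).
allele_1 = "A" * 20 + "TGG" + "CCCCC"
allele_2 = "A" * 20 + "TGA" + "CCCCC"
print(find_cas9_targets(allele_1))
print(find_cas9_targets(allele_2))
```

In this toy case a single-base change in the PAM removes the only target site, which is the kind of allele difference the first strategy in the abstract sets out to distinguish.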
MESH Headings
- Alleles
- Base Sequence
- Benzoxazoles/analysis
- CRISPR-Cas Systems
- Chromosome Mapping/methods
- Chromosomes, Bacterial/genetics
- Computer Simulation
- Conserved Sequence/genetics
- DNA-Directed RNA Polymerases
- Drug Resistance, Bacterial/genetics
- Fluorescent Dyes/analysis
- Gene Editing/methods
- Genome, Bacterial
- Genome, Human
- Haemophilus influenzae/drug effects
- Haemophilus influenzae/genetics
- Haplotypes/genetics
- Humans
- Lab-On-A-Chip Devices
- Nalidixic Acid/pharmacology
- Novobiocin/pharmacology
- Nucleotide Motifs/genetics
- Polymorphism, Single Nucleotide
- Quinolinium Compounds/analysis
- RNA, Guide, CRISPR-Cas Systems/chemical synthesis
- RNA, Guide, CRISPR-Cas Systems/genetics
- Repetitive Sequences, Nucleic Acid/genetics
- Sequence Alignment
- Staining and Labeling/methods
- Viral Proteins
Affiliation(s)
- Heba Z Abid
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Eleanor Young
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Jennifer McCaffrey
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Kaitlin Raseley
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Dharma Varapula
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Hung-Yi Wang
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
- Danielle Piazza
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
  - Department of Microbiology and Immunology, College of Medicine, Drexel University, Philadelphia, PA, USA
  - Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, PA, USA
- Joshua Mell
  - Department of Microbiology and Immunology, College of Medicine, Drexel University, Philadelphia, PA, USA
  - Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, PA, USA
- Ming Xiao
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA, USA
  - Center for Genomic Sciences, Institute of Molecular Medicine and Infectious Disease, Drexel University, Philadelphia, PA, USA
6
Wegary D, Teklewold A, Prasanna BM, Ertiro BT, Alachiotis N, Negera D, Awas G, Abakemal D, Ogugo V, Gowda M, Semagn K. Molecular diversity and selective sweeps in maize inbred lines adapted to African highlands. Sci Rep 2019; 9:13490. [PMID: 31530852 PMCID: PMC6748982 DOI: 10.1038/s41598-019-49861-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/01/2019] [Accepted: 08/28/2019] [Indexed: 11/08/2022]
Abstract
Little is known about maize germplasm adapted to the African highland agro-ecologies. In this study, we analyzed high-density genotyping by sequencing (GBS) data of 298 African highland adapted maize inbred lines to (i) assess the extent of genetic purity, genetic relatedness, and population structure, and (ii) identify genomic regions that have undergone selection (selective sweeps) in response to adaptation to highland environments. Nearly 91% of the pairs of inbred lines differed by 30-36% of the scored alleles, but only 32% of the pairs of inbred lines had a relative kinship coefficient <0.050, which suggests the presence of substantial redundancy in allelic composition that may be due to repeated use of fewer genetic backgrounds (source germplasm) during line development. Results from different genetic relatedness and population structure analyses revealed three different groups, which generally agree with pedigree information and breeding history, but less so with heterotic groups and endosperm modification. We identified 944 single nucleotide polymorphism (SNP) markers that fell within 22 selective sweeps harboring 265 protein-coding candidate genes, some of which had known functions. Details of the candidate genes with known functions and differences in nucleotide diversity among groups predicted based on multivariate methods are discussed.
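The statement that pairs of inbred lines "differed by 30-36% of the scored alleles" refers to a simple pairwise proportion, which the Python sketch below illustrates. It assumes fully homozygous lines coded as one character per SNP with "N" for missing calls; the function name, the coding scheme and the toy genotypes are hypothetical and not taken from the paper.

```python
def allele_difference(line_a, line_b, missing="N"):
    """Fraction of SNPs scored in both lines at which the two lines differ."""
    scored = [(a, b) for a, b in zip(line_a, line_b)
              if a != missing and b != missing]
    if not scored:
        raise ValueError("no SNPs were scored in both lines")
    return sum(a != b for a, b in scored) / len(scored)


# Two hypothetical inbred lines genotyped at five SNPs (last call missing in one).
print(allele_difference("AACGT", "AATGN"))
```

Here four SNPs are scored in both lines and one differs, so the pair differs at a quarter of the scored alleles.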
Affiliation(s)
- Dagne Wegary
  - International Maize and Wheat Improvement Center (CIMMYT) - Ethiopia Office, ILRI Campus, CMC Road, Gurd Sholla, P.O. Box 5689, Addis Ababa, Ethiopia
- Adefris Teklewold
  - International Maize and Wheat Improvement Center (CIMMYT) - Ethiopia Office, ILRI Campus, CMC Road, Gurd Sholla, P.O. Box 5689, Addis Ababa, Ethiopia
- Boddupalli M Prasanna
  - International Maize and Wheat Improvement Center (CIMMYT), ICRAF House, United Nations Avenue, Gigiri, P.O. Box 1041-00621, Nairobi, Kenya
- Berhanu T Ertiro
  - Bako National Maize Research Center, Ethiopian Institute of Agricultural Research (EIAR), Addis Ababa, Ethiopia
- Nikolaos Alachiotis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, 70013, Heraklion, Crete, Greece
- Demewez Negera
  - International Maize and Wheat Improvement Center (CIMMYT) - Ethiopia Office, ILRI Campus, CMC Road, Gurd Sholla, P.O. Box 5689, Addis Ababa, Ethiopia
- Geremew Awas
  - International Maize and Wheat Improvement Center (CIMMYT) - Ethiopia Office, ILRI Campus, CMC Road, Gurd Sholla, P.O. Box 5689, Addis Ababa, Ethiopia
- Demissew Abakemal
  - Ambo Agricultural Research Center, P.O. Box 37, West Shoa, Ambo, Ethiopia
- Veronica Ogugo
  - International Maize and Wheat Improvement Center (CIMMYT), ICRAF House, United Nations Avenue, Gigiri, P.O. Box 1041-00621, Nairobi, Kenya
- Manje Gowda
  - International Maize and Wheat Improvement Center (CIMMYT), ICRAF House, United Nations Avenue, Gigiri, P.O. Box 1041-00621, Nairobi, Kenya
- Kassa Semagn
  - International Maize and Wheat Improvement Center (CIMMYT), ICRAF House, United Nations Avenue, Gigiri, P.O. Box 1041-00621, Nairobi, Kenya
  - Africa Rice Center (AfricaRice), M'bé Research Station, 01 B.P. 2551, Bouaké 01, Côte d'Ivoire
7
SNP-based mixed model association of growth- and yield-related traits in popcorn. PLoS One 2019; 14:e0218552. [PMID: 31237892 PMCID: PMC6592533 DOI: 10.1371/journal.pone.0218552] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/30/2019] [Accepted: 06/04/2019] [Indexed: 12/26/2022]
Abstract
The identification of the genes responsible for complex traits is highly promising for accelerating crop breeding, but such information is still limited for popcorn. Thus, in the present study, a mixed linear model-based association analysis (MLMA) was applied to six important popcorn traits: plant and ear height, 100-grain weight, popping expansion, grain yield and expanded popcorn volume per hectare. To this end, 196 plants of the open-pollinated popcorn population UENF-14 were sampled, selfed (S1), and then genotyped with a panel of 10,507 single nucleotide polymorphism (SNP) markers distributed throughout the genome. The six traits were studied in two environments [Campos dos Goytacazes-RJ (ENV1) and Itaocara-RJ (ENV2)] in an incomplete block design. Based on the phenotypic data of the S1 progenies and on the genetic characteristics of the parents, the MLMA was performed. Thereafter, genes annotated in the MaizeGDB platform were screened for potential linkage disequilibrium with the SNPs associated with the six evaluated traits. Overall, seven and eight genes were identified as associated with the traits in ENV1 and ENV2, respectively, and proteins encoded by these genes were evaluated for their function. The results obtained here increase knowledge of the genetic architecture of the six evaluated traits and might be used for marker-assisted selection in breeding programs.
8
Harper J, De Vega J, Swain S, Heavens D, Gasior D, Thomas A, Evans C, Lovatt A, Lister S, Thorogood D, Skøt L, Hegarty M, Blackmore T, Kudrna D, Byrne S, Asp T, Powell W, Fernandez-Fuentes N, Armstead I. Integrating a newly developed BAC-based physical mapping resource for Lolium perenne with a genome-wide association study across a L. perenne European ecotype collection identifies genomic contexts associated with agriculturally important traits. ANNALS OF BOTANY 2019; 123:977-992. [PMID: 30715119 PMCID: PMC6589518 DOI: 10.1093/aob/mcy230] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/25/2018] [Accepted: 11/28/2018] [Indexed: 05/27/2023]
Abstract
BACKGROUND AND AIMS: Lolium perenne (perennial ryegrass) is the most widely cultivated forage and amenity grass species in temperate areas worldwide and there is a need to understand the genetic architectures of key agricultural traits and crop characteristics that deliver wider environmental services. Our aim was to identify genomic regions associated with agriculturally important traits by integrating a bacterial artificial chromosome (BAC)-based physical map with a genome-wide association study (GWAS).
METHODS: BAC-based physical maps for L. perenne were constructed from ~212 000 high-information-content fingerprints using Fingerprint Contig and Linear Topology Contig software. BAC clones were associated with both BAC-end sequences and a partial minimum tiling path sequence. A panel of 716 L. perenne diploid genotypes from 90 European accessions was assessed in the field over 2 years, and genotyped using a Lolium Infinium SNP array. The GWAS was carried out using a linear mixed model implemented in TASSEL, and extended genomic regions associated with significant markers were identified through integration with the physical map.
KEY RESULTS: Between ~3600 and 7500 physical map contigs were derived, depending on the software and probability thresholds used, and integrated with ~35 k sequenced BAC clones to develop a resource predicted to span the majority of the L. perenne genome. From the GWAS, eight different loci were significantly associated with heading date, plant width, plant biomass and water-soluble carbohydrate accumulation, seven of which could be associated with physical map contigs. This allowed the identification of a number of candidate genes.
CONCLUSIONS: Combining the physical mapping resource with the GWAS has allowed us to extend the search for candidate genes across larger regions of the L. perenne genome and identified a number of interesting gene model annotations. These physical maps will aid in validating future sequence-based assemblies of the L. perenne genome.
Affiliation(s)
- J Harper
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- J De Vega
  - Earlham Institute, Norwich Research Park, Norwich, UK
- S Swain
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- D Heavens
  - Earlham Institute, Norwich Research Park, Norwich, UK
- D Gasior
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- A Thomas
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- C Evans
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- A Lovatt
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- S Lister
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- D Thorogood
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- L Skøt
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- M Hegarty
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- T Blackmore
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- D Kudrna
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- S Byrne
  - Teagasc, Department of Crop Science, Carlow, Ireland
- T Asp
  - Department of Molecular Biology and Genetics, Crop Genetics and Biotechnology, Aarhus University, Slagelse, Denmark
- W Powell
  - Scotland’s Rural College, Edinburgh, UK
- N Fernandez-Fuentes
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
- I Armstead
  - Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
9
Tiukova IA, Pettersson ME, Hoeppner MP, Olsen RA, Käller M, Nielsen J, Dainat J, Lantz H, Söderberg J, Passoth V. Chromosomal genome assembly of the ethanol production strain CBS 11270 indicates a highly dynamic genome structure in the yeast species Brettanomyces bruxellensis. PLoS One 2019; 14:e0215077. [PMID: 31042716 PMCID: PMC6493715 DOI: 10.1371/journal.pone.0215077] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/19/2018] [Accepted: 03/26/2019] [Indexed: 12/30/2022]
Abstract
Here, we present the genome of the industrial ethanol production strain Brettanomyces bruxellensis CBS 11270. The nuclear genome was found to be diploid, containing four chromosomes with sizes ranging from 2.2 to 4.0 Mbp. A 75 Kbp mitochondrial genome was also identified. Comparing the homologous chromosomes, we detected that 0.32% of nucleotides were polymorphic, i.e. formed single nucleotide polymorphisms (SNPs); 40.6% of them were found in coding regions (i.e. 0.13% of all nucleotides formed SNPs and were in coding regions). In addition, 8,538 indels were found. The total number of protein-coding genes was 4,897; of these, 4,284 were annotated on chromosomes, and the mitochondrial genome contained 18 protein-coding genes. Additionally, 595 annotated genes were on contigs not associated with chromosomes. A number of genes were duplicated, most of them as tandem repeats, including a six-gene cluster located on chromosome 3. There were also examples of interchromosomal gene duplications, including a duplication of a six-gene cluster, which was found on both chromosomes 1 and 4. Gene copy number analysis suggested loss of heterozygosity for 372 genes. This may reflect adaptation to relatively harsh but constant conditions of continuous fermentation. Analysis of gene topology showed that most of these losses occurred in clusters of more than one gene, the largest cluster comprising 33 genes. Comparative analysis against the wine isolate CBS 2499 revealed 88,534 SNPs and 8,133 indels. Moreover, when the scaffolds of the CBS 2499 genome assembly were aligned against the chromosomes of CBS 11270, many of them aligned completely, some had chunks aligned to different chromosomes, and some were in fact rearranged. Our findings indicate a highly dynamic genome within the species B. bruxellensis and a tendency towards reduction of gene number in long-term continuous cultivation.
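The percentages and gene counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the figures stated in the abstract; the variable names are chosen here for illustration.

```python
# Figures as stated in the abstract.
polymorphic_percent = 0.32      # % of nucleotides forming SNPs
coding_share_percent = 40.6     # % of those SNPs located in coding regions

# Percentage of all nucleotides that are SNPs inside coding regions.
coding_snp_percent = polymorphic_percent * coding_share_percent / 100
print(round(coding_snp_percent, 2))

# Protein-coding genes: chromosomal + unplaced contigs + mitochondrial.
chromosomal, unplaced, mitochondrial = 4284, 595, 18
total_genes = chromosomal + unplaced + mitochondrial
print(total_genes)
```

Both results agree with the abstract: the product rounds to 0.13%, and the three gene categories sum to the reported total of 4,897.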
Affiliation(s)
- Ievgeniia A. Tiukova
  - Chalmers University of Technology, Department of Biology and Biological Engineering, Systems and Synthetic Biology, Göteborg, Sweden
  - Swedish University of Agricultural Sciences, Department of Molecular Sciences, Uppsala, Sweden
- Mats E. Pettersson
  - Uppsala University, Department of Medical Biochemistry and Microbiology, Uppsala, Sweden
- Marc P. Hoeppner
  - Uppsala University, Department of Medical Biochemistry and Microbiology, Uppsala, Sweden
  - National Bioinformatics Infrastructure Sweden (NBIS), Uppsala, Sweden
  - Christian-Albrechts-University of Kiel, Institute of Clinical Molecular Biology, Kiel, Germany
- Remi-Andre Olsen
  - Science for Life Laboratory, Division of Gene Technology, School of Biotechnology, Royal Institute of Technology (KTH), Solna, Sweden
- Max Käller
  - Royal Institute of Technology, Biotechnology and Health, School of Engineering Sciences in Chemistry, SciLifeLab, Stockholm, Sweden
  - Stockholm University, Department of Biochemistry and Biophysics, SciLifeLab, Stockholm, Sweden
- Jens Nielsen
  - Chalmers University of Technology, Department of Biology and Biological Engineering, Systems and Synthetic Biology, Göteborg, Sweden
- Jacques Dainat
  - Uppsala University, Department of Medical Biochemistry and Microbiology, Uppsala, Sweden
  - National Bioinformatics Infrastructure Sweden (NBIS), Uppsala, Sweden
- Henrik Lantz
  - Uppsala University, Department of Medical Biochemistry and Microbiology, Uppsala, Sweden
  - National Bioinformatics Infrastructure Sweden (NBIS), Uppsala, Sweden
- Jonas Söderberg
  - Uppsala University, Department of Cell and Molecular Biology, Molecular Evolution, Uppsala, Sweden
- Volkmar Passoth
  - Swedish University of Agricultural Sciences, Department of Molecular Sciences, Uppsala, Sweden
10
Mukherjee K, Washimkar D, Muggli MD, Salmela L, Boucher C. Error correcting optical mapping data. Gigascience 2018; 7:5005021. [PMID: 29846578 PMCID: PMC6007263 DOI: 10.1093/gigascience/giy061] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/22/2017] [Accepted: 05/16/2018] [Indexed: 12/31/2022]
Abstract
Optical mapping is a unique system that is capable of producing high-resolution, high-throughput genomic map data that gives information about the structure of a genome. Recently it has been used for scaffolding contigs and for assembly validation in large-scale sequencing projects, including the maize, goat, and Amborella genomes. However, a major impediment to the use of this data is the variety and quantity of errors in the raw optical mapping data, which are called Rmaps. The challenges associated with using Rmap data are analogous to dealing with insertions and deletions in the alignment of long reads. Moreover, they are arguably harder to tackle since the data are numerical and susceptible to inaccuracy. We develop cOMet to error correct Rmap data, which to the best of our knowledge is the only optical mapping error correction method. Our experimental results demonstrate that cOMet has high precision and corrects 82.49% of insertion errors and 77.38% of deletion errors in Rmap data generated from the Escherichia coli K-12 reference genome. Out of the deletion errors corrected, 98.26% are true errors. Similarly, out of the insertion errors corrected, 82.19% are true errors. It also successfully scales to large genomes, improving the quality of 78% and 99% of the Rmaps in the plum and goat genomes, respectively. Last, we show the utility of error correction by demonstrating how it improves the assembly of Rmap data. Error corrected Rmap data results in an assembly that is more contiguous and covers a larger fraction of the genome.
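The insertion and deletion errors discussed in this abstract act on cut sites, which changes the fragment sizes that make up an Rmap. The Python sketch below shows this effect on a toy Rmap; it is a simplified error model written for illustration, not cOMet's correction algorithm, and the function names and fragment sizes are assumptions introduced here.

```python
def delete_cut_site(fragments, i):
    """Missed cut site between fragments i and i+1: the two fragments merge."""
    merged = fragments[i] + fragments[i + 1]
    return fragments[:i] + [merged] + fragments[i + 2:]


def insert_cut_site(fragments, i, offset):
    """Spurious cut site inside fragment i: it splits at the given offset."""
    if not 0 < offset < fragments[i]:
        raise ValueError("offset must fall strictly inside the fragment")
    return fragments[:i] + [offset, fragments[i] - offset] + fragments[i + 1:]


true_rmap = [10, 5, 8]                          # hypothetical fragment sizes in kbp
print(delete_cut_site(true_rmap, 0))            # deletion error
print(insert_cut_site(true_rmap, 2, 3))         # insertion error
```

Either error leaves the total molecule length unchanged but alters the sequence of fragment sizes, which is why a single wrong cut site can disrupt matching between otherwise overlapping Rmaps.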
Affiliation(s)
- Kingshuk Mukherjee
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville
- Darshan Washimkar
  - Department of Computer Science, Colorado State University, Fort Collins
- Martin D Muggli
  - Department of Computer Science, Colorado State University, Fort Collins
- Leena Salmela
  - Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki
- Christina Boucher
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville
11
Gaiero P, Šimková H, Vrána J, Santiñaque FF, López-Carro B, Folle GA, van de Belt J, Peters SA, Doležel J, de Jong H. Intact DNA purified from flow-sorted nuclei unlocks the potential of next-generation genome mapping and assembly in Solanum species. MethodsX 2018; 5:328-336. [PMID: 30046519 PMCID: PMC6058011 DOI: 10.1016/j.mex.2018.03.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/18/2017] [Accepted: 03/31/2018] [Indexed: 12/21/2022]
Abstract
Next-generation genome mapping through nanochannels (Bionano optical mapping) of plant genomes brings genome assemblies to the ‘nearly-finished’ level for reliable and detailed gene annotations and assessment of structural variations. Despite the recent progress in its development, researchers face the technical challenges of obtaining sufficient high molecular weight (HMW) nuclear DNA due to cell walls which are difficult to disrupt and to the presence of cytoplasmic polyphenols and polysaccharides that co-precipitate or are covalently bound to DNA and might cause oxidation and/or affect the access of nicking enzymes to DNA, preventing downstream applications. Here we describe important improvements for obtaining HMW DNA that we tested on Solanum crops and wild relatives. The methods that we further elaborated and refined focus on:
- Improving flexibility of using different tissues as source materials, like fast-growing root tips and young leaves from seedlings or in vitro plantlets.
- Obtaining nuclei suspensions through either lab homogenizers or by chopping.
- Increasing flow sorting efficiency using DAPI (4′,6-diamidino-2-phenylindole) and PI (propidium iodide) DNA stains, with different lasers (UV or 488 nm) and sorting platforms such as the FACSAria and FACSVantage flow sorters, thus making it appropriate for more laboratories working on plant genomics.

The obtained nuclei are embedded into agarose plugs for processing and isolating uncontaminated HMW DNA, which is a prerequisite for nanochannel-based next-generation optical mapping strategies.
Affiliation(s)
- Paola Gaiero
  - Faculty of Agronomy, University of the Republic, Montevideo, Uruguay
  - Laboratory of Genetics, Wageningen University & Research, Wageningen, The Netherlands
- Hana Šimková
  - Centre of Plant Structural and Functional Genomics, Institute of Experimental Botany, Olomouc, Czech Republic
- Jan Vrána
  - Centre of Plant Structural and Functional Genomics, Institute of Experimental Botany, Olomouc, Czech Republic
- Federico F. Santiñaque
  - Flow Cytometry and Cell Sorting Core, Instituto de Investigaciones Biológicas Clemente Estable (IIBCE), Montevideo, Uruguay
- Beatriz López-Carro
  - Flow Cytometry and Cell Sorting Core, Instituto de Investigaciones Biológicas Clemente Estable (IIBCE), Montevideo, Uruguay
- Gustavo A. Folle
  - Flow Cytometry and Cell Sorting Core, Instituto de Investigaciones Biológicas Clemente Estable (IIBCE), Montevideo, Uruguay
- José van de Belt
  - Laboratory of Genetics, Wageningen University & Research, Wageningen, The Netherlands
- Sander A. Peters
  - Applied Bioinformatics, Department of Bioscience, Wageningen University & Research, Wageningen, The Netherlands
- Jaroslav Doležel
  - Centre of Plant Structural and Functional Genomics, Institute of Experimental Botany, Olomouc, Czech Republic
- Hans de Jong
  - Laboratory of Genetics, Wageningen University & Research, Wageningen, The Netherlands
  - Corresponding author: Laboratory of Genetics, Wageningen University & Research, Droevendaalsesteeg 1, P.O. Box 16, 6708 PB, Wageningen, The Netherlands
|
12
|
You FM, Xiao J, Li P, Yao Z, Jia G, He L, Zhu T, Luo MC, Wang X, Deyholos MK, Cloutier S. Chromosome-scale pseudomolecules refined by optical, physical and genetic maps in flax. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2018; 95:371-384. [PMID: 29681136 DOI: 10.1111/tpj.13944] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 01/23/2018] [Revised: 03/19/2018] [Accepted: 03/22/2018] [Indexed: 05/19/2023]
Abstract
Genomes of varying sizes have been sequenced with next-generation sequencing platforms. However, most reference sequences include draft unordered scaffolds containing chimeras caused by mis-scaffolding. A BioNano genome (BNG) optical map was constructed to improve the previously sequenced flax genome (Linum usitatissimum L., 2n = 30, about 373 Mb), which consisted of 3852 scaffolds larger than 1 kb and totalling 300.6 Mb. The high-resolution BNG map of cv. CDC Bethune totalled 317 Mb and consisted of 251 BNG contigs with an N50 of 2.15 Mb. A total of 622 scaffolds (286.6 Mb, 94.9%) aligned to 211 BNG contigs (298.6 Mb, 94.2%). Of those, 99 scaffolds, diagnosed to contain assembly errors, were refined into 225 new scaffolds. Using the newly refined scaffold sequences and the validated bacterial artificial chromosome-based physical map of CDC Bethune, the 211 BNG contigs were scaffolded into 94 super-BNG contigs (N50 of 6.64 Mb) that were further assigned to the 15 flax chromosomes using the genetic map. The pseudomolecules total about 316 Mb, with individual chromosomes of 15.6 to 29.4 Mb, and cover 97% of the annotated genes. Evidence from the chromosome-scale pseudomolecules suggests that flax has undergone palaeopolyploidization and mesopolyploidization events, followed by rearrangements and deletions or fusion of chromosome arms from an ancient progenitor with a haploid chromosome number of eight.
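The N50 values quoted for the BNG contigs and super-BNG contigs follow the usual definition: the length L such that contigs of length at least L together account for at least half of the total assembled length. A minimal sketch of that calculation is given below; the function name and the toy lengths are illustrative assumptions, not data from the flax study.

```python
from typing import Iterable


def n50(lengths: Iterable[float]) -> float:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    half = sum(ordered) / 2
    running = 0.0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]  # only reached if all lengths are zero


# Made-up contig lengths in Mb, for illustration only.
toy_contigs_mb = [5.0, 3.0, 2.0, 1.0, 1.0]
toy_n50 = n50(toy_contigs_mb)  # total 12 Mb; 5 + 3 = 8 Mb passes the 6 Mb halfway mark
```

Because N50 depends only on the length distribution, merging 211 contigs into 94 larger ones raises it even when the total assembled length barely changes.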
Affiliation(s)
- Frank M You
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Jin Xiao
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
  - State Key Lab of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
- Pingchuan Li
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Zhen Yao
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Gaofeng Jia
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
  - Crop Development Centre, University of Saskatchewan, 51 Campus Drive, Saskatoon, SK, S7N 5A8, Canada
- Liqiang He
  - Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB, R6M 1Y5, Canada
- Tingting Zhu
  - Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Ming-Cheng Luo
  - Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Xiue Wang
  - State Key Lab of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
- Sylvie Cloutier
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, K1A 0C6, Canada
|
13
|
Maschmann A, Masters C, Davison M, Lallman J, Thompson D, Kounovsky-Shafer KL. Determining if DNA Stained with a Cyanine Dye Can Be Digested with Restriction Enzymes. J Vis Exp 2018. [PMID: 29443093 DOI: 10.3791/57141] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
Abstract
Visualization of DNA for fluorescence microscopy utilizes a variety of dyes such as cyanine dyes. These dyes are utilized due to their high affinity and sensitivity for DNA. In order to determine if the DNA molecules are full length after the completion of the experiment, a method is required to determine if the stained molecules are full length by digesting DNA with restriction enzymes. However, stained DNA may inhibit the enzymes, so a method is needed to determine which enzymes one could use for fluorochrome-stained DNA. In this method, DNA is stained with a cyanine dye overnight to allow the dye and DNA to equilibrate. Next, stained DNA is digested with a restriction enzyme, loaded into a gel and electrophoresed. The experimental DNA digest bands are compared to an in silico digest to determine the restriction enzyme activity. If there is the same number of bands as expected, then the reaction is complete. More bands than expected indicate partial digestion and fewer bands indicate incomplete digestion. The advantages of this method are its simplicity and that it uses only equipment that a scientist would need for a restriction enzyme assay and gel electrophoresis. A limitation of this method is that the enzymes available to most scientists are commercially available enzymes; however, any restriction enzymes could be used.
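The comparison against an in silico digest can be sketched as follows. This is a minimal illustration for a linear molecule, assuming hypothetical function names and a made-up toy sequence; the recognition site GAATTC with a cut one base into the site corresponds to EcoRI, and the band-count rule simply restates the abstract's criterion.

```python
from typing import List


def in_silico_digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths from cutting a linear DNA sequence at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site."""
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def classify_digest(observed_bands: int, expected_bands: int) -> str:
    """Apply the band-count rule stated in the abstract."""
    if observed_bands == expected_bands:
        return "complete"
    if observed_bands > expected_bands:
        return "partial"
    return "incomplete"


# Toy sequence with two GAATTC sites (illustrative only).
toy_sequence = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
expected_fragments = in_silico_digest(toy_sequence, "GAATTC", 1)

# Fragments of equal length co-migrate on a gel, so count distinct sizes.
expected_bands = len(set(expected_fragments))
```

Counting distinct fragment sizes rather than fragments is a simplification; fragments of similar but unequal length may also be unresolved on a real gel.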
Affiliation(s)
- Cody Masters
  - Department of Chemistry, University of Nebraska - Kearney
- Joshua Lallman
  - Department of Chemistry, University of Nebraska - Kearney
- Drew Thompson
  - Department of Chemistry, University of Nebraska - Kearney
|
14
|
Wang YG, Fu FL, Yu HQ, Hu T, Zhang YY, Tao Y, Zhu JK, Zhao Y, Li WC. Interaction network of core ABA signaling components in maize. PLANT MOLECULAR BIOLOGY 2018; 96:245-263. [PMID: 29344831 DOI: 10.1007/s11103-017-0692-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/14/2017] [Accepted: 12/06/2017] [Indexed: 05/08/2023]
Abstract
We defined a comprehensive core ABA signaling network in monocot maize, including the gene expression, subcellular localization and interaction network of ZmPYLs, ZmPP2Cs, ZmSnRK2s and the putative substrates. The phytohormone abscisic acid (ABA) plays an important role in plant developmental processes and abiotic stress responses. In Arabidopsis, ABA is sensed by the PYL ABA receptors, which leads to binding of the PP2C protein phosphatase and activation of the SnRK2 protein kinases. These components functioning diversely and redundantly in ABA signaling are little known in maize. Using Arabidopsis pyl112458 and snrk2.2/3/6 mutants, we identified several ABA-responsive ZmPYLs and ZmSnRK2s, and also ZmPP2Cs. We showed the gene expression, subcellular localization and interaction network of ZmPYLs, ZmPP2Cs, and ZmSnRK2s, and the isolation of putative ZmSnRK2 substrates by mass spectrometry in monocot maize. We found that the ABA dependency of PYL-PP2C interactions is contingent on the identity of the PP2Cs. Among 238 candidate substrates for ABA-activated protein kinases, 69 are putative ZmSnRK2 substrates. Besides homologs of previously reported putative AtSnRK2 substrates, 23 phosphoproteins have not been discovered in the dicot Arabidopsis. Thus, we have defined a comprehensive core ABA signaling network in monocot maize and shed new light on ABA signaling.
Affiliation(s)
- Ying-Ge Wang
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Feng-Ling Fu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Hao-Qiang Yu
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Tao Hu
  - Shanghai Center for Plant Stress Biology, and CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
- Yuan-Yuan Zhang
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Yi Tao
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Jian-Kang Zhu
  - Shanghai Center for Plant Stress Biology, and CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
- Yang Zhao
  - Shanghai Center for Plant Stress Biology, and CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
- Wan-Chen Li
  - Maize Research Institute, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
|
15
|
Abstract
In optical DNA mapping technologies, sequence-specific intensity variations (DNA barcodes) are produced along stretched and stained DNA molecules. These “fingerprints” of the underlying DNA sequence have a resolution on the order of one kilobase pair, and the DNA molecules are stretched by surface adsorption or in nano-channel setups. A post-processing challenge for nano-channel based methods, due to local and global random movement of the DNA molecule during imaging, is how to align different time frames in order to produce reproducible time-averaged DNA barcodes. The current solutions to this challenge are computationally rather slow. With high-throughput applications in mind, we here introduce a parameter-free method for filtering a single time frame noisy barcode (snap-shot optical map), measured in a fraction of a second. By using only a single time frame barcode we circumvent the need for post-processing alignment. We demonstrate that our method is successful at providing filtered barcodes which are less noisy and more similar to time-averaged barcodes. The method is based on the application of a low-pass filter on a single noisy barcode using the width of the Point Spread Function of the system as a unique, and known, filtering parameter. We find that after applying our method, the Pearson correlation coefficient (a real number in the range from -1 to 1) between the single time-frame barcode and the time average of the aligned kymograph increases significantly, roughly by 0.2 on average. By comparing to a database of more than 3000 theoretical plasmid barcodes we show that the ability to identify plasmids is improved by filtering single time-frame barcodes compared to the unfiltered analogues. Since snap-shot experiments and computational time using our method both take less than a second, this study opens the way for high-throughput optical DNA mapping with improved reproducibility.
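The filtering idea can be sketched on synthetic data. The abstract does not state the exact filter, so the example below assumes a Gaussian kernel whose standard deviation equals the PSF width in pixels; the function names, the 3-pixel width, and the simulated barcodes are all illustrative assumptions rather than the authors' implementation.

```python
import math
import random
from typing import List, Sequence


def gaussian_low_pass(barcode: Sequence[float], psf_sigma_px: float) -> List[float]:
    """Smooth a 1-D intensity profile with a normalized Gaussian kernel
    (assumed stand-in for the paper's low-pass filter)."""
    radius = math.ceil(4 * psf_sigma_px)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / psf_sigma_px) ** 2) for k in offsets]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(barcode)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)  # clamp indices at the profile edges
            acc += w * barcode[j]
        smoothed.append(acc)
    return smoothed


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


random.seed(0)
# Synthetic stand-in for a time-averaged barcode: smooth on the PSF scale.
reference = gaussian_low_pass([random.gauss(0, 1) for _ in range(400)], 3.0)
spread = math.sqrt(sum(v * v for v in reference) / len(reference))
# Synthetic single-frame barcode: the reference plus white noise of similar size.
noisy = [v + random.gauss(0, spread) for v in reference]
filtered = gaussian_low_pass(noisy, 3.0)

r_before = pearson(noisy, reference)
r_after = pearson(filtered, reference)
```

In this toy setting the filtered barcode correlates more strongly with the reference than the raw one does, which mirrors the direction of the effect reported in the abstract; the size of the improvement here depends entirely on the simulated noise level.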
|
16
|
Yuan Y, Bayer PE, Batley J, Edwards D. Improvements in Genomic Technologies: Application to Crop Genomics. Trends Biotechnol 2017; 35:547-558. [DOI: 10.1016/j.tibtech.2017.02.009] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 11/16/2016] [Revised: 02/10/2017] [Accepted: 02/14/2017] [Indexed: 12/13/2022]
|
17
|
Whole-Genome Restriction Mapping by "Subhaploid"-Based RAD Sequencing: An Efficient and Flexible Approach for Physical Mapping and Genome Scaffolding. Genetics 2017; 206:1237-1250. [PMID: 28468906 PMCID: PMC5500127 DOI: 10.1534/genetics.117.200303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/18/2017] [Accepted: 04/17/2017] [Indexed: 11/18/2022]
Abstract
Assembly of complex genomes using short reads remains a major challenge, which usually yields highly fragmented assemblies. Generation of ultradense linkage maps is promising for anchoring such assemblies, but traditional linkage mapping methods are hindered by the infrequency and unevenness of meiotic recombination that limit attainable map resolution. Here we develop a sequencing-based "in vitro" linkage mapping approach (called RadMap), where chromosome breakage and segregation are realized by generating hundreds of "subhaploid" fosmid/bacterial-artificial-chromosome clone pools, and by restriction site-associated DNA sequencing of these clone pools to produce an ultradense whole-genome restriction map to facilitate genome scaffolding. A bootstrap-based minimum spanning tree algorithm is developed for grouping and ordering of genome-wide markers and is implemented in a user-friendly, integrated software package (AMMO). We perform extensive analyses to validate the power and accuracy of our approach in the model plant Arabidopsis thaliana and human. We also demonstrate the utility of RadMap for enhancing the contiguity of a variety of whole-genome shotgun assemblies generated using either short Illumina reads (300 bp) or long PacBio reads (6-14 kb), with up to 15-fold improvement of N50 (∼816 kb-3.7 Mb) and high scaffolding accuracy (98.1-98.5%). RadMap outperforms BioNano and Hi-C when input assembly is highly fragmented (contig N50 = 54 kb). RadMap can capture wide-range contiguity information and provide an efficient and flexible tool for high-resolution physical mapping and scaffolding of highly fragmented assemblies.
|
18
|
Pan Y, Wang X, Liu L, Wang H, Luo M. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data. PLoS One 2016; 11:e0161583. [PMID: 27611682 PMCID: PMC5017645 DOI: 10.1371/journal.pone.0161583] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/01/2016] [Accepted: 08/08/2016] [Indexed: 11/19/2022]
Abstract
A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, a physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. A model of the method was created and its parameters were examined by simulations using the Arabidopsis genome sequence. In the simulations, when a ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb of sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb of sequence was integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences.
Affiliation(s)
- Yonglong Pan
  - National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Xiaoming Wang
  - National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Lin Liu
  - National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Hao Wang
  - National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
- Meizhong Luo
  - National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
|
19
|
Chaney L, Sharp AR, Evans CR, Udall JA. Genome Mapping in Plant Comparative Genomics. TRENDS IN PLANT SCIENCE 2016; 21:770-780. [PMID: 27289181 DOI: 10.1016/j.tplants.2016.05.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/04/2016] [Revised: 04/27/2016] [Accepted: 05/12/2016] [Indexed: 05/10/2023]
Abstract
Genome mapping produces fingerprints of DNA sequences to construct a physical map of the whole genome. It provides contiguous, long-range information that complements and, in some cases, replaces sequencing data. Recent advances in genome-mapping technology will better allow researchers to detect large (>1kbp) structural variations between plant genomes. Some molecular and informatics complications need to be overcome for this novel technology to achieve its full utility. This technology will be useful for understanding phenotype responses due to DNA rearrangements and will yield insights into genome evolution, particularly in polyploids. In this review, we outline recent advances in genome-mapping technology, including the processes required for data collection and analysis, and applications in plant comparative genomics.
Affiliation(s)
- Lindsay Chaney
  - Plant and Wildlife Sciences Department, Brigham Young University, Provo, UT 84602, USA
- Aaron R Sharp
  - Plant and Wildlife Sciences Department, Brigham Young University, Provo, UT 84602, USA
- Carrie R Evans
  - Plant and Wildlife Sciences Department, Brigham Young University, Provo, UT 84602, USA
- Joshua A Udall
  - Plant and Wildlife Sciences Department, Brigham Young University, Provo, UT 84602, USA
|
20
|
Handrick V, Robert CAM, Ahern KR, Zhou S, Machado RAR, Maag D, Glauser G, Fernandez-Penny FE, Chandran JN, Rodgers-Melnik E, Schneider B, Buckler ES, Boland W, Gershenzon J, Jander G, Erb M, Köllner TG. Biosynthesis of 8-O-Methylated Benzoxazinoid Defense Compounds in Maize. THE PLANT CELL 2016; 28:1682-700. [PMID: 27317675 PMCID: PMC4981128 DOI: 10.1105/tpc.16.00065] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 02/02/2016] [Accepted: 06/14/2016] [Indexed: 05/04/2023]
Abstract
Benzoxazinoids are important defense compounds in grasses. Here, we investigated the biosynthesis and biological roles of the 8-O-methylated benzoxazinoids, DIM2BOA-Glc and HDM2BOA-Glc. Using quantitative trait locus mapping and heterologous expression, we identified a 2-oxoglutarate-dependent dioxygenase (BX13) that catalyzes the conversion of DIMBOA-Glc into a new benzoxazinoid intermediate (TRIMBOA-Glc) by an uncommon reaction involving a hydroxylation and a likely ortho-rearrangement of a methoxy group. TRIMBOA-Glc is then converted to DIM2BOA-Glc by a previously described O-methyltransferase BX7. Furthermore, we identified an O-methyltransferase (BX14) that converts DIM2BOA-Glc to HDM2BOA-Glc. The role of these enzymes in vivo was demonstrated by characterizing recombinant inbred lines, including Oh43, which has a point mutation in the start codon of Bx13 and lacks both DIM2BOA-Glc and HDM2BOA-Glc, and Il14H, which has an inactive Bx14 allele and lacks HDM2BOA-Glc in leaves. Experiments with near-isogenic maize lines derived from crosses between B73 and Oh43 revealed that the absence of DIM2BOA-Glc and HDM2BOA-Glc does not alter the constitutive accumulation or deglucosylation of other benzoxazinoids. The growth of various chewing herbivores was not significantly affected by the absence of BX13-dependent metabolites, while aphid performance increased, suggesting that DIM2BOA-Glc and/or HDM2BOA-Glc provide specific protection against phloem feeding insects.
Affiliation(s)
- Kevin R Ahern
  - Boyce Thompson Institute for Plant Research, Ithaca, New York 14853
- Shaoqun Zhou
  - Boyce Thompson Institute for Plant Research, Ithaca, New York 14853
- Daniel Maag
  - Institute of Biology, University of Neuchatel, 2009 Neuchatel, Switzerland
- Gaetan Glauser
  - Institute of Biology, University of Neuchatel, 2009 Neuchatel, Switzerland
- Jima N Chandran
  - Max Planck Institute for Chemical Ecology, 07745 Jena, Germany
- Eli Rodgers-Melnik
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853
- Bernd Schneider
  - Max Planck Institute for Chemical Ecology, 07745 Jena, Germany
- Edward S Buckler
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853
- Wilhelm Boland
  - Max Planck Institute for Chemical Ecology, 07745 Jena, Germany
- Georg Jander
  - Boyce Thompson Institute for Plant Research, Ithaca, New York 14853
- Matthias Erb
  - Institute of Plant Sciences, University of Bern, 3013 Bern, Switzerland
|
21
|
Staňková H, Hastie AR, Chan S, Vrána J, Tulpová Z, Kubaláková M, Visendi P, Hayashi S, Luo M, Batley J, Edwards D, Doležel J, Šimková H. BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes. PLANT BIOTECHNOLOGY JOURNAL 2016; 14:1523-31. [PMID: 26801360 PMCID: PMC5066648 DOI: 10.1111/pbi.12513] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 09/04/2015] [Revised: 11/12/2015] [Accepted: 11/13/2015] [Indexed: 05/09/2023]
Abstract
The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules.
Affiliation(s)
- Helena Staňková
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Saki Chan
  - BioNano Genomics, San Diego, CA, USA
- Jan Vrána
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Zuzana Tulpová
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Marie Kubaláková
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Paul Visendi
  - Australian Centre for Plant Functional Genomics, University of Queensland, Brisbane, QLD, Australia
- Satomi Hayashi
  - School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
- Mingcheng Luo
  - Department of Plant Sciences, University of California, Davis, CA, USA
- Jacqueline Batley
  - School of Agriculture and Food Sciences, University of Queensland, Brisbane, QLD, Australia
  - School of Plant Biology, University of Western Australia, Crawley, WA, Australia
- David Edwards
  - School of Plant Biology, University of Western Australia, Crawley, WA, Australia
- Jaroslav Doležel
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Hana Šimková
  - Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
|
22
|
Friedrich SM, Zec HC, Wang TH. Analysis of single nucleic acid molecules in micro- and nano-fluidics. LAB ON A CHIP 2016; 16:790-811. [PMID: 26818700 PMCID: PMC4767527 DOI: 10.1039/c5lc01294e] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 05/24/2023]
Abstract
Nucleic acid analysis has enhanced our understanding of biological processes and disease progression, elucidated the association of genetic variants and disease, and led to the design and implementation of new treatment strategies. These diverse applications require analysis of a variety of characteristics of nucleic acid molecules: size or length, detection or quantification of specific sequences, mapping of the general sequence structure, full sequence identification, analysis of epigenetic modifications, and observation of interactions between nucleic acids and other biomolecules. Strategies that can detect rare or transient species, characterize population distributions, and analyze small sample volumes enable the collection of richer data from biosamples. Platforms that integrate micro- and nano-fluidic operations with high sensitivity single molecule detection facilitate manipulation and detection of individual nucleic acid molecules. In this review, we will highlight important milestones and recent advances in single molecule nucleic acid analysis in micro- and nano-fluidic platforms. We focus on assessment modalities for single nucleic acid molecules and highlight the role of micro- and nano-structures and fluidic manipulation. We will also briefly discuss future directions and the current limitations and obstacles impeding even faster progress toward these goals.
Affiliation(s)
- Sarah M Friedrich
  - Biomedical Engineering Department, Johns Hopkins University, Baltimore, MD 21218, USA
- Helena C Zec
  - Mechanical Engineering Department, Johns Hopkins University, Baltimore, MD 21218, USA
- Tza-Huei Wang
  - Biomedical Engineering Department, Johns Hopkins University, Baltimore, MD 21218, USA
  - Mechanical Engineering Department, Johns Hopkins University, Baltimore, MD 21218, USA
|
23
|
Francki MG, Hayton S, Gummer JPA, Rawlinson C, Trengove RD. Metabolomic profiling and genomic analysis of wheat aneuploid lines to identify genes controlling biochemical pathways in mature grain. PLANT BIOTECHNOLOGY JOURNAL 2016; 14:649-60. [PMID: 26032167 DOI: 10.1111/pbi.12410] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/05/2015] [Revised: 05/01/2015] [Accepted: 05/05/2015] [Indexed: 05/11/2023]
Abstract
Metabolomics is becoming an increasingly important tool in plant genomics to decipher the function of genes controlling biochemical pathways responsible for trait variation. Although theoretical models can integrate genes and metabolites for trait variation, biological networks require validation using appropriate experimental genetic systems. In this study, we applied an untargeted metabolite analysis to mature grain of wheat homoeologous group 3 ditelosomic lines, selected compounds that showed significant variation between wheat line Chinese Spring and at least one ditelosomic line, tracked the genes encoding enzymes of their biochemical pathway using the wheat genome survey sequence and determined the genetic components underlying metabolite variation. A total of 412 analytes were resolved in the wheat grain metabolome, and principal component analysis indicated significant differences in metabolite profiles between Chinese Spring and each ditelosomic line. The grain metabolome identified 55 compounds positively matched against a mass spectral library, the majority of which showed significant differences between Chinese Spring and at least one ditelosomic line. Trehalose and branched-chain amino acids were selected for detailed investigation, and it was expected that if genes encoding enzymes directly related to their biochemical pathways were located on homoeologous group 3 chromosomes, then corresponding ditelosomic lines would have a significant reduction in metabolites compared with Chinese Spring. Although a proportion showed a reduction, some lines showed significant increases in metabolites, indicating that genes directly and indirectly involved in biosynthetic pathways likely regulate the metabolome. Therefore, this study demonstrated that wheat aneuploid lines are a suitable experimental genetic system to validate metabolomics-genomics networks.
Affiliation(s)
- Michael G Francki
- Department of Agriculture and Food Western Australia, Grains Industry, South Perth, WA, Australia
- State Agricultural Biotechnology Centre, Murdoch University, Murdoch, WA, Australia
| | - Sarah Hayton
- Separation Science and Metabolomics Laboratory, Research and Development, Murdoch University, Murdoch, WA, Australia
| | - Joel P A Gummer
- Separation Science and Metabolomics Laboratory, Research and Development, Murdoch University, Murdoch, WA, Australia
- School of Veterinary and Life Sciences, Murdoch University, Murdoch, WA, Australia
- Metabolomics Australia, Murdoch University Node, Murdoch, WA, Australia
| | - Catherine Rawlinson
- Separation Science and Metabolomics Laboratory, Research and Development, Murdoch University, Murdoch, WA, Australia
- Metabolomics Australia, Murdoch University Node, Murdoch, WA, Australia
| | - Robert D Trengove
- Separation Science and Metabolomics Laboratory, Research and Development, Murdoch University, Murdoch, WA, Australia
- School of Veterinary and Life Sciences, Murdoch University, Murdoch, WA, Australia
- Metabolomics Australia, Murdoch University Node, Murdoch, WA, Australia
|
24
|
Yuan B, Liu P, Gupta A, Beck CR, Tejomurtula A, Campbell IM, Gambin T, Simmons AD, Withers MA, Harris RA, Rogers J, Schwartz DC, Lupski JR. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates. PLoS Genet 2015; 11:e1005686. [PMID: 26641089 PMCID: PMC4671654 DOI: 10.1371/journal.pgen.1005686] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2015] [Accepted: 10/29/2015] [Indexed: 11/30/2022] Open
Abstract
Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. Genomic instability due to the intrinsic sequence architecture of the genome, such as low copy repeats (LCRs), is a major contributor to de novo mutations that can occur in the process of human genome evolution. 
LCRs can mediate genomic rearrangements associated with genomic disorders by acting as substrates for nonallelic homologous recombination. Juvenile-onset nephronophthisis 1 is the most frequent genetic cause of renal failure in children. An LCR-mediated, homozygous common recurrent deletion encompassing NPHP1 is found in the majority of affected subjects, while heterozygous deletion representing the nephronophthisis 1 recessive carrier state is frequently observed amongst world populations. Interestingly, the human NPHP1 locus is located proximal to the head-to-head fusion site of two ancestral chromosomes that occurred in the great apes, which resulted in a reduction of chromosome number from 48 in nonhuman primates to the current 46 in humans. In this study, we characterized and provided evidence for the diverse genomic architecture at the NPHP1 locus and potential structural variant haplotypes in the human population. Furthermore, our analyses of primate genomes shed light on the massive changes of genomic architecture at the human NPHP1 locus and delineated a model for the emergence of the LCRs during primate evolution.
Affiliation(s)
- Bo Yuan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Pengfei Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Aditya Gupta
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics and The UW-Biotechnology Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Christine R. Beck
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Anusha Tejomurtula
- Graduate Program in Diagnostic Genetics, School of Health Professions, University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| | - Ian M. Campbell
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Tomasz Gambin
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Alexandra D. Simmons
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Marjorie A. Withers
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - R. Alan Harris
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - Jeffrey Rogers
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics and The UW-Biotechnology Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - James R. Lupski
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
- Texas Children’s Hospital, Houston, Texas, United States of America
|
25
|
Mendelowitz LM, Schwartz DC, Pop M. Maligner: a fast ordered restriction map aligner. Bioinformatics 2015; 32:1016-22. [PMID: 26637292 DOI: 10.1093/bioinformatics/btv711] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2015] [Accepted: 12/01/2015] [Indexed: 12/28/2022] Open
Abstract
MOTIVATION The Optical Mapping System discovers structural variants and potentiates sequence assembly of genomes via scaffolding and comparisons that globally validate or correct sequence assemblies. Despite its utility, there are few publicly available tools for aligning optical mapping datasets. RESULTS Here we present software, named 'Maligner', for the alignment of both single molecule restriction maps (Rmaps) and in silico restriction maps of sequence contigs to a reference. Maligner provides two modes of alignment: an efficient, sensitive dynamic programming implementation that scales to large eukaryotic genomes, and a faster index-based implementation for finding alignments with unmatched sites in the reference but not the query. We compare our software to other publicly available tools on Rmap datasets and show that Maligner finds more correct alignments in comparable runtime. Lastly, we introduce the M-Score statistic for normalizing alignment scores across restriction maps and demonstrate its utility for selecting high quality alignments. AVAILABILITY AND IMPLEMENTATION The Maligner software is written in C++ and is available at https://github.com/LeeMendelowitz/maligner under the GNU General Public License. CONTACT mpop@umiacs.umd.edu.
Affiliation(s)
- Lee M Mendelowitz
- Center for Bioinformatics and Computational Biology, Applied Math & Statistics, and Scientific Computation
| | - David C Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics and the UW-Biotechnology Center, University of Wisconsin-Madison, WI 53706, USA
| | - Mihai Pop
- Center for Bioinformatics and Computational Biology, Applied Math & Statistics, and Scientific Computation, Department of Computer Science, University of Maryland, College Park, MD 20742, USA
|
26
|
Olsen RA, Bunikis I, Tiukova I, Holmberg K, Lötstedt B, Pettersson OV, Passoth V, Käller M, Vezzi F. De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping. Gigascience 2015; 4:56. [PMID: 26617983 PMCID: PMC4661999 DOI: 10.1186/s13742-015-0094-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2015] [Accepted: 11/04/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. METHODS In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. RESULTS We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5 % of the optical mapped genome not covered by the Illumina data.
Affiliation(s)
- Remi-Andre Olsen
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Ignas Bunikis
- Uppsala Genome Center, NGI/SciLifeLab, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37 Uppsala, Sweden
| | - Ievgeniia Tiukova
- Department of Microbiology, Swedish University of Agricultural Sciences, Box 7025, SE-75007 Uppsala, Sweden
| | - Kicki Holmberg
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Britta Lötstedt
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Olga Vinnere Pettersson
- Uppsala Genome Center, NGI/SciLifeLab, Department of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37 Uppsala, Sweden
| | - Volkmar Passoth
- Department of Microbiology, Swedish University of Agricultural Sciences, Box 7025, SE-75007 Uppsala, Sweden
| | - Max Käller
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
| | - Francesco Vezzi
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 171 21 Solna, Sweden
|
27
|
Abstract
Optical Mapping is an established single-molecule, whole-genome analysis system, which has been used to gain a comprehensive understanding of genomic structure and to study structural variation of complex genomes. A critical component of the Optical Mapping system is the image processing module, which extracts single molecule restriction maps from image datasets of immobilized, restriction digested and fluorescently stained large DNA molecules. In this review, we describe robust and efficient image processing techniques to process these massive datasets and extract accurate restriction maps in the presence of noise, ambiguity and confounding artifacts. We also highlight a few applications of the Optical Mapping system.
Affiliation(s)
- Prabu Ravindran
- Laboratory of Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics and Biotechnology Center, University of Wisconsin, 425 Henry Mall, Madison, USA
| | - Aditya Gupta
- Laboratory of Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics and Biotechnology Center, University of Wisconsin, 425 Henry Mall, Madison, USA
|
28
|
Muggli MD, Puglisi SJ, Ronen R, Boucher C. Misassembly detection using paired-end sequence reads and optical mapping data. Bioinformatics 2015; 31:i80-8. [PMID: 26072512 PMCID: PMC4542784 DOI: 10.1093/bioinformatics/btv262] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Motivation: A crucial problem in genome assembly is the discovery and correction of misassembly errors in draft genomes. We develop a method called misSEQuel that enhances the quality of draft genomes by identifying misassembly errors and their breakpoints using paired-end sequence reads and optical mapping data. Our method also fulfills the critical need for open source computational methods for analyzing optical mapping data. We apply our method to various assemblies of the loblolly pine, Francisella tularensis, rice and budgerigar genomes. We generated and used simulated optical mapping data for loblolly pine and F. tularensis and used real optical mapping data for rice and budgerigar. Results: Our results demonstrate that we detect more than 54% of extensively misassembled contigs and more than 60% of locally misassembled contigs in assemblies of F. tularensis and between 31% and 100% of extensively misassembled contigs and between 57% and 73% of locally misassembled contigs in assemblies of loblolly pine. Using the real optical mapping data, we correctly identified 75% of extensively misassembled contigs and 100% of locally misassembled contigs in rice, and 77% of extensively misassembled contigs and 80% of locally misassembled contigs in budgerigar. Availability and implementation: misSEQuel can be used as a post-processing step in combination with any genome assembler and is freely available at http://www.cs.colostate.edu/seq/. Contact: muggli@cs.colostate.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin D Muggli
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Simon J Puglisi
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Roy Ronen
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Christina Boucher
- Department of Computer Science, Colorado State University, Fort Collins, CO 80526, USA, Department of Computer Science, University of Helsinki, Finland and Bioinformatics Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
|
29
|
Zhou S, Goldstein S, Place M, Bechner M, Patino D, Potamousis K, Ravindran P, Pape L, Rincon G, Hernandez-Ortiz J, Medrano JF, Schwartz DC. A clone-free, single molecule map of the domestic cow (Bos taurus) genome. BMC Genomics 2015; 16:644. [PMID: 26314885 PMCID: PMC4551733 DOI: 10.1186/s12864-015-1823-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2015] [Accepted: 08/07/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource, BtOM1.0, to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. RESULTS The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost twice as many discordances as UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). CONCLUSION Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.
Affiliation(s)
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Steve Goldstein
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Michael Place
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Michael Bechner
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Diego Patino
- Departamento de Materiales, Facultad de Minas, Universidad Nacional de Colombia, Sede Medellin, Calle 75 # 79A-51, Bloque M17, Medellin, Colombia, SA.
| | - Konstantinos Potamousis
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Prabu Ravindran
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Louise Pape
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
| | - Gonzalo Rincon
- Department of Animal Science, University of California-Davis, Davis, CA, 95616, USA.
| | - Juan Hernandez-Ortiz
- Departamento de Materiales, Facultad de Minas, Universidad Nacional de Colombia, Sede Medellin, Calle 75 # 79A-51, Bloque M17, Medellin, Colombia, SA.
| | - Juan F Medrano
- Department of Animal Science, University of California-Davis, Davis, CA, 95616, USA.
| | - David C Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, and the UW Biotechnology Center, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA.
|
30
|
Single-Molecule Real-Time Sequencing Combined with Optical Mapping Yields Completely Finished Fungal Genome. mBio 2015; 6:mBio.00936-15. [PMID: 26286689 PMCID: PMC4542186 DOI: 10.1128/mbio.00936-15] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. 
Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology.
|
31
|
Faino L, Seidl MF, Datema E, van den Berg GCM, Janssen A, Wittenberg AHJ, Thomma BPHJ. Single-Molecule Real-Time Sequencing Combined with Optical Mapping Yields Completely Finished Fungal Genome. mBio 2015. [PMID: 26286689 DOI: 10.1128/mbio.00936-15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/30/2023] Open
Abstract
UNLABELLED Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism's biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. IMPORTANCE Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. 
Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism's biology.
Affiliation(s)
- Luigi Faino
- Laboratory of Phytopathology, Wageningen University, Wageningen, The Netherlands
| | - Michael F Seidl
- Laboratory of Phytopathology, Wageningen University, Wageningen, The Netherlands
| | - Bart P H J Thomma
- Laboratory of Phytopathology, Wageningen University, Wageningen, The Netherlands
|
32
|
Zhu Y, Xu J, Sun C, Zhou S, Xu H, Nelson DR, Qian J, Song J, Luo H, Xiang L, Li Y, Xu Z, Ji A, Wang L, Lu S, Hayward A, Sun W, Li X, Schwartz DC, Wang Y, Chen S. Chromosome-level genome map provides insights into diverse defense mechanisms in the medicinal fungus Ganoderma sinense. Sci Rep 2015; 5:11087. [PMID: 26046933 PMCID: PMC4457147 DOI: 10.1038/srep11087] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2015] [Accepted: 05/14/2015] [Indexed: 11/30/2022] Open
Abstract
Fungi have evolved powerful genomic and chemical defense systems to protect themselves against genetic destabilization and other organisms. However, the precise molecular basis involved in fungal defense remains largely unknown in Basidiomycetes. Here the complete genome sequence, as well as DNA methylation patterns and small RNA transcriptomes, was analyzed to provide a holistic overview of secondary metabolism and defense processes in the model medicinal fungus, Ganoderma sinense. We reported the 48.96 Mb genome sequence of G. sinense, consisting of 12 chromosomes and encoding 15,688 genes. More than thirty gene clusters involved in the biosynthesis of secondary metabolites, as well as a large array of genes responsible for their transport and regulation were highlighted. In addition, components of genome defense mechanisms, namely repeat-induced point mutation (RIP), DNA methylation and small RNA-mediated gene silencing, were revealed in G. sinense. Systematic bioinformatic investigation of the genome and methylome suggested that RIP and DNA methylation combinatorially maintain G. sinense genome stability by inactivating invasive genetic material and transposable elements. The elucidation of the G. sinense genome and epigenome provides an unparalleled opportunity to advance our understanding of secondary metabolism and fungal defense mechanisms.
Affiliation(s)
- Yingjie Zhu
- [1] Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China [2] Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Jiang Xu
- [1] Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China [2] Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Chao Sun
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Haibin Xu
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - David R Nelson
- Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA
| | - Jun Qian
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Jingyuan Song
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Hongmei Luo
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Li Xiang
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Ying Li
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Zhichao Xu
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Aijia Ji
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Lizhi Wang
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Shanfa Lu
- Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
| | - Alice Hayward
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, Australia, 4072
| | - Wei Sun
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Xiwen Li
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - David C Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
| | - Yitao Wang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, 999078, China
| | - Shilin Chen
- 1] Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China [2] Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences &Peking Union Medical College, Beijing 100193, China
| |
Collapse
|
33
|
Abstract
The current genomic revolution was made possible by joint advances in genome sequencing technologies and computational approaches for analyzing sequence data. The close interaction between biologists and computational scientists is perhaps most apparent in the development of approaches for sequencing entire genomes, a feat that would not be possible without sophisticated computational tools called genome assemblers (short for genome sequence assemblers). Here, we survey the key developments in algorithms for assembling genome sequences since the development of the first DNA sequencing methods more than 35 years ago.
Affiliation(s)
- Jared T Simpson
- Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
|
34
|
Abstract
Optical mapping has been widely used to improve de novo plant genome assemblies, including rice, maize, Medicago, Amborella, tomato and wheat, with more genomes in the pipeline. Optical mapping provides long-range information of the genome and can more easily identify large structural variations. The ability of optical mapping to assay long single DNA molecules nicely complements short-read sequencing which is more suitable for the identification of small and short-range variants. Direct use of optical mapping to study population-level genetic diversity is currently limited to microbial strain typing and human diversity studies. Nonetheless, optical mapping shows great promise in the study of plant trait development, domestication and polyploid evolution. Here we review the current applications and future prospects of optical mapping in the field of plant comparative genomics.
Affiliation(s)
- Haibao Tang
- Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian, People's Republic of China; School of Plant Sciences, iPlant Collaborative, University of Arizona, Tucson, AZ 85721, USA
- Eric Lyons
- School of Plant Sciences, iPlant Collaborative, University of Arizona, Tucson, AZ 85721, USA
|
36
|
Tang H, Zhang X, Miao C, Zhang J, Ming R, Schnable JC, Schnable PS, Lyons E, Lu J. ALLMAPS: robust scaffold ordering based on multiple maps. Genome Biol 2015; 16:3. [PMID: 25583564 PMCID: PMC4305236 DOI: 10.1186/s13059-014-0573-1] [Citation(s) in RCA: 240] [Impact Index Per Article: 26.7] [Received: 06/15/2014] [Accepted: 12/15/2014] [Indexed: 11/29/2022] Open
Abstract
The ordering and orientation of genomic scaffolds to reconstruct chromosomes is an essential step during de novo genome assembly. Because this process utilizes various mapping techniques that each provides an independent line of evidence, a combination of multiple maps can improve the accuracy of the resulting chromosomal assemblies. We present ALLMAPS, a method capable of computing a scaffold ordering that maximizes colinearity across a collection of maps. ALLMAPS is robust against common mapping errors, and generates sequences that are maximally concordant with the input maps. ALLMAPS is a useful tool in building high-quality genome assemblies. ALLMAPS is available at: https://github.com/tanghaibao/jcvi/wiki/ALLMAPS.
Affiliation(s)
- Haibao Tang
- Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian Province, China; School of Plant Sciences, iPlant Collaborative, University of Arizona, Tucson, AZ, 85721, USA; Data2Bio LLC, 2079 Roy J. Carver Co-Lab, Ames, Iowa, 50011, USA.
- Xingtan Zhang
- J. Craig Venter Institute, 9704 Medical Center Dr, Rockville, MD, 20850, USA.
- Chenyong Miao
- Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian Province, China.
- Jisen Zhang
- Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian Province, China.
- Ray Ming
- Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian Province, China.
- James C Schnable
- Data2Bio LLC, 2079 Roy J. Carver Co-Lab, Ames, Iowa, 50011, USA; Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE, 68588, USA.
- Patrick S Schnable
- Data2Bio LLC, 2079 Roy J. Carver Co-Lab, Ames, Iowa, 50011, USA; Department of Agronomy, Iowa State University, Ames, IA, 50011, USA.
- Eric Lyons
- School of Plant Sciences, iPlant Collaborative, University of Arizona, Tucson, AZ, 85721, USA.
- Jianguo Lu
- Heilongjiang River Fisheries Research Institute, Harbin, 150070, China.
|
37
|
Abstract
Maize occupies dual roles as both (a) one of the big-three grain species (along with rice and wheat) responsible for providing more than half of the calories consumed around the world, and (b) a model system for plant genetics and cytogenetics dating back to the origin of the field of genetics in the early twentieth century. The long history of genetic investigation in this species combined with modern genomic and quantitative genetic data has provided particular insight into the characteristics of genes linked to phenotypes and how these genes differ from many other sequences in plant genomes that are not easily distinguishable based on molecular data alone. These recent results suggest that the number of genes in plants that make significant contributions to phenotype may be lower than the number of genes defined by current molecular criteria, and also indicate that syntenic conservation has been underemphasized as a marker for gene function.
Affiliation(s)
- James C Schnable
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, Nebraska 68583
|
38
|
Mendelowitz L, Pop M. Computational methods for optical mapping. Gigascience 2014; 3:33. [PMID: 25671093 PMCID: PMC4323141 DOI: 10.1186/2047-217x-3-33] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/22/2014] [Accepted: 12/02/2014] [Indexed: 11/10/2022] Open
Abstract
Optical mapping and newer genome mapping technologies based on nicking enzymes provide low resolution but long-range genomic information. The optical mapping technique has been successfully used for assessing the quality of genome assemblies and for detecting large-scale structural variants and rearrangements that cannot be detected using current paired end sequencing protocols. Here, we review several algorithms and methods for building consensus optical maps and aligning restriction patterns to a reference map, as well as methods for using optical maps with sequence assemblies.
Affiliation(s)
- Lee Mendelowitz
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA; Applied Math & Statistics, and Scientific Computation, University of Maryland, College Park, MD, USA
- Mihai Pop
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA; Department of Computer Science, University of Maryland, College Park, MD, USA
|
39
|
Wei K, Pan S. Maize protein phosphatase gene family: identification and molecular characterization. BMC Genomics 2014; 15:773. [PMID: 25199535 PMCID: PMC4169795 DOI: 10.1186/1471-2164-15-773] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 06/13/2014] [Accepted: 09/03/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Protein phosphatases (PPs) play critical roles in various cellular processes through the reversible protein phosphorylation that dictates many signal transduction pathways among organisms. Recently, PPs in Arabidopsis and rice have been identified, while the whole complement of PPs in maize is yet to be reported. RESULTS In this study, we have identified 159 PP-encoding genes in the maize genome. Phylogenetic analyses categorized the ZmPP gene family into 3 classes (PP2C, PTP, and PP2A) with considerable conservation among classes. Similar intron/exon structural patterns were observed in the same classes. Moreover, detailed gene structures and duplicative events were then researched. The expression profiles of ZmPPs under different developmental stages and abiotic stresses (including salt, drought, and cold) were analyzed using microarray and RNA-seq data. A total of 152 members were detected in 18 different tissues representing distinct stages of maize plant developments. Under salt stress, one gene was significantly up-expressed in seed root (SR) and one gene was down-expressed in primary root (PR) and crown root (CR), respectively. As for drought stress condition, 13 genes were found to be differentially expressed in leaf, out of which 10 were up-regulated and 3 exhibited down-regulation. Additionally, 13 up-regulated and 3 down-regulated genes were found in cold-tolerant line ETH-DH7. Furthermore, real-time PCR was used to confirm the expression patterns of ZmPPs. CONCLUSIONS Our results provide new insights into the phylogenetic relationships and characteristic functions of maize PPs and will be useful in studies aimed at revealing the global regulatory network in maize abiotic stress responses, thereby contributing to the maize molecular breeding with enhanced quality traits.
Affiliation(s)
- Kaifa Wei
- School of Biological Sciences and Biotechnology, Minnan Normal University, Zhangzhou 363000, China.
|
40
|
Bilinski P, Distor K, Gutierrez-Lopez J, Mendoza GM, Shi J, Dawe RK, Ross-Ibarra J. Diversity and evolution of centromere repeats in the maize genome. Chromosoma 2014; 124:57-65. [PMID: 25190528 DOI: 10.1007/s00412-014-0483-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/12/2014] [Revised: 07/21/2014] [Accepted: 08/11/2014] [Indexed: 10/24/2022]
Abstract
Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
Affiliation(s)
- Paul Bilinski
- Department of Plant Sciences, University of California Davis, Davis, CA, 95616, USA
|
41
|
Wang YG, Yu HQ, Zhang YY, Lai CX, She YH, Li WC, Fu FL. Interaction between abscisic acid receptor PYL3 and protein phosphatase type 2C in response to ABA signaling in maize. Gene 2014; 549:179-85. [PMID: 25091169 DOI: 10.1016/j.gene.2014.08.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 05/24/2014] [Revised: 07/31/2014] [Accepted: 08/01/2014] [Indexed: 11/26/2022]
Abstract
Abscisic acid (ABA) is a ubiquitous hormone that regulates plant growth, development and responses to environmental stresses. In recent research, pyrabactin resistance 1-like protein (PYL) and protein phosphatase type 2C (PP2C) were identified as the direct receptor and the second component of the ABA signaling pathway, respectively. However, many PYL and PP2C members have been found in Arabidopsis and several other plants, and some of them were found not to be involved in ABA signaling. Because of the complex diversity of the genome, few reports are available on the molecular details of the ABA signal perception system in maize. In the present study, we conducted bioinformatics analysis to identify the PYL and PP2C members most probably involved in ABA signaling in maize (ZmPYL3 and ZmPP2C16), cloned their encoding genes, verified the interaction between these two proteins in response to exogenous ABA induction by yeast two-hybrid assay and bimolecular fluorescence complementation, and investigated the expression patterns of these two genes under the induction of exogenous ABA by real-time fluorescence quantitative PCR. The results indicated that the ZmPYL3 and ZmPP2C16 proteins interacted in vitro and in vivo in response to the induction of exogenous ABA. Expression of the ZmPYL3 gene was downregulated and expression of the ZmPP2C16 gene was upregulated in response to the induction of exogenous ABA. The ZmPYL3 and ZmPP2C16 proteins are the most probable members of the receptors and the second components of the ABA signaling pathway, respectively.
Affiliation(s)
- Ying-Ge Wang
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China
- Hao-Qiang Yu
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China
- Yuan-Yuan Zhang
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China
- Cong-Xian Lai
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China
- Yue-Hui She
- Agronomy Faculty, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China
- Wan-Chen Li
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China.
- Feng-Ling Fu
- Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan 611130, PR China.
|
42
|
Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome. G3-GENES GENOMES GENETICS 2014; 4:1395-405. [PMID: 24879607 PMCID: PMC4132171 DOI: 10.1534/g3.114.011197] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 12/14/2022]
Abstract
The order and orientation (arrangement) of all 91 sequenced scaffolds in the 12 pseudomolecules of the recently published tomato (Solanum lycopersicum, 2n = 2x = 24) genome sequence were positioned based on marker order in a high-density linkage map. Here, we report the arrangement of these scaffolds determined by two independent physical methods, bacterial artificial chromosome–fluorescence in situ hybridization (BAC-FISH) and optical mapping. By localizing BACs at the ends of scaffolds to spreads of tomato synaptonemal complexes (pachytene chromosomes), we showed that 45 scaffolds, representing one-third of the tomato genome, were arranged differently than predicted by the linkage map. These scaffolds occur mostly in pericentric heterochromatin where 77% of the tomato genome is located and where linkage mapping is less accurate due to reduced crossing over. Although useful for only part of the genome, optical mapping results were in complete agreement with scaffold arrangement by FISH but often disagreed with scaffold arrangement based on the linkage map. The scaffold arrangement based on FISH and optical mapping changes the positions of hundreds of markers in the linkage map, especially in heterochromatin. These results suggest that similar errors exist in pseudomolecules from other large genomes that have been assembled using only linkage maps to predict scaffold arrangement, and these errors can be corrected using FISH and/or optical mapping. Of note, BAC-FISH also permits estimates of the sizes of gaps between scaffolds, and unanchored BACs are often visualized by FISH in gaps between scaffolds and thus represent starting points for filling these gaps.
|
44
|
Tang H, Krishnakumar V, Bidwell S, Rosen B, Chan A, Zhou S, Gentzbittel L, Childs KL, Yandell M, Gundlach H, Mayer KFX, Schwartz DC, Town CD. An improved genome release (version Mt4.0) for the model legume Medicago truncatula. BMC Genomics 2014; 15:312. [PMID: 24767513 PMCID: PMC4234490 DOI: 10.1186/1471-2164-15-312] [Citation(s) in RCA: 260] [Impact Index Per Article: 26.0] [Received: 02/21/2014] [Accepted: 04/22/2014] [Indexed: 11/10/2022] Open
Abstract
Background Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal to decipher sequences originated from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) as well as an in-depth analysis of the genome published in 2011. Results Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0) based on de novo whole genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequences spanning 390 Mb of which ~330 Mb align perfectly with the optical map, presenting a drastic improvement over the BAC-based Mt3.5 which only contained 70% sequences (~250 Mb) of the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. With regard to gene annotation, the genome has been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidences. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0 which overlapped with ~82% of the gene loci annotated in Mt3.5. Of the remaining genes, 14% of the Mt3.5 genes have been deprecated to an “unsupported” status and 4% are absent from the Mt4.0 predictions. Conclusions Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). 
The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.
Affiliation(s)
- Christopher D Town
- J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD, USA.
|
46
|
Sequencing, assembling, and correcting draft genomes using recombinant populations. G3-GENES GENOMES GENETICS 2014; 4:669-79. [PMID: 24531727 PMCID: PMC4059239 DOI: 10.1534/g3.114.010264] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/26/2022]
Abstract
Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross.
|
47
|
Ariyadasa R, Mascher M, Nussbaumer T, Schulte D, Frenkel Z, Poursarebani N, Zhou R, Steuernagel B, Gundlach H, Taudien S, Felder M, Platzer M, Himmelbach A, Schmutzer T, Hedley PE, Muehlbauer GJ, Scholz U, Korol A, Mayer KF, Waugh R, Langridge P, Graner A, Stein N. A sequence-ready physical map of barley anchored genetically by two million single-nucleotide polymorphisms. PLANT PHYSIOLOGY 2014; 164:412-23. [PMID: 24243933 PMCID: PMC3875818 DOI: 10.1104/pp.113.228213] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 09/09/2013] [Accepted: 11/13/2013] [Indexed: 05/18/2023]
Abstract
Barley (Hordeum vulgare) is an important cereal crop and a model species for Triticeae genomics. To lay the foundation for hierarchical map-based sequencing, a genome-wide physical map of its large and complex 5.1 billion-bp genome was constructed by high-information content fingerprinting of almost 600,000 bacterial artificial chromosomes representing 14-fold haploid genome coverage. The resultant physical map comprises 9,265 contigs with a cumulative size of 4.9 Gb representing 96% of the physical length of the barley genome. The reliability of the map was verified through extensive genetic marker information and the analysis of topological networks of clone overlaps. A minimum tiling path of 66,772 minimally overlapping clones was defined that will serve as a template for hierarchical clone-by-clone map-based shotgun sequencing. We integrated whole-genome shotgun sequence data from the individuals of two mapping populations with published bacterial artificial chromosome survey sequence information to genetically anchor the physical map. This novel approach in combination with the comprehensive whole-genome shotgun sequence data sets allowed us to independently validate and improve a previously reported physical and genetic framework. The resources developed in this study will underpin fine-mapping and cloning of agronomically important genes and the assembly of a draft genome sequence.
|
48
|
|
49
|
Chamala S, Chanderbali AS, Der JP, Lan T, Walts B, Albert VA, dePamphilis CW, Leebens-Mack J, Rounsley S, Schuster SC, Wing RA, Xiao N, Moore R, Soltis PS, Soltis DE, Barbazuk WB. Assembly and Validation of the Genome of the Nonmodel Basal Angiosperm Amborella. Science 2013; 342:1516-7. [DOI: 10.1126/science.1241130] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Indexed: 12/22/2022]
|
50
|
Karafiátová M, Bartoš J, Kopecký D, Ma L, Sato K, Houben A, Stein N, Doležel J. Mapping nonrecombining regions in barley using multicolor FISH. Chromosome Res 2013; 21:739-51. [DOI: 10.1007/s10577-013-9380-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 06/23/2013] [Revised: 08/26/2013] [Accepted: 08/30/2013] [Indexed: 12/22/2022]
|