1
Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832 PMCID: PMC11152613 DOI: 10.1016/j.csbj.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024] Open
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
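As a rough illustration of the two ideas this abstract names, the sketch below counts overlapping k-mers in a DNA string and lists the k-mers of a fixed length that never occur in it. The function names `count_kmers` and `absent_kmers` are hypothetical, and enumerating every possible k-mer is only practical for small k; the review itself may define nullomers more strictly than "absent at a chosen k".

```python
from collections import Counter
from itertools import product


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq (uppercase DNA assumed)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def absent_kmers(seq, k, alphabet="ACGT"):
    """Return, in sorted order, all k-mers over the alphabet that never occur in seq."""
    present = count_kmers(seq, k)
    candidates = ("".join(p) for p in product(alphabet, repeat=k))
    return sorted(kmer for kmer in candidates if kmer not in present)


counts = count_kmers("ACGTACGT", 2)
missing = absent_kmers("ACGTACGT", 2)
```

Here `counts` maps each observed 2-mer to its number of occurrences, and `missing` holds the 2-mers that are absent from the toy sequence.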
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
2
Repetitive Sequence Barcode Probe for Karyotype Analysis in Tripidium arundinaceum. Int J Mol Sci 2022; 23:ijms23126726. [PMID: 35743180 PMCID: PMC9224303 DOI: 10.3390/ijms23126726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/17/2022] Open
Abstract
The barcode probe is a convenient and efficient tool for molecular cytogenetics. Tripidium arundinaceum, a polyploid wild species in a genus allied to Saccharum, is a useful genetic resource that confers biotic and abiotic stress resistance for sugarcane breeding. Unfortunately, its basic cytogenetic information remains unclear because of its complex genome. We constructed the Cot-20 library for screening moderately and highly repetitive sequences from T. arundinaceum, and the chromosomal distribution of these repetitive sequences was explored. We used the barcode of repetitive sequence probes to distinguish the ten chromosome types of T. arundinaceum by fluorescence in situ hybridization (FISH) with Ea-0907, Ea-0098, and 45S rDNA. Furthermore, the distinction among homologous chromosomes based on repetitive sequences was established in T. arundinaceum by repeated FISH using barcode probes including Ea-0663, Ea-0267, EaCent, 5S rDNA, Ea-0265, Ea-0070, and 45S rDNA. We combined these probes to distinguish 37 different chromosome types, suggesting that the repetitive sequences may have different distributions on homologous chromosomes of T. arundinaceum. In summary, this method provides a basis for the development of similar applications for cytogenetic analysis in other species.
3
Santoro D, Pellegrina L, Comin M, Vandin F. SPRISS: Approximating Frequent K-mers by Sampling Reads, and Applications. Bioinformatics 2022; 38:3343-3350. [PMID: 35583271 PMCID: PMC9237683 DOI: 10.1093/bioinformatics/btac180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 02/25/2022] [Accepted: 05/16/2022] [Indexed: 11/29/2022] Open
Abstract
Motivation: The extraction of k-mers is a fundamental component in many complex analyses of large next-generation sequencing datasets, including read classification in genomics and the characterization of RNA-seq datasets. The extraction of all k-mers and their frequencies is extremely demanding in terms of running time and memory, owing to the size of the data and to the exponential number of k-mers to be considered. However, in several applications, only frequent k-mers, which are k-mers appearing in a relatively high proportion of the data, are required by the analysis. Results: In this work, we present SPRISS, a new efficient algorithm to approximate frequent k-mers and their frequencies in next-generation sequencing data. SPRISS uses a simple yet powerful read-sampling scheme, which allows the extraction of a representative subset of the dataset that can be used, in combination with any k-mer counting algorithm, to perform downstream analyses in a fraction of the time required by the analysis of the whole data, while obtaining comparable answers. Our extensive experimental evaluation demonstrates the efficiency and accuracy of SPRISS in approximating frequent k-mers, and shows that it can be used in various scenarios, such as the comparison of metagenomic datasets, the identification of discriminative k-mers, and SNP (single nucleotide polymorphism) genotyping, to extract insights in a fraction of the time required by the analysis of the whole dataset. Availability and implementation: SPRISS [a preliminary version (Santoro et al., 2021) of this work was presented at RECOMB 2021] is available at https://github.com/VandinLab/SPRISS. Supplementary information: Supplementary data are available at Bioinformatics online.
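The general idea of estimating frequent k-mers from a sample of reads can be sketched as follows. This is a deliberately simplified illustration and not the SPRISS algorithm: it draws a uniform random subset of reads and applies a plain frequency cutoff, without the sample-size analysis or guarantees the paper describes. The names `sample_reads` and `frequent_kmers`, the `fraction` and `min_freq` parameters, and the toy reads are all assumptions made for this example.

```python
import random
from collections import Counter


def sample_reads(reads, fraction, seed=0):
    """Draw a uniform random subset of reads (without replacement)."""
    rng = random.Random(seed)
    n = max(1, round(len(reads) * fraction))
    return rng.sample(reads, n)


def frequent_kmers(reads, k, min_freq):
    """Return {k-mer: relative frequency} for k-mers at or above min_freq."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items() if c / total >= min_freq}


# Toy data: count on a sample instead of on the full read set.
reads = ["ACGT"] * 4 + ["TTTA"]
subset = sample_reads(reads, fraction=0.4)
estimates = frequent_kmers(subset, k=2, min_freq=0.2)
```

Any exact k-mer counter could replace the `Counter` loop; the point of the sketch is only that the counting step runs on the sampled subset.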
Affiliation(s)
- Diego Santoro
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
- Leonardo Pellegrina
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
- Matteo Comin
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
- Fabio Vandin
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
4
Lin G, He C, Zheng J, Koo DH, Le H, Zheng H, Tamang TM, Lin J, Liu Y, Zhao M, Hao Y, McFraland F, Wang B, Qin Y, Tang H, McCarty DR, Wei H, Cho MJ, Park S, Kaeppler H, Kaeppler SM, Liu Y, Springer N, Schnable PS, Wang G, White FF, Liu S. Chromosome-level genome assembly of a regenerable maize inbred line A188. Genome Biol 2021; 22:175. [PMID: 34108023 PMCID: PMC8188678 DOI: 10.1186/s13059-021-02396-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/21/2020] [Accepted: 05/28/2021] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND: The maize inbred line A188 is an attractive model for elucidation of gene function and improvement due to its high embryogenic capacity and many contrasting traits to the first maize reference genome, B73, and other elite lines. The lack of a genome assembly of A188 limits its use as a model for functional studies. RESULTS: Here, we present a chromosome-level genome assembly of A188 using long reads and optical maps. Comparison of A188 with B73 using both whole-genome alignments and read depths from sequencing reads identifies approximately 1.1 Gb of syntenic sequences as well as extensive structural variation, including a 1.8-Mb duplication containing the Gametophyte factor1 locus for unilateral cross-incompatibility, and six inversions of 0.7 Mb or greater. Increased copy number of carotenoid cleavage dioxygenase 1 (ccd1) in A188 is associated with elevated expression during seed development. High ccd1 expression in seeds together with low expression of yellow endosperm 1 (y1) reduces carotenoid accumulation, accounting for the white seed phenotype of A188. Furthermore, transcriptome and epigenome analyses reveal enhanced expression of defense pathways and altered DNA methylation patterns of the embryonic callus. CONCLUSIONS: The A188 genome assembly provides a high-resolution sequence for a complex genome species and a foundational resource for analyses of genome variation and gene function in maize. The genome, in comparison to B73, contains extensive intra-species structural variations and other genetic differences. Expression and network analyses identify discrete profiles for embryonic callus and other tissues.
Affiliation(s)
- Guifang Lin
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Cheng He
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Jun Zheng
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Dal-Hoe Koo
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Ha Le
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Huakun Zheng
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Tej Man Tamang
- Department of Horticulture and Natural Resources, Kansas State University, Manhattan, KS, 66506-5502, USA
- Jinguang Lin
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Present Address, Corvallis, OR, 97330, USA
- Yan Liu
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Mingxia Zhao
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Yangfan Hao
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
- Frank McFraland
- Department of Agronomy, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Yang Qin
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Haibao Tang
- Center for Genomics and Biotechnology and Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, 350002, Fujian, China
- Donald R McCarty
- Department of Horticulture, University of Florida, Gainesville, FL, 32611-0680, USA
- Hairong Wei
- College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI, 49931, USA
- Myeong-Je Cho
- Innovative Genomics Institute, University of California-Berkeley, Sunnyvale, CA, 94704, USA
- Sunghun Park
- Department of Horticulture and Natural Resources, Kansas State University, Manhattan, KS, 66506-5502, USA
- Heidi Kaeppler
- Department of Agronomy, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Shawn M Kaeppler
- Department of Agronomy, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Yunjun Liu
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Nathan Springer
- Department of Plant Biology, University of Minnesota, Saint Paul, MN, 55108, USA
- Patrick S Schnable
- Department of Agronomy, Iowa State University, Ames, IA, 50011-3605, USA
- Guoying Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Frank F White
- Department of Plant Pathology, University of Florida, Gainesville, FL, 32611-0680, USA
- Sanzhen Liu
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS, 66506-5502, USA
5
He C, Lin G, Wei H, Tang H, White FF, Valent B, Liu S. Factorial estimating assembly base errors using k-mer abundance difference (KAD) between short reads and genome assembled sequences. NAR Genom Bioinform 2020; 2:lqaa075. [PMID: 33575622 PMCID: PMC7671381 DOI: 10.1093/nargab/lqaa075] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/18/2020] [Revised: 08/02/2020] [Accepted: 09/01/2020] [Indexed: 12/25/2022] Open
Abstract
Genome sequences provide genomic maps with a single-base resolution for exploring genetic contents. Sequencing technologies, particularly long reads, have revolutionized genome assemblies for producing highly continuous genome sequences. However, current long-read sequencing technologies generate inaccurate reads that contain many errors. Some errors are retained in assembled sequences, which are typically not completely corrected by using either long reads or more accurate short reads. The issue is common, but few tools are dedicated to computing error rates or determining error locations. In this study, we developed a novel approach, referred to as k-mer abundance difference (KAD), to compare the inferred copy number of each k-mer indicated by short reads and the observed copy number in the assembly. Simple KAD metrics enable k-mers to be classified into categories that reflect the quality of the assembly. Specifically, the KAD method can be used to identify base errors and estimate the overall error rate. In addition, sequence insertion and deletion as well as sequence redundancy can also be detected. Collectively, KAD is valuable for quality evaluation of genome assemblies and, potentially, provides a diagnostic tool to aid in precise error correction. KAD software has been developed to facilitate public use.
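The comparison described above, read-inferred copy number against observed assembly copy number, can be sketched with a single log-ratio score. The formula used here, log2((c + m) / (m(n + 1))) with c the k-mer's count in reads, n its copy number in the assembly, and m the modal read depth of single-copy k-mers, is an assumption for illustration and is not stated in this abstract; the `classify` labels and the 0.5 tolerance are likewise hypothetical and are not the categories used by the KAD software.

```python
import math


def kad(read_count, assembly_copies, mode_depth):
    """Assumed KAD-style score: log2((c + m) / (m * (n + 1))).

    The score is 0 when the read count matches the depth expected for the
    assembly copy number (c == m * n), negative when the assembly contains
    k-mer copies that reads do not support, and positive when reads support
    more copies than the assembly contains.
    """
    return math.log2((read_count + mode_depth) / (mode_depth * (assembly_copies + 1)))


def classify(score, tolerance=0.5):
    """Hypothetical three-way labelling of a score; thresholds are illustrative."""
    if abs(score) <= tolerance:
        return "consistent"
    return "over-represented in assembly" if score < 0 else "under-represented in assembly"


good = kad(read_count=30, assembly_copies=1, mode_depth=30)
error = kad(read_count=0, assembly_copies=1, mode_depth=30)
missing = kad(read_count=30, assembly_copies=0, mode_depth=30)
```

Under this assumed formula, a k-mer present once in the assembly but unseen in reads scores -1, which is the pattern a base error in the assembly would be expected to produce.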
Affiliation(s)
- Cheng He
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS 66506-5502, USA
- Guifang Lin
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS 66506-5502, USA
- Hairong Wei
- College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
- Haibao Tang
- Center for Genomics and Biotechnology and Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fujian 350002, China
- Frank F White
- Department of Plant Pathology, University of Florida, Gainesville, FL 32611-0680, USA
- Barbara Valent
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS 66506-5502, USA
- Sanzhen Liu
- Department of Plant Pathology, Kansas State University, 4024 Throckmorton Center, Manhattan, KS 66506-5502, USA
6
Beier S, Ulpinnis C, Schwalbe M, Münch T, Hoffie R, Koeppel I, Hertig C, Budhagatapalli N, Hiekel S, Pathi KM, Hensel G, Grosse M, Chamas S, Gerasimova S, Kumlehn J, Scholz U, Schmutzer T. Kmasker plants - a tool for assessing complex sequence space in plant species. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 102:631-642. [PMID: 31823436 DOI: 10.1111/tpj.14645] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/25/2019] [Revised: 11/27/2019] [Accepted: 11/28/2019] [Indexed: 06/10/2023]
Abstract
Many plant genomes display high levels of repetitive sequences. The assembly of these complex genomes using short high-throughput sequence reads is still a challenging task. Underestimation or disregard of repeat complexity in these datasets can easily misguide downstream analysis. Detection of repetitive regions by k-mer counting methods has proved to be reliable. Easy-to-use applications utilizing k-mer counting are in high demand, especially in the domain of plants. We present Kmasker plants, a tool, provided as a command-line and web-based solution, that uses k-mer count information to assist throughout the analytical workflow of genome data. Besides its core function of screening and masking repetitive sequences, we have integrated features that enable comparative studies between different cultivars or closely related species and methods that estimate target specificity of guide RNAs for application of site-directed mutagenesis using Cas9 endonuclease. In addition, we have set up a web service for Kmasker plants that maintains pre-computed indices for 10 of the economically most important cultivated plants. Source code for Kmasker plants has been made publicly available at https://github.com/tschmutzer/kmasker. The web service is accessible at https://kmasker.ipk-gatersleben.de.
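Masking repeats from k-mer counts can be illustrated with a few lines of code. The sketch below is not Kmasker's implementation: it simply replaces with `N` every base covered by a k-mer whose count exceeds a chosen cutoff. The function name `mask_repeats`, the `max_count` parameter, and the toy sequence are assumptions for this example, and the counts here come from the sequence itself rather than from a pre-computed read index.

```python
from collections import Counter


def mask_repeats(seq, k, counts, max_count):
    """Replace with 'N' every base covered by a k-mer seen more than max_count times."""
    masked = list(seq)
    for i in range(len(seq) - k + 1):
        if counts.get(seq[i:i + k], 0) > max_count:
            for j in range(i, i + k):
                masked[j] = "N"
    return "".join(masked)


# Toy example: the AC repeat at the start is masked, the unique tail is kept.
seq = "ACACACACGTTGCA"
counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
masked = mask_repeats(seq, 3, counts, max_count=2)
```

In practice the count table would be built from sequencing reads or a reference index, and the cutoff would be chosen relative to the expected single-copy depth.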
Affiliation(s)
- Sebastian Beier
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Chris Ulpinnis
- Leibniz Institute of Plant Biochemistry, Bioinformatics and Scientific Data, 06120, Halle, Germany
- Markus Schwalbe
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Thomas Münch
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Robert Hoffie
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Iris Koeppel
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Christian Hertig
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Nagaveni Budhagatapalli
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Stefan Hiekel
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Krishna M Pathi
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Goetz Hensel
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Martin Grosse
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Sindy Chamas
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Sophia Gerasimova
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Jochen Kumlehn
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
- Thomas Schmutzer
- Department of Natural Sciences III, Institute for Agricultural and Nutritional Sciences, Martin Luther University Halle-Wittenberg, 06120, Halle, Germany
7
On the Close Relatedness of Two Rice-Parasitic Root-Knot Nematode Species and the Recent Expansion of Meloidogyne graminicola in Southeast Asia. Genes (Basel) 2019; 10:genes10020175. [PMID: 30823612 PMCID: PMC6410229 DOI: 10.3390/genes10020175] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/26/2018] [Revised: 02/13/2019] [Accepted: 02/20/2019] [Indexed: 12/20/2022] Open
Abstract
Meloidogyne graminicola is a facultative meiotic parthenogenetic root-knot nematode (RKN) that seriously threatens agriculture worldwide. We have little understanding of its origin, genomic structure, and intraspecific diversity. Such information would offer better knowledge of how this nematode successfully damages rice in many different environments. Previous studies on nuclear ribosomal DNA (nrDNA) suggested a close phylogenetic relationship between M. graminicola and Meloidogyne oryzae, despite their different modes of reproduction and geographical distribution. In order to clarify the evolutionary history of these two species and explore their molecular intraspecific diversity, we sequenced the genome of 12 M. graminicola isolates, representing populations of worldwide origins, and two South American isolates of M. oryzae. k-mer analysis of their nuclear genome and the detection of divergent homologous genomic sequences indicate that both species show a high proportion of heterozygous sites (ca. 1–2%), which had never been previously reported in facultative meiotic parthenogenetic RKNs. These analyses also point to a distinct ploidy level in each species, compatible with a diploid M. graminicola and a triploid M. oryzae. Phylogenetic analyses of mitochondrial genomes and three nuclear genomic sequences confirm close relationships between these two species, with M. graminicola being a putative parent of M. oryzae. In addition, comparative mitogenomics of those 12 M. graminicola isolates with a Chinese published isolate reveal only 15 polymorphisms that are phylogenetically non-informative. Eight mitotypes are distinguished, the most common one being shared by distant populations from Asia and America. This low intraspecific diversity, coupled with a lack of phylogeographic signal, suggests a recent worldwide expansion of M. graminicola.
8
Hoang PNT, Michael TP, Gilbert S, Chu P, Motley ST, Appenroth KJ, Schubert I, Lam E. Generating a high-confidence reference genome map of the Greater Duckweed by integration of cytogenomic, optical mapping, and Oxford Nanopore technologies. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2018; 96:670-684. [PMID: 30054939 DOI: 10.1111/tpj.14049] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 02/16/2018] [Revised: 06/29/2018] [Accepted: 07/06/2018] [Indexed: 06/08/2023]
Abstract
Duckweeds are the fastest growing angiosperms and have the potential to become a new generation of sustainable crops. Although a seed plant, Spirodela polyrhiza clones rarely flower and multiply mainly through vegetative propagation. Whole-genome sequencing using different approaches and clones yielded two reference maps: one for clone 9509, supported in its assembly by optical mapping of single DNA molecules, and one for clone 7498, supported by cytogenetic assignment of 96 fingerprinted bacterial artificial chromosomes (BACs) to its 20 chromosomes. However, these maps differ in the composition of several individual chromosome models. We validated both maps further to resolve these differences and addressed whether they could be due to chromosome rearrangements in different clones. For this purpose, we applied sequential multicolor fluorescence in situ hybridization (mcFISH) to seven S. polyrhiza clones, using 106 BACs that were mapped onto the 39 pseudomolecules for clone 7498. Furthermore, we integrated high-depth Oxford Nanopore (ON) sequence data for clone 9509 to validate and revise the previously assembled chromosome models. We found no major structural rearrangements between these seven clones, but identified seven chimeric pseudomolecules and Illumina assembly errors in the previous maps, respectively. A new S. polyrhiza genome map with high contiguity was produced with the ON sequence data, and genome-wide synteny analysis supported the occurrence of two Whole Genome Duplication events during its evolution. This work generated a high-confidence genome map for S. polyrhiza at the chromosome scale, and illustrates the complementarity of independent approaches to produce whole-genome assemblies in the absence of a genetic map.
Affiliation(s)
- Phuong N T Hoang
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Stadt Seeland, D-06466, Germany
- Dalat University, Lamdong Province, Vietnam
- Sarah Gilbert
- Department of Plant Biology, Rutgers the State University of New Jersey, New Brunswick, NJ, 08901, USA
- Philomena Chu
- Department of Plant Biology, Rutgers the State University of New Jersey, New Brunswick, NJ, 08901, USA
- Klaus J Appenroth
- Department of Plant Physiology, Matthias-Schleiden-Institute, Friedrich-Schiller-University of Jena, Jena, D-07743, Germany
- Ingo Schubert
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Stadt Seeland, D-06466, Germany
- Eric Lam
- Department of Plant Biology, Rutgers the State University of New Jersey, New Brunswick, NJ, 08901, USA
9
Hu Y, Ren J, Peng Z, Umana AA, Le H, Danilova T, Fu J, Wang H, Robertson A, Hulbert SH, White FF, Liu S. Analysis of Extreme Phenotype Bulk Copy Number Variation (XP-CNV) Identified the Association of rp1 with Resistance to Goss's Wilt of Maize. FRONTIERS IN PLANT SCIENCE 2018; 9:110. [PMID: 29479358 PMCID: PMC5812337 DOI: 10.3389/fpls.2018.00110] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/02/2017] [Accepted: 01/19/2018] [Indexed: 05/19/2023]
Abstract
Goss's wilt (GW) of maize is caused by the Gram-positive bacterium Clavibacter michiganensis subsp. nebraskensis (Cmn) and has spread in recent years throughout the Great Plains, posing a threat to production. The genetic basis of plant resistance is unknown. Here, a simple method for quantifying disease symptoms was developed and used to select cohorts of highly resistant and highly susceptible lines known as extreme phenotypes (XP). Copy number variation (CNV) analyses using whole genome sequences of bulked XP revealed 141 genes containing CNV between the two XP groups. The CNV genes include the previously identified common rust resistance locus rp1. Multiple Rp1 accessions with distinct rp1 haplotypes in an otherwise susceptible accession exhibited hypersensitive responses upon inoculation. GW provides an excellent system for the genetic dissection of diseases caused by closely related subspecies of C. michiganensis. Further work will facilitate breeding strategies to control GW and provide needed insight into the resistance mechanism of important related diseases such as bacterial canker of tomato and bacterial ring rot of potato.
Affiliation(s)
- Ying Hu
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Jie Ren
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Zhao Peng
- Department of Plant Pathology, University of Florida, Gainesville, FL, United States
- Arnoldo A. Umana
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Ha Le
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Tatiana Danilova
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- Junjie Fu
- Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Haiyan Wang
- Department of Statistics, Kansas State University, Manhattan, KS, United States
- Alison Robertson
- Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, United States
- Scot H. Hulbert
- Department of Plant Pathology, Washington State University, Pullman, WA, United States
- Frank F. White
- Department of Plant Pathology, University of Florida, Gainesville, FL, United States
- Sanzhen Liu
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
10