1
Kim MS, Lee T, Baek J, Kim JH, Kim C, Jeong SC. Genome assembly of the popular Korean soybean cultivar Hwangkeum. G3 (Bethesda, Md.) 2021; 11:jkab272. [PMID: 34568925 PMCID: PMC8496230 DOI: 10.1093/g3journal/jkab272] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/18/2021] [Accepted: 07/27/2021] [Indexed: 01/01/2023]
Abstract
Massive resequencing efforts have been undertaken to catalog allelic variants in major crop species including soybean, but the scope of the information for genetic variation often depends on short sequence reads mapped to the extant reference genome. Additional de novo assembled genome sequences provide a unique opportunity to explore a dispensable genome fraction in the pan-genome of a species. Here, we report the de novo assembly and annotation of Hwangkeum, a popular soybean cultivar in Korea. The assembly was constructed using PromethION nanopore sequencing data and two genetic maps and was then error-corrected using Illumina short-reads and PacBio SMRT reads. The 933.12 Mb assembly was annotated as containing 79,870 transcripts for 58,550 genes using RNA-Seq data and the public soybean annotation set. Comparison of the Hwangkeum assembly with the Williams 82 soybean reference genome sequence (Wm82.a2.v1) revealed 1.8 million single-nucleotide polymorphisms, 0.5 million indels, and 25 thousand putative structural variants. However, there was no natural megabase-scale chromosomal rearrangement. Incidentally, by adding two novel subfamilies, we found that soybean contains four clearly separated subfamilies of centromeric satellite repeats. Analyses of satellite repeats and gene content suggested that the Hwangkeum assembly is a high-quality assembly. This was further supported by comparison of the marker arrangement of anthocyanin biosynthesis genes and of gene arrangement at the Rsv3 locus. Therefore, the results indicate that the de novo assembly of Hwangkeum is a valuable additional reference genome resource for characterizing traits for the improvement of this important crop species.
Affiliation(s)
- Myung-Shin Kim
- Bio-Evaluation Center, Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk 28116, Republic of Korea
- Plant Immunity Research Center, Interdisciplinary Program in Agricultural Genomics, College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Taeyoung Lee
- Bioinformatics Institute, Macrogen Inc., Seoul 08511, Republic of Korea
- Jeonghun Baek
- Bioinformatics Institute, Macrogen Inc., Seoul 08511, Republic of Korea
- Ji Hong Kim
- Bio-Evaluation Center, Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk 28116, Republic of Korea
- Changhoon Kim
- Bioinformatics Institute, Macrogen Inc., Seoul 08511, Republic of Korea
- Soon-Chun Jeong
- Bio-Evaluation Center, Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk 28116, Republic of Korea
2
Chen H, Chung MC, Tsai YC, Wei FJ, Hsieh JS, Hsing YIC. Distribution of new satellites and simple sequence repeats in annual and perennial Glycine species. Botanical Studies 2015; 56:22. [PMID: 28510831 PMCID: PMC5430363 DOI: 10.1186/s40529-015-0103-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/02/2015] [Accepted: 08/17/2015] [Indexed: 06/07/2023]
Abstract
Repeat sequences occupy more than 50% of the soybean genome. To understand how these repeat sequences are distributed in soybean and its related Glycine species, we examined three new soybean repeat sequences (SBRS1, SBRS2, and SBRS3), some nonspecific repeat sequences, and 45S rDNA in several Glycine species, including annual and perennial accessions. In the annual species G. soja, signals for SBRS1 and the ATT repeat were found on each chromosome of the GG genome, whereas those for SBRS2 and SBRS3 were located at three specific loci. In perennial Glycine species, the three SBRS repeats frequently co-localized with 45S rDNA, and two major 45S rDNA loci were found in all tetraploid species. However, an extra minor locus was found in one accession of G. pescadrensis (Tab074) but not in another accession (Tab004). We demonstrate that some repetitive sequences are present in all Glycine species used in the study, but that their abundance differs between annual and perennial species. We suggest that this study may provide additional information for investigations of phylogeny in the Glycine species.
Affiliation(s)
- Hsuan Chen
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115 Taiwan
- Department of Agronomy, National Taiwan University, Taipei, 106 Taiwan
- Mei-Chu Chung
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115 Taiwan
- Yuan-Ching Tsai
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115 Taiwan
- Fu-Jin Wei
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115 Taiwan
- Jaw-Shu Hsieh
- Department of Agronomy, National Taiwan University, Taipei, 106 Taiwan
- Yue-Ie C. Hsing
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115 Taiwan
3
Luo S, Mach J, Abramson B, Ramirez R, Schurr R, Barone P, Copenhaver G, Folkerts O. The cotton centromere contains a Ty3-gypsy-like LTR retroelement. PLoS One 2012; 7:e35261. [PMID: 22536361 DOI: 10.1371/journal.pone.0035261] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 11/18/2011] [Accepted: 03/13/2012] [Indexed: 01/16/2023]
Abstract
The centromere is a repeat-rich structure essential for chromosome segregation; with the long-term aim of understanding centromere structure and function, we set out to identify cotton centromere sequences. To isolate centromere-associated sequences from cotton (Gossypium hirsutum), we surveyed tandem and dispersed repetitive DNA in the genus. Centromere-associated elements in other plants include tandem repeats and, in some cases, centromere-specific retroelements. Examination of cotton genomic survey sequences for tandem repeats yielded sequences that did not localize to the centromere. However, among the repetitive sequences we also identified a gypsy-like LTR retrotransposon (Centromere Retroelement Gossypium, CRG) that localizes to the centromere region of all chromosomes in domestic upland cotton, Gossypium hirsutum, the major commercially grown cotton. The location of the functional centromere was confirmed by immunostaining with antiserum to the centromere-specific histone CENH3, which co-localizes with CRG hybridization on metaphase mitotic chromosomes. G. hirsutum is an allotetraploid composed of A and D genomes, and CRG is also present in the centromere regions of other AD cotton species. Furthermore, FISH and genomic dot blot hybridization revealed that CRG is found in D-genome diploid cotton species, but not in A-genome diploid species, indicating that this retroelement may have invaded the A-genome centromeres during allopolyploid formation and amplified during evolutionary history. CRG is also found in other diploid Gossypium species, including B and E2 genome species, but not in the C, E1, F, and G genome species tested. Isolation of this centromere-specific retrotransposon from Gossypium provides a probe for further understanding of centromere structure, and a tool for future engineering of centromere mini-chromosomes in this important crop species.
Affiliation(s)
- Song Luo
- Chromatin, Inc., Chicago, Illinois, United States of America
4
Findley SD, Pappas AL, Cui Y, Birchler JA, Palmer RG, Stacey G. Fluorescence in situ hybridization-based karyotyping of soybean translocation lines. G3 (Bethesda, Md.) 2011; 1:117-29. [PMID: 22384324 PMCID: PMC3276125 DOI: 10.1534/g3.111.000034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/04/2011] [Accepted: 05/07/2011] [Indexed: 01/06/2023]
Abstract
Soybean (Glycine max [L.] Merr.) is a major crop species and, therefore, a major target of genomic and genetic research. However, in contrast to other plant species, relatively few chromosomal aberrations have been identified and characterized in soybean. This is due in part to the difficulty of cytogenetic analysis of its small, morphologically homogeneous chromosomes. The recent development of a fluorescence in situ hybridization-based karyotyping system for soybean has enabled our characterization of most of the chromosomal translocation lines identified to date. Utilizing genetic data from existing translocation studies in soybean, we identified the chromosomes and approximate breakpoints involved in five translocation lines.
5
Findley SD, Cannon S, Varala K, Du J, Ma J, Hudson ME, Birchler JA, Stacey G. A fluorescence in situ hybridization system for karyotyping soybean. Genetics 2010; 185:727-44. [PMID: 20421607 PMCID: PMC2907198 DOI: 10.1534/genetics.109.113753] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 12/31/2009] [Accepted: 04/04/2010] [Indexed: 11/18/2022]
Abstract
The development of a universal soybean (Glycine max [L.] Merr.) cytogenetic map that associates classical genetic linkage groups, molecular linkage groups, and a sequence-based physical map with the karyotype has been impeded due to the soybean chromosomes themselves, which are small and morphologically homogeneous. To overcome this obstacle, we screened soybean repetitive DNA to develop a cocktail of fluorescent in situ hybridization (FISH) probes that could differentially label mitotic chromosomes in root tip preparations. We used genetically anchored BAC clones both to identify individual chromosomes in metaphase spreads and to complete a FISH-based karyotyping cocktail that permitted simultaneous identification of all 20 chromosome pairs. We applied these karyotyping tools to wild soybean, G. soja Sieb. and Zucc., which represents a large gene pool of potentially agronomically valuable traits. These studies led to the identification and characterization of a reciprocal chromosome translocation between chromosomes 11 and 13 in two accessions of wild soybean. The data confirm that this translocation is widespread in G. soja accessions and likely accounts for the semi-sterility found in some G. soja by G. max crosses.
Affiliation(s)
- Seth D. Findley
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Steven Cannon
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Kranthi Varala
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Jianchang Du
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Jianxin Ma
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Matthew E. Hudson
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- James A. Birchler
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Gary Stacey
- National Center for Soybean Biotechnology, Division of Plant Sciences and Division of Biological Sciences, University of Missouri, Columbia, Missouri 65211, United States Department of Agriculture–Agricultural Research Service, Iowa State University, Ames, Iowa 50011 and Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801 and Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
6
Tek AL, Kashihara K, Murata M, Nagaki K. Functional centromeres in soybean include two distinct tandem repeats and a retrotransposon. Chromosome Res 2010; 18:337-47. [PMID: 20204495 DOI: 10.1007/s10577-010-9119-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 12/21/2009] [Revised: 01/29/2010] [Accepted: 02/04/2010] [Indexed: 10/19/2022]
Abstract
The centromere as a kinetochore assembly site is fundamental to the partitioning of genetic material during cell division. In order to determine the functional centromeres of soybean, we characterized the soybean centromere-specific histone H3 (GmCENH3) protein and developed an antibody against the N-terminal end. Using this antibody, we cloned centromere-associated DNA sequences by chromatin immunoprecipitation. Our analyses indicate that soybean centromeres are composed of two distinct satellite repeats (GmCent-1 and GmCent-4) and retrotransposon-related sequences (GmCR). The possible allopolyploid origin of the soybean genome is discussed in view of the centromeric satellite sequences present.
Affiliation(s)
- Ahmet L Tek
- Research Institute for Bioresources, Okayama University, Chuo 2-20-1, Kurashiki, 710-0046, Japan.
8
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature 2010; 463:178-83. [PMID: 20075913 DOI: 10.1038/nature08670] [Citation(s) in RCA: 2607] [Impact Index Per Article: 186.2] [Received: 08/19/2009] [Accepted: 11/12/2009] [Indexed: 12/27/2022]
Abstract
Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Affiliation(s)
- Jeremy Schmutz
- HudsonAlpha Genome Sequencing Center, 601 Genome Way, Huntsville, Alabama 35806, USA
9
Gill N, Findley S, Walling JG, Hans C, Ma J, Doyle J, Stacey G, Jackson SA. Molecular and chromosomal evidence for allopolyploidy in soybean. Plant Physiology 2009; 151:1167-74. [PMID: 19605552 PMCID: PMC2773056 DOI: 10.1104/pp.109.137935] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.2] [Received: 03/02/2009] [Accepted: 07/09/2009] [Indexed: 05/18/2023]
Abstract
Recent studies have documented that the soybean (Glycine max) genome has undergone two rounds of large-scale genome and/or segmental duplication. To shed light on the timing and nature of these duplication events, we characterized and analyzed two subfamilies of high-copy centromeric satellite repeats, CentGm-1 and CentGm-2, using a combination of computational and molecular cytogenetic approaches. These two subfamilies of satellite repeats mark distinct subsets of soybean centromeres and, in at least one case, a pair of homologs, suggesting their origins from an allopolyploid event. The satellite monomers of each subfamily are arranged in large tandem arrays, and intermingled monomers of the two subfamilies were not detected by fluorescence in situ hybridization on extended DNA fibers nor at the sequence level. This indicates that there has been little recombination and homogenization of satellite DNA between these two sets of centromeres. These satellite repeats are also present in Glycine soja, the proposed wild progenitor of soybean, but could not be detected in any other relatives of soybean examined in this study, suggesting the rapid divergence of the centromeric satellite DNA within the Glycine genus. Together, these observations provide direct evidence, at molecular and chromosomal levels, in support of the hypothesis that the soybean genome has experienced a recent allopolyploidization event.
Affiliation(s)
- Scott A. Jackson
- Department of Agronomy (N.G., J.G.W., C.H., J.M., S.A.J.) and Interdisciplinary Life Science Program (N.G., S.A.J.), Purdue University, West Lafayette, Indiana 47907; Division of Plant Sciences, Bond Life Science Center, University of Missouri, Columbia, Missouri 65211 (S.F., G.S.); and Department of Plant Biology, Cornell University, Ithaca, New York 14853 (J.D.)
10
Innes RW, Ameline-Torregrosa C, Ashfield T, Cannon E, Cannon SB, Chacko B, Chen NWG, Couloux A, Dalwani A, Denny R, Deshpande S, Egan AN, Glover N, Hans CS, Howell S, Ilut D, Jackson S, Lai H, Mammadov J, Del Campo SM, Metcalf M, Nguyen A, O'Bleness M, Pfeil BE, Podicheti R, Ratnaparkhe MB, Samain S, Sanders I, Ségurens B, Sévignac M, Sherman-Broyles S, Thareau V, Tucker DM, Walling J, Wawrzynski A, Yi J, Doyle JJ, Geffroy V, Roe BA, Maroof MAS, Young ND. Differential accumulation of retroelements and diversification of NB-LRR disease resistance genes in duplicated regions following polyploidy in the ancestor of soybean. Plant Physiology 2008; 148:1740-59. [PMID: 18842825 PMCID: PMC2593655 DOI: 10.1104/pp.108.127902] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Received: 08/10/2008] [Accepted: 10/06/2008] [Indexed: 05/18/2023]
Abstract
The genomes of most, if not all, flowering plants have undergone whole genome duplication events during their evolution. The impact of such polyploidy events is poorly understood, as is the fate of most duplicated genes. We sequenced an approximately 1 million-bp region in soybean (Glycine max) centered on the Rpg1-b disease resistance gene and compared this region with a region duplicated 10 to 14 million years ago. These two regions were also compared with homologous regions in several related legume species (a second soybean genotype, Glycine tomentella, Phaseolus vulgaris, and Medicago truncatula), which enabled us to determine how each of the duplicated regions (homoeologues) in soybean has changed following polyploidy. The biggest change was in retroelement content, with homoeologue 2 having expanded to 3-fold the size of homoeologue 1. Despite this accumulation of retroelements, over 77% of the duplicated low-copy genes have been retained in the same order and appear to be functional. This finding contrasts with recent analyses of the maize (Zea mays) genome, in which only about one-third of duplicated genes appear to have been retained over a similar time period. Fluorescent in situ hybridization revealed that the homoeologue 2 region is located very near a centromere. Thus, pericentromeric localization, per se, does not result in a high rate of gene inactivation, despite greatly accelerated retrotransposon accumulation. In contrast to low-copy genes, nucleotide-binding-leucine-rich repeat disease resistance gene clusters have undergone dramatic species/homoeologue-specific duplications and losses, with some evidence for partitioning of subfamilies between homoeologues.
Affiliation(s)
- Roger W Innes
- Department of Biology, Indiana University, Bloomington, Indiana 47405, USA.
11
Gao H, Bhattacharyya MK. The soybean-Phytophthora resistance locus Rps1-k encompasses coiled coil-nucleotide binding-leucine rich repeat-like genes and repetitive sequences. BMC Plant Biology 2008; 8:29. [PMID: 18366691 PMCID: PMC2330051 DOI: 10.1186/1471-2229-8-29] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Received: 07/23/2007] [Accepted: 03/19/2008] [Indexed: 05/18/2023]
Abstract
BACKGROUND A series of Rps (resistance to Phytophthora sojae) genes have been protecting soybean from the root and stem rot disease caused by the Oomycete pathogen, Phytophthora sojae. Five Rps genes were mapped to the Rps1 locus located near the 28 cM map position on molecular linkage group N of the composite genetic soybean map. Among these five genes, Rps1-k was introgressed from the cultivar, Kingwa. Rps1-k has been providing stable and broad-spectrum Phytophthora resistance in the major soybean-producing regions of the United States. Rps1-k has been mapped and isolated. More than one functional Rps1-k gene was identified from the Rps1-k locus. The clustering feature at the Rps1-k locus might have facilitated the expansion of Rps1-k gene numbers and the generation of new recognition specificities. The Rps1-k region was sequenced to understand the possible evolutionary steps that shaped the generation of Phytophthora resistance genes in soybean. RESULTS Here the analyses of sequences of three overlapping BAC clones containing the 184,111 bp Rps1-k region are reported. A shotgun sequencing strategy was applied in sequencing the BAC contig. Sequence analysis predicted a few full-length genes including two Rps1-k genes, Rps1-k-1 and Rps1-k-2. Previously reported Rps1-k-3 from this genomic region [1] was evolved through intramolecular recombination between Rps1-k-1 and Rps1-k-2 in Escherichia coli. The majority of the predicted genes are truncated and therefore most likely they are nonfunctional. A member of a highly abundant retroelement, SIRE1, was identified from the Rps1-k region. The Rps1-k region is primarily composed of repetitive sequences. Sixteen simple repeat and 63 tandem repeat sequences were identified from the locus. CONCLUSION These data indicate that the Rps1 locus is located in a gene-poor region. The abundance of repetitive sequences in the Rps1-k region suggested that the location of this locus is in or near a heterochromatic region. Poor recombination frequencies combined with the presence of two functional Rps genes at this locus have been providing stable Phytophthora resistance in soybean.
Affiliation(s)
- Hongyu Gao
- Department of Agronomy, Interdepartmental Genetics, Iowa State University, Ames, Iowa 50011, USA
- Madan K Bhattacharyya
- Department of Agronomy, Interdepartmental Genetics, Iowa State University, Ames, Iowa 50011, USA
12
Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey. BMC Genomics 2007; 8:132. [PMID: 17524145 PMCID: PMC1894642 DOI: 10.1186/1471-2164-8-132] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.7] [Received: 12/22/2006] [Accepted: 05/24/2007] [Indexed: 12/02/2022]
Abstract
Background Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). Conclusion This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
13
|
Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey. BMC Genomics 2007. [PMID: 17524145 DOI: 10.1186/1471‐2164‐8‐132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. RESULTS We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). CONCLUSION This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
|
14
|
Nunberg A, Bedell JA, Budiman MA, Citek RW, Clifton SW, Fulton L, Pape D, Cai Z, Joshi T, Nguyen H, Xu D, Stacey G. Survey sequencing of soybean elucidates the genome structure, composition and identifies novel repeats. Functional Plant Biology 2006; 33:765-773. [PMID: 32689287 DOI: 10.1071/fp06106] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/27/2006] [Accepted: 05/24/2006] [Indexed: 06/11/2023]
Abstract
To expand our knowledge of the soybean genome and to create a useful DNA repeat sequence database, over 24 000 DNA fragments from a soybean [Glycine max (L.) Merr.] cv. Williams 82 genomic shotgun library were sequenced. Additional sequences came from over 29 000 bacterial artificial chromosome (BAC) end sequences derived from a BstI library of the cv. Williams 82 genome. Analysis of these sequences identified 348 different DNA repeats, many of which appear to be novel. To extend the utility of the work, a pilot study was also conducted using methylation filtration to estimate the hypomethylated soybean gene space. A comparison between 8366 sequences obtained from a filtered library and 23 788 from an unfiltered library indicates a gene enrichment of ~3.2-fold in the hypomethylated sequences. Given the 1.1-Gb soybean genome, our analysis predicts a ~343-Mb hypomethylated, gene-rich space.
Affiliation(s)
- Andrew Nunberg
  - Orion Genomics, LLC, 4041 Forest Park Ave, St Louis, MO 63108, USA
- Joseph A Bedell
  - Orion Genomics, LLC, 4041 Forest Park Ave, St Louis, MO 63108, USA
- Robert W Citek
  - Orion Genomics, LLC, 4041 Forest Park Ave, St Louis, MO 63108, USA
- Sandra W Clifton
  - Genome Sequencing Center, School of Medicine, Washington University, St Louis, MO 63130, USA
- Lucinda Fulton
  - Genome Sequencing Center, School of Medicine, Washington University, St Louis, MO 63130, USA
- Deana Pape
  - Genome Sequencing Center, School of Medicine, Washington University, St Louis, MO 63130, USA
- Zheng Cai
  - Computer Science Department, University of Missouri, Columbia, MO 65211, USA
- Trupti Joshi
  - Computer Science Department, University of Missouri, Columbia, MO 65211, USA
- Henry Nguyen
  - National Center for Soybean Biotechnology, University of Missouri, Columbia, MO 65211, USA
- Dong Xu
  - Computer Science Department, University of Missouri, Columbia, MO 65211, USA
- Gary Stacey
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
|
15
|
Lin JY, Jacobus BH, SanMiguel P, Walling JG, Yuan Y, Shoemaker RC, Young ND, Jackson SA. Pericentromeric regions of soybean (Glycine max L. Merr.) chromosomes consist of retroelements and tandemly repeated DNA and are structurally and evolutionarily labile. Genetics 2005; 170:1221-30. [PMID: 15879505 PMCID: PMC1451161 DOI: 10.1534/genetics.105.041616] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Received: 02/03/2005] [Accepted: 04/01/2005] [Indexed: 11/18/2022]
Abstract
Little is known about the physical makeup of heterochromatin in the soybean (Glycine max L. Merr.) genome. Using DNA sequencing and molecular cytogenetics, an initial analysis of the repetitive fraction of the soybean genome is presented. BAC 076J21, derived from linkage group L, has sequences conserved in the pericentromeric heterochromatin of all 20 chromosomes. FISH analysis of this BAC and three subclones on pachytene chromosomes revealed relatively strict partitioning of the heterochromatic and euchromatic regions. Sequence analysis showed that this BAC consists primarily of repetitive sequences such as a 102-bp tandem repeat with sequence identity to a previously characterized approximately 120-bp repeat (STR120). Fragments of Calypso-like retroelements, a recently inserted SIRE1 element, and a SIRE1 solo LTR were present within this BAC. Some of these sequences are methylated and are not conserved outside of G. max and G. soja, a close relative of soybean, except for STR102, which hybridized to a restriction fragment from G. latifolia. These data present a picture of the repetitive fraction of the soybean genome that is highly concentrated in the pericentromeric regions, consisting of rapidly evolving tandem repeats with interspersed retroelements.
Affiliation(s)
- Jer-Young Lin
  - Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Phillip SanMiguel
  - Purdue University Genomics Core, Department of Horticulture, Purdue University, West Lafayette, Indiana 47907
- Jason G. Walling
  - Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Yinan Yuan
  - Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Randy C. Shoemaker
  - USDA-ARS-CICGR and Department of Agronomy, Iowa State University, Ames, Iowa 50011
- Nevin D. Young
  - Department of Plant Pathology, University of Minnesota, Saint Paul, Minnesota 55108
- Scott A. Jackson
  - Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
|
16
|
Stacey G, Vodkin L, Parrott WA, Shoemaker RC. National Science Foundation-sponsored workshop report. Draft plan for soybean genomics. Plant Physiology 2004; 135:59-70. [PMID: 15141067 PMCID: PMC429333 DOI: 10.1104/pp.103.037903] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 12/17/2003] [Revised: 02/20/2004] [Accepted: 02/20/2004] [Indexed: 05/11/2023]
Abstract
Recent efforts to coordinate and define a research strategy for soybean (Glycine max) genomics began with the establishment of a Soybean Genetics Executive Committee, which will serve as a communication focal point between the soybean research community and granting agencies. Subsequently, a workshop was held to define a strategy to incorporate existing tools into a framework for advancing soybean genomics research. This workshop identified and ranked research priorities essential to making more informed decisions about how to proceed with large-scale sequencing and other genomics efforts. Most critical among these was the need to finalize a physical map and to obtain a better understanding of genome microstructure. Addressing these research needs will require pilot work on new technologies to demonstrate an ability to discriminate between recently duplicated regions in the soybean genome, and pilot projects to analyze an adequate amount of random genome sequence to identify and catalog common repeats. The development of additional markers, reverse genetics tools, and bioinformatics is also necessary. Successful implementation of these goals will require close coordination among various working groups.
Affiliation(s)
- Gary Stacey
- National Center for Soybean Biotechnology, Department of Plant Microbiology and Pathology, University of Missouri, Columbia, Missouri 65203, USA.
|
17
|
Morgante M, Jurman I, Shi L, Zhu T, Keim P, Rafalski JA. The STR120 satellite DNA of soybean: organization, evolution and chromosomal specificity. Chromosome Res 1997; 5:363-73. [PMID: 9364938 DOI: 10.1023/a:1018492208247] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.9] [Indexed: 02/05/2023]
Abstract
A highly repeated DNA sequence family, STR120, with tandemly arranged repetitive units (monomers) of approximately 120 bp, has been identified in soybean [Glycine max (L.) Merr.]. Five related clones showing tandem repeats of a 120-bp-long monomer were isolated from a soybean genomic library. Results of Southern blotting experiments using three of the clones as probes onto genomic DNA digested with different restriction enzymes were in agreement with a tandem arrangement of these sequences in the genome. A total of 12 monomers were sequenced, showing considerable sequence heterogeneity. A consensus sequence of 126 bp was obtained that exhibits an average similarity of 81% to the sequenced units. In three of the clones identified, neighbouring units are significantly more similar to each other than to units from different clones; in the remaining two clones, however, similarity between the two units observed is low (70%), while the overall similarity between the two clones is high (95%). This indicates that in these cases the repetitive unit may be the dimer rather than the monomer. Based on the presence of direct repeats within each monomer, we suggest that the 120-bp monomer may itself have evolved by duplication of an ancestral 60-bp unit. The STR120 family distribution is limited to annual soybeans and is not found, at least at high copy number, in related perennial soybeans or other members of the tribe Phaseoleae. Fluorescence in situ hybridization (FISH) to metaphase chromosomes using four of the clones as probes shows that the number of chromosomal locations differs depending on the stringency conditions and goes from two to eight when the stringency is progressively lowered. The estimated copy number for one of the clones is from 5000 to 10000, but this may just represent a lower boundary for the whole family in consideration of the high sequence divergence observed within the family. FISH and sequence analysis therefore indicate that different subfamilies as well as higher-order repeat units are present in the STR120 family, very much like those in primate alpha satellite DNA, and that some of the subfamilies seem to exhibit divergence on a chromosomal basis.
Affiliation(s)
- M Morgante
- Du Pont Agricultural Products, Biotechnology Research, Experimental Station, Wilmington, DE 19880-0402, USA.
|