1
Deschamps S, Crow JA, Chaidir N, Peterson-Burch B, Kumar S, Lin H, Zastrow-Hayes G, May GD. Chromatin loop anchors contain core structural components of the gene expression machinery in maize. BMC Genomics 2021; 22:23. [PMID: 33407087 PMCID: PMC7789236 DOI: 10.1186/s12864-020-07324-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/28/2020] [Accepted: 12/14/2020] [Indexed: 12/28/2022]
Abstract
BACKGROUND Three-dimensional chromatin loop structures connect regulatory elements to their target genes in regions known as anchors. In complex plant genomes, such as maize, it has been proposed that loops span heterochromatic regions marked by higher repeat content, but little is known on their spatial organization and genome-wide occurrence in relation to transcriptional activity. RESULTS Here, ultra-deep Hi-C sequencing of maize B73 leaf tissue was combined with gene expression and open chromatin sequencing for chromatin loop discovery and correlation with hierarchical topologically-associating domains (TADs) and transcriptional activity. A majority of all anchors are shared between multiple loops from previous public maize high-resolution interactome datasets, suggesting a highly dynamic environment, with a conserved set of anchors involved in multiple interaction networks. Chromatin loop interiors are marked by higher repeat contents than the anchors flanking them. A small fraction of high-resolution interaction anchors, fully embedded in larger chromatin loops, co-locate with active genes and putative protein-binding sites. Combinatorial analyses indicate that all anchors studied here co-locate with at least 81.5% of expressed genes and 74% of open chromatin regions. Approximately 38% of all Hi-C chromatin loops are fully embedded within hierarchical TAD-like domains, while the remaining ones share anchors with domain boundaries or with distinct domains. Those various loop types exhibit specific patterns of overlap for open chromatin regions and expressed genes, but no apparent pattern of gene expression. In addition, up to 63% of all unique variants derived from a prior public maize eQTL dataset overlap with Hi-C loop anchors. Anchor annotation suggests that < 7% of all loops detected here are potentially devoid of any genes or regulatory elements. 
The overall organization of chromatin loop anchors in the maize genome suggests a loop modeling system hypothesized to resemble phase separation of repeat-rich regions. CONCLUSIONS Sets of conserved chromatin loop anchors mapping to hierarchical domains contain core structural components of the gene expression machinery in maize. The data presented here will be a useful reference to further investigate their function in regard to the formation of transcriptional complexes and the regulation of transcriptional activity in the maize genome.
Affiliation(s)
- John A. Crow
  - Corteva Agriscience, 8325 NW, 62nd Avenue, Johnston, Iowa, 50131 USA
- Nadia Chaidir
  - Corteva Agriscience, 8325 NW, 62nd Avenue, Johnston, Iowa, 50131 USA
- Sunil Kumar
  - Corteva Agriscience, The V-Acendas, Atria Block, 12th Floor, Plot No.17, Madhapur, Hyderabad, Telangana 500081 India
- Haining Lin
  - Corteva Agriscience, 8325 NW, 62nd Avenue, Johnston, Iowa, 50131 USA
- Gregory D. May
  - Corteva Agriscience, 8325 NW, 62nd Avenue, Johnston, Iowa, 50131 USA
2
Graham N, Patil GB, Bubeck DM, Dobert RC, Glenn KC, Gutsche AT, Kumar S, Lindbo JA, Maas L, May GD, Vega-Sanchez ME, Stupar RM, Morrell PL. Plant Genome Editing and the Relevance of Off-Target Changes. Plant Physiol 2020; 183:1453-1471. [PMID: 32457089 PMCID: PMC7401131 DOI: 10.1104/pp.19.01194] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 10/01/2019] [Accepted: 05/07/2020] [Indexed: 05/12/2023]
Abstract
Site-directed nucleases (SDNs) used for targeted genome editing are powerful new tools to introduce precise genetic changes into plants. Like traditional approaches, such as conventional crossing and induced mutagenesis, genome editing aims to improve crop yield and nutrition. Next-generation sequencing studies demonstrate that across their genomes, populations of crop species typically carry millions of single nucleotide polymorphisms and many copy number and structural variants. Spontaneous mutations occur at rates of ∼10⁻⁸ to 10⁻⁹ per site per generation, while variation induced by chemical treatment or ionizing radiation results in higher mutation rates. In the context of SDNs, an off-target change or edit is an unintended, nonspecific mutation occurring at a site with sequence similarity to the targeted edit region. SDN-mediated off-target changes can contribute to a small number of additional genetic variants compared to those that occur naturally in breeding populations or are introduced by induced-mutagenesis methods. Recent studies show that using computational algorithms to design genome editing reagents can mitigate off-target edits in plants. Finally, crops are subject to strong selection to eliminate off-type plants through well-established multigenerational breeding, selection, and commercial variety development practices. Within this context, off-target edits in crops present no new safety concerns compared to other breeding practices. The current generation of genome editing technologies is already proving useful to develop new plant varieties with consumer and farmer benefits. Genome editing will likely undergo improved editing specificity along with new developments in SDN delivery and increasing genomic characterization, further improving reagent design and application.
Affiliation(s)
- Nathaniel Graham
  - Department of Genetics, Cell Biology and Development, University of Minnesota, St. Paul, Minnesota 55108
  - Pairwise, Durham, North Carolina 27709
- Gunvant B Patil
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Luis Maas
  - Enza Zaden Research USA, San Juan Bautista, California 95045
- Robert M Stupar
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
- Peter L Morrell
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
3
Hunter CJ, Boyd MJ, May GD, Fimognari R. Visible-Light-Mediated N-Desulfonylation of N-Heterocycles Using a Heteroleptic Copper(I) Complex as a Photocatalyst. J Org Chem 2020; 85:8732-8739. [PMID: 32482067 DOI: 10.1021/acs.joc.0c00983] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/25/2022]
Abstract
A photoredox protocol that uses a heteroleptic Cu(I) complex, [Cu(dq)(BINAP)]BF4, has been developed for the photodeprotection of benzenesulfonyl-protected N-heterocycles. A range of substrates was examined, including indazoles, indoles, pyrazoles, and benzimidazole, featuring both electron-rich and electron-deficient substituents, giving good yields of the N-heterocycle products with broad functional group tolerance. This transformation was also found to be amenable to flow reaction conditions.
Affiliation(s)
- Cameron J Hunter
  - Vertex Pharmaceuticals Incorporated, 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Michael J Boyd
  - Vertex Pharmaceuticals Incorporated, 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Gregory D May
  - Vertex Pharmaceuticals Incorporated, 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Robert Fimognari
  - Vertex Pharmaceuticals Incorporated, 50 Northern Avenue, Boston, Massachusetts 02210, United States
4
Affiliation(s)
- Gregory D. May
  - Department of Biology, Southeast Missouri State University, Cape Girardeau, Missouri 63701
- Walt W. Lilly
  - Department of Biology, Southeast Missouri State University, Cape Girardeau, Missouri 63701
5
Bissonnette NB, Boyd MJ, May GD, Giroux S, Nuhant P. C–H Functionalization of Heteroarenes Using Unactivated Alkyl Halides through Visible-Light Photoredox Catalysis under Basic Conditions. J Org Chem 2018; 83:10933-10940. [DOI: 10.1021/acs.joc.8b01589] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/30/2022]
Affiliation(s)
- Noah B. Bissonnette
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Michael J. Boyd
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Gregory D. May
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Simon Giroux
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Philippe Nuhant
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
6
Raynor KD, May GD, Bandarage UK, Boyd MJ. Generation of Diversity Sets with High sp3 Fraction Using the Photoredox Coupling of Organotrifluoroborates and Organosilicates with Heteroaryl/Aryl Bromides in Continuous Flow. J Org Chem 2018; 83:1551-1557. [DOI: 10.1021/acs.joc.7b02680] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 12/20/2022]
Affiliation(s)
- Kevin D. Raynor
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Gregory D. May
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Upul K. Bandarage
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
- Michael J. Boyd
  - Vertex Pharmaceuticals Inc., 50 Northern Avenue, Boston, Massachusetts 02210, United States
7
Zheng Y, Hivrale V, Zhang X, Valliyodan B, Lelandais-Brière C, Farmer AD, May GD, Crespi M, Nguyen HT, Sunkar R. Small RNA profiles in soybean primary root tips under water deficit. BMC Syst Biol 2016; 10:126. [PMID: 28105955 PMCID: PMC5249032 DOI: 10.1186/s12918-016-0374-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 12/11/2022]
Abstract
BACKGROUND Soybean (Glycine max) production is significantly hampered by frequent droughts in many regions of the world, including the United States. Identifying microRNA (miRNA)-controlled posttranscriptional gene regulation under drought will enhance our understanding of the molecular basis of drought tolerance in this important cash crop. miRNA profiles in soybean exposed to drought have been studied, but not in the primary root tips, which are not only a main zone of water uptake but also critical for water stress sensing and signaling. METHODS Here we report miRNA profiles specifically from well-watered and water-stressed primary root tips (0 to 8 mm from the root apex) of soybean. Small RNA sequencing confirmed the expression of a vastly diverse miRNA population (303 individual miRNAs), and, importantly, several conserved miRNAs were abundantly expressed in primary root tips. RESULTS Notably, 12 highly conserved miRNA families were differentially regulated in response to water deficit; six were upregulated while six others were downregulated by at least one log2 fold change. The genes targeted by differentially regulated soybean miRNAs include auxin response factors, Cu/Zn superoxide dismutases, laccases, plantacyanin, and several others. CONCLUSIONS These results highlight the importance of miRNAs in primary root tips under both control and water-deficit conditions: under control conditions, miRNAs could be important for cell division, cell elongation, and maintenance of root apical meristem activity, including the quiescent centre, whereas under water stress differentially regulated miRNAs could decrease auxin signaling and oxidative stress as well as other metabolic processes that save energy and water.
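As a reading aid for the log2 fold-change threshold mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. A log2 fold change of 1 corresponds to a doubling, and -1 to a halving. The function names, the pseudocount of 1, and the symmetric threshold are illustrative assumptions; the abstract does not state how read counts were normalized or compared.

```python
import math


def log2_fold_change(stressed: float, control: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of stressed to control abundance.

    The pseudocount (an assumption, not from the abstract) avoids
    division by zero when a miRNA is absent in one condition.
    """
    return math.log2((stressed + pseudocount) / (control + pseudocount))


def classify(stressed: float, control: float, threshold: float = 1.0) -> str:
    """Label a miRNA as 'up', 'down', or 'unchanged' at a log2 threshold."""
    lfc = log2_fold_change(stressed, control)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"
```

With these hypothetical inputs, `classify(3, 1)` compares (3 + 1) / (1 + 1) = 2, a log2 fold change of exactly 1, so the miRNA is labeled as upregulated.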
Affiliation(s)
- Yun Zheng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China
- Vandana Hivrale
  - Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, 74078, USA
- Xiaotuo Zhang
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China
- Babu Valliyodan
  - National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA
- Christine Lelandais-Brière
  - Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University of "Paris-Sud", Batiment 630, 91405, Orsay, France
  - Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University of "Paris-Diderot", Sorbonne Paris-Cité, 91405 Orsay, Paris, France
- Andrew D Farmer
  - National Center for Genome Resources, Santa Fe, New Mexico, NM, 87505, USA
- Gregory D May
  - National Center for Genome Resources, Santa Fe, New Mexico, NM, 87505, USA
  - Present address: Pioneer Hi-Bred International, Inc, Johnston, IA, 50131, USA
- Martin Crespi
  - Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University of "Paris-Sud", Batiment 630, 91405, Orsay, France
  - Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University of "Paris-Diderot", Sorbonne Paris-Cité, 91405 Orsay, Paris, France
- Henry T Nguyen
  - National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA
- Ramanjulu Sunkar
  - Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK, 74078, USA
8
Kilgore MB, Augustin MM, May GD, Crow JA, Kutchan TM. CYP96T1 of Narcissus sp. aff. pseudonarcissus Catalyzes Formation of the Para-Para' C-C Phenol Couple in the Amaryllidaceae Alkaloids. Front Plant Sci 2016; 7:225. [PMID: 26941773 PMCID: PMC4766306 DOI: 10.3389/fpls.2016.00225] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 12/15/2015] [Accepted: 02/10/2016] [Indexed: 05/07/2023]
Abstract
The Amaryllidaceae alkaloids are a family of amino acid derived alkaloids with many biological activities; examples include haemanthamine, haemanthidine, galanthamine, lycorine, and maritidine. Central to the biosynthesis of the majority of these alkaloids is a C-C phenol-coupling reaction that can have para-para', para-ortho', or ortho-para' regiospecificity. Through comparative transcriptomics of Narcissus sp. aff. pseudonarcissus, Galanthus sp., and Galanthus elwesii we have identified a para-para' C-C phenol coupling cytochrome P450, CYP96T1, capable of forming the products (10bR,4aS)-noroxomaritidine and (10bS,4aR)-noroxomaritidine from 4'-O-methylnorbelladine. CYP96T1 was also shown to catalyze formation of the para-ortho' phenol coupled product, N-demethylnarwedine, as less than 1% of the total product. CYP96T1 co-expresses with the previously characterized norbelladine 4'-O-methyltransferase. The discovery of CYP96T1 is of special interest because it catalyzes the first major branch in Amaryllidaceae alkaloid biosynthesis. CYP96T1 is also the first phenol-coupling enzyme characterized from a monocot.
Affiliation(s)
- John A. Crow
  - National Center for Genome Resources, Santa Fe, NM, USA
- Toni M. Kutchan
  - Donald Danforth Plant Science Center, St. Louis, MO, USA
  - *Correspondence: Toni M. Kutchan
9
Zastrow-Hayes GM, Lin H, Sigmund AL, Hoffman JL, Alarcon CM, Hayes KR, Richmond TA, Jeddeloh JA, May GD, Beatty MK. Southern-by-Sequencing: A Robust Screening Approach for Molecular Characterization of Genetically Modified Crops. Plant Genome 2015; 8:eplantgenome2014.08.0037. [PMID: 33228291 DOI: 10.3835/plantgenome2014.08.0037] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/01/2014] [Indexed: 05/17/2023]
Abstract
Molecular characterization of events is an integral part of the advancement process during genetically modified (GM) crop product development. Assessment of these events is traditionally accomplished by polymerase chain reaction (PCR) and Southern blot analyses. Southern blot analysis can be time-consuming and comparatively expensive and does not provide sequence-level detail. We have developed a sequence-based application, Southern-by-Sequencing (SbS), utilizing sequence capture coupled with next-generation sequencing (NGS) technology to replace Southern blot analysis for event selection in a high-throughput molecular characterization environment. SbS is accomplished by hybridizing indexed and pooled whole-genome DNA libraries from GM plants to biotinylated probes designed to target the sequence of transformation plasmids used to generate events within the pool. This sequence capture process enriches the sequence data obtained for targeted regions of interest (transformation plasmid DNA). Taking advantage of the DNA adjacent to the targeted bases (referred to as next-to-target sequence) that accompanies the targeted transformation plasmid sequence, the data analysis detects plasmid-to-genome and plasmid-to-plasmid junctions introduced during insertion into the plant genome. Analysis of these junction sequences provides sequence-level information as to the following: the number of insertion loci including detection of unlinked, independently segregating, small DNA fragments; copy number; rearrangements, truncations, or deletions of the intended insertion DNA; and the presence of transformation plasmid backbone sequences. This molecular evidence from SbS analysis is used to characterize and select GM plants meeting optimal molecular characterization criteria. SbS technology has proven to be a robust event screening tool for use in a high-throughput molecular characterization environment.
Affiliation(s)
- Haining Lin
  - DuPont Pioneer, 7300 NW 62nd Ave., Johnston, IA, 50131
- Amy L Sigmund
  - DuPont Pioneer, 7300 NW 62nd Ave., Johnston, IA, 50131
- Kevin R Hayes
  - DuPont Pioneer, 7300 NW 62nd Ave., Johnston, IA, 50131
- Gregory D May
  - DuPont Pioneer, 7300 NW 62nd Ave., Johnston, IA, 50131
- Mary K Beatty
  - DuPont Pioneer, 7300 NW 62nd Ave., Johnston, IA, 50131
10
Motamayor JC, Mockaitis K, Schmutz J, Haiminen N, Livingstone D III, Cornejo O, Findley SD, Zheng P, Utro F, Royaert S, Saski C, Jenkins J, Podicheti R, Zhao M, Scheffler BE, Stack JC, Feltus FA, Mustiga GM, Amores F, Phillips W, Marelli JP, May GD, Shapiro H, Ma J, Bustamante CD, Schnell RJ, Main D, Gilbert D, Parida L, Kuhn DN. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color. Genome Biol 2013; 14:r53. [PMID: 23731509 PMCID: PMC4053823 DOI: 10.1186/gb-2013-14-6-r53] [Citation(s) in RCA: 146] [Impact Index Per Article: 13.3] [Received: 10/07/2013] [Revised: 04/09/2013] [Accepted: 06/03/2013] [Indexed: 02/07/2023]
Abstract
BACKGROUND Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. RESULTS We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. CONCLUSIONS We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
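For readers unfamiliar with the contig and scaffold N50 figures quoted in this abstract, the sketch below shows the standard definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. This illustrates the statistic only and is not the assembly software used by the authors; the function name and the toy lengths are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running total
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length
    return ordered[-1]  # not reached for positive lengths
```

For the toy input `[2, 3, 4, 5, 6, 7, 8, 9, 10]` the total is 54, and the three longest sequences (10 + 9 + 8 = 27) reach half of it, so the N50 is 8.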
Affiliation(s)
- Keithanne Mockaitis
  - Department of Biology, and Center for Genomics and Bioinformatics, Indiana University, 915 E. Third St, Bloomington, IN, 47405, USA
- Jeremy Schmutz
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
  - HudsonAlpha Institute for Biotechnology, 601 Genome Way NW, Huntsville, AL, 35806, USA
- Niina Haiminen
  - IBM T J Watson Research, Yorktown Heights, NY, 10598, USA
- Donald Livingstone III
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
  - United States Department of Agriculture-Agriculture Research Service, Subtropical Horticulture Research Station, 13601 Old Cutler Rd, Miami, FL, 33158, USA
- Omar Cornejo
  - Department of Genetics, Stanford University, 300 Pasteur Dr, Stanford, CA, 94305, USA
- Seth D Findley
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
- Ping Zheng
  - Department of Horticulture, Washington State University, Johnson Hall, Pullman, WA, 99164, USA
- Filippo Utro
  - IBM T J Watson Research, Yorktown Heights, NY, 10598, USA
- Stefan Royaert
  - United States Department of Agriculture-Agriculture Research Service, Subtropical Horticulture Research Station, 13601 Old Cutler Rd, Miami, FL, 33158, USA
- Christopher Saski
  - Clemson University Genomics Institute, 105 Collings Street, Clemson, SC, 29634, USA
- Jerry Jenkins
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
  - HudsonAlpha Institute for Biotechnology, 601 Genome Way NW, Huntsville, AL, 35806, USA
- Ram Podicheti
  - Center for Genomics and Bioinformatics and School of Informatics and Computing, Indiana University, 919 E 10th St, Bloomington, IN, 47408, USA
- Meixia Zhao
  - Department of Agronomy, Purdue University, West Lafayette, IN, 47907, USA
- Brian E Scheffler
  - United States Department of Agriculture-Agriculture Research Service, Genomics and Bioinformatics Research Unit, 141 Experiment Station Road, Stoneville, MS, 38776, USA
- Joseph C Stack
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
- Frank A Feltus
  - Clemson University Genomics Institute, 105 Collings Street, Clemson, SC, 29634, USA
- Freddy Amores
  - Estación Experimental Tropical Pichilingue, Instituto Nacional Autónomo de Investigaciones Agropecuarias (INIAP), Código Postal 24, Km 5 vía Quevedo - El Empalme, Quevedo, Ecuador
- Wilbert Phillips
  - Programa de Mejoramiento de Cacao, CATIE 7170, Turrialba, Costa Rica
- Gregory D May
  - National Center for Genome Resources, 2935 Rodeo Park Drive E, Santa Fe, NM, 87505, USA
- Howard Shapiro
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
- Jianxin Ma
  - Department of Agronomy, Purdue University, West Lafayette, IN, 47907, USA
- Carlos D Bustamante
  - Department of Genetics, Stanford University, 300 Pasteur Dr, Stanford, CA, 94305, USA
- Raymond J Schnell
  - Mars, Incorporated, 6885 Elm Street, McLean, VA, 22101, USA
  - United States Department of Agriculture-Agriculture Research Service, Subtropical Horticulture Research Station, 13601 Old Cutler Rd, Miami, FL, 33158, USA
- Dorrie Main
  - Department of Horticulture, Washington State University, Johnson Hall, Pullman, WA, 99164, USA
- Don Gilbert
  - Department of Biology, and Center for Genomics and Bioinformatics, Indiana University, 915 E. Third St, Bloomington, IN, 47405, USA
- Laxmi Parida
  - IBM T J Watson Research, Yorktown Heights, NY, 10598, USA
- David N Kuhn
  - United States Department of Agriculture-Agriculture Research Service, Subtropical Horticulture Research Station, 13601 Old Cutler Rd, Miami, FL, 33158, USA
11
Stanton-Geddes J, Paape T, Epstein B, Briskine R, Yoder J, Mudge J, Bharti AK, Farmer AD, Zhou P, Denny R, May GD, Erlandson S, Yakub M, Sugawara M, Sadowsky MJ, Young ND, Tiffin P. Candidate genes and genetic architecture of symbiotic and agronomic traits revealed by whole-genome, sequence-based association genetics in Medicago truncatula. PLoS One 2013; 8:e65688. [PMID: 23741505 PMCID: PMC3669257 DOI: 10.1371/journal.pone.0065688] [Citation(s) in RCA: 144] [Impact Index Per Article: 13.1] [Received: 11/15/2012] [Accepted: 04/27/2013] [Indexed: 02/01/2023]
Abstract
Genome-wide association study (GWAS) has revolutionized the search for the genetic basis of complex traits. To date, GWAS have generally relied on relatively sparse sampling of nucleotide diversity, which is likely to bias results by preferentially sampling high-frequency SNPs not in complete linkage disequilibrium (LD) with causative SNPs. To avoid these limitations we conducted GWAS with >6 million SNPs identified by sequencing the genomes of 226 accessions of the model legume Medicago truncatula. We used these data to identify candidate genes and the genetic architecture underlying phenotypic variation in plant height, trichome density, flowering time, and nodulation. The characteristics of candidate SNPs differed among traits, with candidates for flowering time and trichome density in distinct clusters of high linkage disequilibrium (LD) and the minor allele frequencies (MAF) of candidates underlying variation in flowering time and height significantly greater than MAF of candidates underlying variation in other traits. Candidate SNPs tagged several characterized genes including nodulation related genes SERK2, MtnodGRP3, MtMMPL1, NFP, CaML3, MtnodGRP3A and flowering time gene MtFD as well as uncharacterized genes that become candidates for further molecular characterization. By comparing sequence-based candidates to candidates identified by in silico 250K SNP arrays, we provide an empirical example of how reliance on even high-density reduced representation genomic markers can bias GWAS results. Depending on the trait, only 30–70% of the top 20 in silico array candidates were within 1 kb of sequence-based candidates. Moreover, the sequence-based candidates tagged by array candidates were heavily biased towards common variants; these comparisons underscore the need for caution when interpreting results from GWAS conducted with sparsely covered genomes.
Affiliation(s)
- John Stanton-Geddes
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Timothy Paape
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Brendan Epstein
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Roman Briskine
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
- Jeremy Yoder
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Joann Mudge
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Arvind K. Bharti
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Andrew D. Farmer
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Peng Zhou
  - Department of Plant Pathology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Roxanne Denny
  - Department of Plant Pathology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Gregory D. May
  - National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Stephanie Erlandson
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Mohammed Yakub
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Masayuki Sugawara
  - Department of Soil, Water, and Climate, and BioTechnology Institute, University of Minnesota, St. Paul, Minnesota, United States of America
- Michael J. Sadowsky
  - Department of Soil, Water, and Climate, and BioTechnology Institute, University of Minnesota, St. Paul, Minnesota, United States of America
- Nevin D. Young
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
  - Department of Plant Pathology, University of Minnesota, Saint Paul, Minnesota, United States of America
- Peter Tiffin
  - Department of Plant Biology, University of Minnesota, Saint Paul, Minnesota, United States of America
12
Sugawara M, Epstein B, Badgley BD, Unno T, Xu L, Reese J, Gyaneshwar P, Denny R, Mudge J, Bharti AK, Farmer AD, May GD, Woodward JE, Médigue C, Vallenet D, Lajus A, Rouy Z, Martinez-Vaz B, Tiffin P, Young ND, Sadowsky MJ. Comparative genomics of the core and accessory genomes of 48 Sinorhizobium strains comprising five genospecies. Genome Biol 2013; 14:R17. [PMID: 23425606 PMCID: PMC4053727 DOI: 10.1186/gb-2013-14-2-r17] [Citation(s) in RCA: 127] [Impact Index Per Article: 11.5] [Received: 10/09/2012] [Accepted: 02/20/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The sinorhizobia are amongst the most well studied members of nitrogen-fixing root nodule bacteria and contribute substantial amounts of fixed nitrogen to the biosphere. While the alfalfa symbiont Sinorhizobium meliloti RM 1021 was one of the first rhizobial strains to be completely sequenced, little information is available about the genomes of this large and diverse species group. RESULTS Here we report the draft assembly and annotation of 48 strains of Sinorhizobium comprising five genospecies. While S. meliloti and S. medicae are taxonomically related, they displayed different nodulation patterns on diverse Medicago host plants, and have differences in gene content, including those involved in conjugation and organic sulfur utilization. Genes involved in Nod factor and polysaccharide biosynthesis, denitrification and type III, IV, and VI secretion systems also vary within and between species. Symbiotic phenotyping and mutational analyses indicated that some type IV secretion genes are symbiosis-related and involved in nitrogen fixation efficiency. Moreover, there is a correlation between the presence of type IV secretion systems, heme biosynthesis and microaerobic denitrification genes, and symbiotic efficiency. CONCLUSIONS Our results suggest that each Sinorhizobium strain uses a slightly different strategy to obtain maximum compatibility with a host plant. This large genome data set provides useful information to better understand the functional features of five Sinorhizobium species, especially compatibility in legume-Sinorhizobium interactions. The diversity of genes present in the accessory genomes of members of this genus indicates that each bacterium has adopted slightly different strategies to interact with diverse plant genera and soil environments.
|
13
|
Yoder JB, Briskine R, Mudge J, Farmer A, Paape T, Steele K, Weiblen GD, Bharti AK, Zhou P, May GD, Young ND, Tiffin P. Phylogenetic signal variation in the genomes of Medicago (Fabaceae). Syst Biol 2013; 62:424-38. [PMID: 23417680 DOI: 10.1093/sysbio/syt009] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 12/26/2022] Open
Abstract
Genome-scale data offer the opportunity to clarify phylogenetic relationships that are difficult to resolve with few loci, but they can also identify genomic regions with evolutionary history distinct from that of the species history. We collected whole-genome sequence data from 29 taxa in the legume genus Medicago, then aligned these sequences to the Medicago truncatula reference genome to confidently identify 87 596 variable homologous sites. We used this data set to estimate phylogenetic relationships among Medicago species, to investigate the number of sites needed to provide robust phylogenetic estimates and to identify specific genomic regions supporting topologies in conflict with the genome-wide phylogeny. Our full genomic data set resolves relationships within the genus that were previously intractable. Subsampling the data reveals considerable variation in phylogenetic signal and power in smaller subsets of the data. Even when sampling 5000 sites, no random sample of the data supports a topology identical to that of the genome-wide phylogeny. Phylogenetic relationships estimated from 500-site sliding windows revealed genome regions supporting several alternative species relationships among recently diverged taxa, consistent with the expected effects of deep coalescence or introgression in the recent history of Medicago.
Affiliation(s)
- Jeremy B Yoder
- Department of Plant Biology, University of Minnesota, Saint Paul MN 55108, USA
|
14
|
Singer SR, Schwarz JA, Manduca CA, Fox SP, Iverson ER, Taylor BJ, Cannon SB, May GD, Maki SL, Farmer AD, Doyle JJ. IBI series winner. Keeping an eye on biology. Science 2013; 339:408-9. [PMID: 23349282 DOI: 10.1126/science.1229848] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/02/2022]
Affiliation(s)
- Susan R Singer
- Department of Biology, Carleton College, Northfield, MN 55057, USA.
|
15
|
Marques JV, Kim KW, Lee C, Costa MA, May GD, Crow JA, Davin LB, Lewis NG. Next generation sequencing in predicting gene function in podophyllotoxin biosynthesis. J Biol Chem 2012; 288:466-79. [PMID: 23161544 DOI: 10.1074/jbc.m112.400689] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 12/13/2022] Open
Abstract
Podophyllum species are sources of (-)-podophyllotoxin, an aryltetralin lignan used for semi-synthesis of various powerful and extensively employed cancer-treating drugs. Its biosynthetic pathway, however, remains largely unknown, with the last unequivocally demonstrated intermediate being (-)-matairesinol. Herein, massively parallel sequencing of Podophyllum hexandrum and Podophyllum peltatum transcriptomes and subsequent bioinformatics analyses of the corresponding assemblies were carried out. Validation of the assembly process was first achieved through confirmation of assembled sequences with those of various genes previously established as involved in podophyllotoxin biosynthesis as well as other candidate biosynthetic pathway genes. This contribution describes characterization of two of the latter, namely the cytochrome P450s, CYP719A23 from P. hexandrum and CYP719A24 from P. peltatum. Both enzymes were capable of converting (-)-matairesinol into (-)-pluviatolide by catalyzing methylenedioxy bridge formation and did not act on other possible substrates tested. Interestingly, the enzymes described herein were highly similar to methylenedioxy bridge-forming enzymes from alkaloid biosynthesis, whereas candidates more similar to lignan biosynthetic enzymes were catalytically inactive with the substrates employed. This overall strategy has thus enabled facile further identification of enzymes putatively involved in (-)-podophyllotoxin biosynthesis and underscores the deductive power of next generation sequencing and bioinformatics to probe and deduce medicinal plant biosynthetic pathways.
Affiliation(s)
- Joaquim V Marques
- Institute of Biological Chemistry, Washington State University, Pullman, Washington 99164-6340, USA
|
16
|
Li X, Acharya A, Farmer AD, Crow JA, Bharti AK, Kramer RS, Wei Y, Han Y, Gou J, May GD, Monteros MJ, Brummer EC. Prevalence of single nucleotide polymorphism among 27 diverse alfalfa genotypes as assessed by transcriptome sequencing. BMC Genomics 2012; 13:568. [PMID: 23107476 PMCID: PMC3533575 DOI: 10.1186/1471-2164-13-568] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 04/17/2012] [Accepted: 10/18/2012] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species. RESULT The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated. 
CONCLUSION We used transcriptome sequencing to discover large numbers of SNPs segregating in elite breeding populations of alfalfa. Little loss of SNP diversity was evident between unimproved and elite alfalfa germplasm. The EST and SNP markers generated from this study are publicly available at the Legume Information System (http://medsa.comparative-legumes.org/) and can contribute to future alfalfa research and breeding applications.
Affiliation(s)
- Xuehui Li
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
- Ananta Acharya
- Institute of Plant Breeding, Genetics & Genomics, University of Georgia, Athens, GA, 30602, USA
- Andrew D Farmer
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- John A Crow
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Arvind K Bharti
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Robin S Kramer
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Yanling Wei
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
- Yuanhong Han
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
- Jiqing Gou
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
- Gregory D May
- National Center for Genome Resources, Santa Fe, NM, 87505, USA
- Maria J Monteros
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
- E Charles Brummer
- The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK, 73401, USA
|
17
|
Saxena RK, Penmetsa RV, Upadhyaya HD, Kumar A, Carrasquilla-Garcia N, Schlueter JA, Farmer A, Whaley AM, Sarma BK, May GD, Cook DR, Varshney RK. Large-scale development of cost-effective single-nucleotide polymorphism marker assays for genetic mapping in pigeonpea and comparative mapping in legumes. DNA Res 2012; 19:449-61. [PMID: 23103470 PMCID: PMC3514856 DOI: 10.1093/dnares/dss025] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 11/14/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs, >2000) were discovered by using RNA-seq and allele-specific sequencing approaches in pigeonpea (Cajanus cajan). For making the SNP genotyping cost-effective, successful competitive allele-specific polymerase chain reaction (KASPar) assays were developed for 1616 SNPs and referred to as PKAMs (pigeonpea KASPar assay markers). Screening of PKAMs on 24 genotypes [23 from cultivated species and 1 wild species (Cajanus scarabaeoides)] defined a set of 1154 polymorphic markers (77.4%) with a polymorphism information content (PIC) value from 0.04 to 0.38. One thousand and ninety-four PKAMs showed polymorphisms between parental lines of the reference mapping population (C. cajan ICP 28 × C. scarabaeoides ICPW 94). By using high-quality marker genotyping data on 167 F2 lines from the population, a comprehensive genetic map comprising 875 PKAMs with an average inter-marker distance of 1.11 cM was developed. Previously mapped 35 simple sequence repeat markers were integrated into the PKAM map and an integrated genetic map of 996.21 cM was constructed. Mapped PKAMs showed a higher degree of synteny with the genome of Glycine max followed by Medicago truncatula and Lotus japonicus and least with Vigna unguiculata. These PKAMs will be useful for genetics research and breeding applications in pigeonpea and for utilizing genome information from other legume species.
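For readers unfamiliar with the PIC values quoted in this abstract, here is a minimal sketch of how a polymorphism information content value can be computed from allele frequencies. It uses the widely cited Botstein-style formula and assumes, without confirmation from the abstract, that a comparable definition was applied; the function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    (assumed formula; the abstract does not state which definition was used).
    """
    freqs = list(allele_freqs)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical biallelic SNPs: one balanced, one with a rare allele.
    print(round(pic([0.5, 0.5]), 3))
    print(round(pic([0.98, 0.02]), 3))
```

Under this formula a biallelic marker reaches its maximum PIC of 0.375 when both alleles are equally frequent, which is consistent with the upper end of the range (0.38) reported for the PKAMs.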
Affiliation(s)
- Rachit K Saxena
- Center of Excellence in Genomics (CEG), International Crops Research Institute for Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
|
18
|
Abstract
The advent of next-generation DNA sequencing (NGS) technologies has led to the development of rapid genome-wide Single Nucleotide Polymorphism (SNP) detection applications in various plant species. Recent improvements in sequencing throughput, combined with an overall decrease in costs per gigabase of sequence, are allowing NGS to be applied to not only the evaluation of small subsets of parental inbred lines, but also the mapping and characterization of traits of interest in much larger populations. Such an approach, where sequences are used simultaneously to detect and score SNPs, therefore bypassing the entire marker assay development stage, is known as genotyping-by-sequencing (GBS). This review will summarize the current state of GBS in plants and the promises it holds as a genome-wide genotyping application.
Affiliation(s)
- Stéphane Deschamps
- DuPont Agricultural Biotechnology, Experimental Station, PO Box 80353, 200 Powder Mill Road, Wilmington, DE 19880-0353, USA.
- Victor Llaca
- DuPont Agricultural Biotechnology, Experimental Station, PO Box 80353, 200 Powder Mill Road, Wilmington, DE 19880-0353, USA.
- Gregory D May
- DuPont Pioneer, 7300 NW 62nd Ave., P.O. Box 1004, Johnston, IA 50131-1004, USA.
|
19
|
Thudi M, Li Y, Jackson SA, May GD, Varshney RK. Current state-of-art of sequencing technologies for plant genomics research. Brief Funct Genomics 2012; 11:3-11. [PMID: 22345601 DOI: 10.1093/bfgp/elr045] [Citation(s) in RCA: 100] [Impact Index Per Article: 8.3] [Indexed: 02/02/2023] Open
Abstract
A number of next-generation sequencing (NGS) technologies such as Roche/454, Illumina and AB SOLiD have recently become available. These technologies are capable of generating hundreds of thousands or tens of millions of short DNA sequence reads at a relatively low cost. These NGS technologies, now referred to as second-generation sequencing (SGS) technologies, are being utilized for de novo sequencing, genome re-sequencing, and whole genome and transcriptome analysis. Now, a new generation of sequencers, based on the 'next-next' or third-generation sequencing (TGS) technologies like the Single-Molecule Real-Time (SMRT™) Sequencer, Heliscope™ Single Molecule Sequencer, and the Ion Personal Genome Machine™, is becoming available; these are capable of generating longer sequence reads in a shorter time and at even lower costs per instrument run. Ever declining sequencing costs and increased data output and sample throughput for NGS and TGS sequencing technologies enable the plant genomics and breeding community to undertake genotyping-by-sequencing (GBS). Data analysis, storage and management of large-scale second or TGS projects, however, are essential. This article provides an overview of different sequencing technologies with an emphasis on forthcoming TGS technologies and bioinformatics tools required for the latest evolution of DNA sequencing platforms.
Affiliation(s)
- Mahendar Thudi
- Centre of Excellence in Genomics, ICRISAT, Patancheru 502 324, Greater Hyderabad, India
|
20
|
Kudapa H, Bharti AK, Cannon SB, Farmer AD, Mulaosmanovic B, Kramer R, Bohra A, Weeks NT, Crow JA, Tuteja R, Shah T, Dutta S, Gupta DK, Singh A, Gaikwad K, Sharma TR, May GD, Singh NK, Varshney RK. A comprehensive transcriptome assembly of Pigeonpea (Cajanus cajan L.) using Sanger and second-generation sequencing platforms. Mol Plant 2012; 5:1020-8. [PMID: 22241453 PMCID: PMC3440007 DOI: 10.1093/mp/ssr111] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 09/27/2011] [Accepted: 11/29/2011] [Indexed: 05/18/2023]
Abstract
A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea.
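The N50 statistic quoted for CcTA v2 can be illustrated with a short sketch. This follows the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length); the function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop once the longest contigs cover at least half the assembly.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy example with made-up lengths, not the CcTA v2 contigs.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

The integer comparison `2 * running >= total` avoids floating-point rounding when the half-way point falls exactly on a contig boundary.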
Affiliation(s)
- Himabindu Kudapa
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
- Arvind K. Bharti
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Steven B. Cannon
- United States Department of Agriculture–Agricultural Research Service (USDA–ARS), Corn Insects and Crop Genetics Research Unit, Ames, IA, USA
- Department of Agronomy, Iowa State University, Ames, IA, USA
- Andrew D. Farmer
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Benjamin Mulaosmanovic
- United States Department of Agriculture–Agricultural Research Service (USDA–ARS), Corn Insects and Crop Genetics Research Unit, Ames, IA, USA
- Robin Kramer
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Abhishek Bohra
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
- Nathan T. Weeks
- United States Department of Agriculture–Agricultural Research Service (USDA–ARS), Corn Insects and Crop Genetics Research Unit, Ames, IA, USA
- John A. Crow
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Reetu Tuteja
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
- Trushar Shah
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
- Sutapa Dutta
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Deepak K. Gupta
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Archana Singh
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Kishor Gaikwad
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Tilak R. Sharma
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Gregory D. May
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Nagendra K. Singh
- National Research Centre on Plant Biotechnology (NRCPB), Indian Agricultural Research Institute, New Delhi 110 012, India
- Rajeev K. Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
- CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, 06600 Mexico DF, Mexico
- To whom correspondence should be addressed at address. E-mail , tel. +91 4030713305, fax +91 40 30713074
|
21
|
Ryvkin A, Ashkenazy H, Smelyanski L, Kaplan G, Penn O, Weiss-Ottolenghi Y, Privman E, Ngam PB, Woodward JE, May GD, Bell C, Pupko T, Gershoni JM. Deep Panning: steps towards probing the IgOme. PLoS One 2012; 7:e41469. [PMID: 22870226 PMCID: PMC3409857 DOI: 10.1371/journal.pone.0041469] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 04/26/2012] [Accepted: 06/21/2012] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Polyclonal serum consists of vast collections of antibodies, products of differentiated B-cells. The spectrum of antibody specificities is dynamic and varies with age, physiology, and exposure to pathological insults. The complete repertoire of antibody specificities in blood, the IgOme, is therefore an extraordinarily rich source of information: a molecular record of previous encounters as well as a status report of current immune activity. The ability to profile antibody specificities of polyclonal serum at exceptionally high resolution has been an important and serious challenge which can now be overcome. METHODOLOGY/PRINCIPAL FINDINGS Here we illustrate the application of Deep Panning, a method that combines the flexibility of combinatorial phage display of random peptides with the power of high-throughput deep sequencing. Deep Panning is first applied to evaluate the quality and diversity of naïve random peptide libraries. The production of very large data sets, hundreds of thousands of peptides, has revealed unexpected properties of combinatorial random peptide libraries and indicates correctives to ensure the quality of the libraries generated. Next, Deep Panning is used to analyze a model monoclonal antibody in addition to allowing one to follow the dynamics of biopanning and peptide selection. Finally, Deep Panning is applied to profile polyclonal sera derived from HIV infected individuals. CONCLUSIONS/SIGNIFICANCE The ability to generate and characterize hundreds of thousands of affinity-selected peptides creates an effective means towards the interrogation of the IgOme and understanding of the humoral response to disease. Deep Panning should open the door to new possibilities for serological diagnostics, vaccine design and the discovery of the correlates of immunity to emerging infectious agents.
Affiliation(s)
- Arie Ryvkin
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Haim Ashkenazy
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Larisa Smelyanski
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Gilad Kaplan
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Osnat Penn
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Eyal Privman
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Peter B. Ngam
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- James E. Woodward
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Gregory D. May
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Callum Bell
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Tal Pupko
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- Jonathan M. Gershoni
- Department of Cell Research and Immunology, Tel Aviv University, Tel Aviv, Israel
- * E-mail:
|
22
|
Peiffer GA, King KE, Severin AJ, May GD, Cianzio SR, Lin SF, Lauter NC, Shoemaker RC. Identification of candidate genes underlying an iron efficiency quantitative trait locus in soybean. Plant Physiol 2012; 158:1745-54. [PMID: 22319075 PMCID: PMC3320182 DOI: 10.1104/pp.111.189860] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 10/24/2011] [Accepted: 01/29/2012] [Indexed: 05/19/2023]
Abstract
Prevalent on calcareous soils in the United States and abroad, iron deficiency is among the most common and severe nutritional stresses in plants. In soybean (Glycine max) commercial plantings, the identification and use of iron-efficient genotypes has proven to be the best form of managing this soil-related plant stress. Previous studies conducted in soybean identified a significant iron efficiency quantitative trait locus (QTL) explaining more than 70% of the phenotypic variation for the trait. In this research, we identified candidate genes underlying this QTL through molecular breeding, mapping, and transcriptome sequencing. Introgression mapping was performed using two related near-isogenic lines in which a region located on soybean chromosome 3 required for iron efficiency was identified. The region corresponds to the previously reported iron efficiency QTL. The location was further confirmed through QTL mapping conducted in this study. Transcriptome sequencing and quantitative real-time-polymerase chain reaction identified two genes encoding transcription factors within the region that were significantly induced in soybean roots under iron stress. The two induced transcription factors were identified as homologs of the subgroup Ib basic helix-loop-helix (bHLH) genes that are known to regulate the strategy I response in Arabidopsis (Arabidopsis thaliana). Resequencing of these differentially expressed genes unveiled a significant deletion within a predicted dimerization domain. We hypothesize that this deletion disrupts the Fe-DEFICIENCY-INDUCED TRANSCRIPTION FACTOR (FIT)/bHLH heterodimer that has been shown to induce known iron acquisition genes.
Affiliation(s)
- Randy C. Shoemaker
- Department of Agronomy, Iowa State University, Ames, Iowa 50010 (G.A.P., K.E.K., A.J.S., S.R.C.); National Center for Genome Research, Santa Fe, New Mexico 87505 (G.D.M.); Department of Agronomy, National Taiwan University, Taipei, Taiwan, Republic of China (S.F.L.); and Corn Insects and Crop Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Ames, Iowa 50010 (N.C.L., R.C.S.)
|
23
|
Ilut DC, Coate JE, Luciano AK, Owens TG, May GD, Farmer A, Doyle JJ. A comparative transcriptomic study of an allotetraploid and its diploid progenitors illustrates the unique advantages and challenges of RNA-seq in plant species. Am J Bot 2012; 99:383-96. [PMID: 22301896 DOI: 10.3732/ajb.1100312] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Indexed: 05/31/2023]
Abstract
PREMISE OF THE STUDY RNA-seq analysis of plant transcriptomes poses unique challenges due to the highly duplicated nature of plant genomes. We address these challenges in the context of recently formed polyploid species and detail an RNA-seq experiment comparing the leaf transcriptome profile of an allopolyploid relative of soybean with the diploid species that contributed its homoeologous genomes. METHODS RNA-seq reads were obtained from the three species and were aligned against the genome sequence of Glycine max. Transcript levels were estimated for each gene, relative contributions of polyploidy-duplicated loci (homoeologues) in the tetraploid were identified, and comparisons of transcript profiles and individual genes were used to analyze the regulation of transcript levels. KEY RESULTS We present a novel metric developed to address issues arising from high degrees of gene space duplication and a method for dissecting a gene's measured transcript level in a polyploid species into the relative contribution of its homoeologues. We identify the gene family likely contributing to differences in photosynthetic rate between the allotetraploid and its progenitors and show that the tetraploid appears to be using the "redundant" gene copies in novel ways. CONCLUSIONS Given the prevalence of polyploidy events in plants, we believe many of the approaches developed here to be applicable, and often necessary, in most plant RNA-seq experiments. The deep sampling provided by RNA-seq allows us to dissect the genetic underpinnings of specific phenotypes as well as examine complex interactions within polyploid genomes.
Affiliation(s)
- Daniel C Ilut
- Department of Plant Biology, Cornell University, Ithaca, New York 14853, USA.
|
24
|
Azam S, Thakur V, Ruperao P, Shah T, Balaji J, Amindala B, Farmer AD, Studholme DJ, May GD, Edwards D, Jones JDG, Varshney RK. Coverage-based consensus calling (CbCC) of short sequence reads and comparison of CbCC results to identify SNPs in chickpea (Cicer arietinum; Fabaceae), a crop species without a reference genome. Am J Bot 2012; 99:186-192. [PMID: 22301893 DOI: 10.3732/ajb.1100419] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 05/31/2023]
Abstract
PREMISE OF THE STUDY Next-generation sequencing (NGS) technologies are frequently used for resequencing and mining of single nucleotide polymorphisms (SNPs) by comparison to a reference genome. In crop species such as chickpea (Cicer arietinum) that lack a reference genome sequence, NGS-based SNP discovery is a challenge. Therefore, unlike probability-based statistical approaches for consensus calling and by comparison with a reference sequence, a coverage-based consensus calling (CbCC) approach was applied and two genotypes were compared for SNP identification. METHODS A CbCC approach is used in this study with four commonly used short read alignment tools (Maq, Bowtie, Novoalign, and SOAP2) and 15.7 and 22.1 million Illumina reads for chickpea genotypes ICC4958 and ICC1882, together with the chickpea transcriptome assembly (CaTA). KEY RESULTS A nonredundant set of 4543 SNPs was identified between two chickpea genotypes. Experimental validation of 224 randomly selected SNPs showed superiority of Maq among individual tools, as 50.0% of SNPs predicted by Maq were true SNPs. For combinations of two tools, greatest accuracy (55.7%) was reported for Maq and Bowtie, with a combination of Bowtie, Maq, and Novoalign identifying 61.5% true SNPs. SNP prediction accuracy generally increased with increasing read depth. CONCLUSIONS This study provides a benchmark comparison of tools as well as read depths for four commonly used tools for NGS SNP discovery in a crop species without a reference genome sequence. In addition, a large number of SNPs have been identified in chickpea that would be useful for molecular breeding.
Affiliation(s)
- Sarwar Azam
- Centre of Excellence in Genomics, International Crops Research Institute for the Semi-Arid Tropics, Patancheru 502324, Andhra Pradesh, India
|
25
|
Young ND, Debellé F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KAT, Tang H, Rombauts S, Zhao PX, Zhou P, Barbe V, Bardou P, Bechner M, Bellec A, Berger A, Bergès H, Bidwell S, Bisseling T, Choisne N, Couloux A, Denny R, Deshpande S, Dai X, Doyle JJ, Dudez AM, Farmer AD, Fouteau S, Franken C, Gibelin C, Gish J, Goldstein S, González AJ, Green PJ, Hallab A, Hartog M, Hua A, Humphray SJ, Jeong DH, Jing Y, Jöcker A, Kenton SM, Kim DJ, Klee K, Lai H, Lang C, Lin S, Macmil SL, Magdelenat G, Matthews L, McCorrison J, Monaghan EL, Mun JH, Najar FZ, Nicholson C, Noirot C, O'Bleness M, Paule CR, Poulain J, Prion F, Qin B, Qu C, Retzel EF, Riddle C, Sallet E, Samain S, Samson N, Sanders I, Saurat O, Scarpelli C, Schiex T, Segurens B, Severin AJ, Sherrier DJ, Shi R, Sims S, Singer SR, Sinharoy S, Sterck L, Viollet A, Wang BB, Wang K, Wang M, Wang X, Warfsmann J, Weissenbach J, White DD, White JD, Wiley GB, Wincker P, Xing Y, Yang L, Yao Z, Ying F, Zhai J, Zhou L, Zuber A, Dénarié J, Dixon RA, May GD, Schwartz DC, Rogers J, Quétier F, Town CD, Roe BA. The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 2011; 480:520-4. [PMID: 22089132 PMCID: PMC3272368 DOI: 10.1038/nature10625] [Citation(s) in RCA: 762] [Impact Index Per Article: 58.6] [Received: 06/13/2011] [Accepted: 10/13/2011] [Indexed: 11/09/2022]
Abstract
Legumes (Fabaceae or Leguminosae) are unique among cultivated plants for their ability to carry out endosymbiotic nitrogen fixation with rhizobial bacteria, a process that takes place in a specialized structure known as the nodule. Legumes belong to one of the two main groups of eurosids, the Fabidae, which includes most species capable of endosymbiotic nitrogen fixation. Legumes comprise several evolutionary lineages derived from a common ancestor 60 million years ago (Myr ago). Papilionoids are the largest clade, dating nearly to the origin of legumes and containing most cultivated species. Medicago truncatula is a long-established model for the study of legume biology. Here we describe the draft sequence of the M. truncatula euchromatin based on a recently completed BAC assembly supplemented with Illumina shotgun sequence, together capturing ∼94% of all M. truncatula genes. A whole-genome duplication (WGD) approximately 58 Myr ago had a major role in shaping the M. truncatula genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation. Subsequent to the WGD, the M. truncatula genome experienced higher levels of rearrangement than two other sequenced legumes, Glycine max and Lotus japonicus. M. truncatula is a close relative of alfalfa (Medicago sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics. As such, the M. truncatula genome sequence provides significant opportunities to expand alfalfa's genomic toolbox.
Affiliation(s)
- Nevin D Young
- Department of Plant Pathology, University of Minnesota, St Paul, Minnesota 55108, USA.
26
Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MTA, Azam S, Fan G, Whaley AM, Farmer AD, Sheridan J, Iwata A, Tuteja R, Penmetsa RV, Wu W, Upadhyaya HD, Yang SP, Shah T, Saxena KB, Michael T, McCombie WR, Yang B, Zhang G, Yang H, Wang J, Spillane C, Cook DR, May GD, Xu X, Jackson SA. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nat Biotechnol 2011; 30:83-9. [PMID: 22057054 DOI: 10.1038/nbt.2022] [Citation(s) in RCA: 421] [Impact Index Per Article: 32.4] [Received: 07/19/2011] [Accepted: 10/03/2011] [Indexed: 11/08/2022]
Abstract
Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences and a genetic map, we assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also showed the potential role that certain gene families, for example, drought tolerance-related genes, have played throughout the domestication of pigeonpea and the evolution of its ancestors. Although we found a few segmental duplication events, we did not observe the recent genome-wide duplication events observed in soybean. This reference genome sequence will facilitate the identification of the genetic basis of agronomically important traits, and accelerate the development of improved pigeonpea varieties that could improve food security in many developing countries.
Affiliation(s)
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India.
27
Hiremath PJ, Farmer A, Cannon SB, Woodward J, Kudapa H, Tuteja R, Kumar A, BhanuPrakash A, Mulaosmanovic B, Gujaria N, Krishnamurthy L, Gaur PM, KaviKishor PB, Shah T, Srinivasan R, Lohse M, Xiao Y, Town CD, Cook DR, May GD, Varshney RK. Large-scale transcriptome analysis in chickpea (Cicer arietinum L.), an orphan legume crop of the semi-arid tropics of Asia and Africa. Plant Biotechnol J 2011; 9:922-31. [PMID: 21615673 PMCID: PMC3437486 DOI: 10.1111/j.1467-7652.2011.00625.x] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Indexed: 05/02/2023]
Abstract
Chickpea (Cicer arietinum L.) is an important legume crop in the semi-arid regions of Asia and Africa. Gains in crop productivity have been low, however, particularly because of biotic and abiotic stresses. To help enhance crop productivity using molecular breeding techniques, next generation sequencing technologies such as Roche/454 and Illumina/Solexa were used to determine the sequence of most gene transcripts and to identify drought-responsive genes and gene-based molecular markers. A total of 103,215 tentative unique sequences (TUSs) have been produced from 435,018 Roche/454 reads and 21,491 Sanger expressed sequence tags (ESTs). Putative functions were determined for 49,437 (47.8%) of the TUSs, and gene ontology assignments were determined for 20,634 (41.7%) of the TUSs. Comparison of the chickpea TUSs with the Medicago truncatula genome assembly (Mt 3.5.1 build) resulted in 42,141 aligned TUSs with putative gene structures (including 39,281 predicted intron/splice junctions). Alignment of ∼37 million Illumina/Solexa tags generated from drought-challenged root tissues of two chickpea genotypes against the TUSs identified 44,639 differentially expressed TUSs. The TUSs were also used to identify a diverse set of markers, including 728 simple sequence repeats (SSRs), 495 single nucleotide polymorphisms (SNPs), 387 conserved orthologous sequence (COS) markers, and 2088 intron-spanning region (ISR) markers. This resource will be useful for basic and applied research for genome analysis and crop improvement in chickpea.
Affiliation(s)
- Pavana J Hiremath
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Osmania University (OU), Hyderabad, India
- Andrew Farmer
- National Centre for Genome Resources (NCGR), Santa Fe, NM, USA
- Steven B Cannon
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit (USDA-ARS-CICGRU), Ames, IA, USA
- Department of Agronomy, Iowa State University, Ames, IA, USA
- Jimmy Woodward
- National Centre for Genome Resources (NCGR), Santa Fe, NM, USA
- Himabindu Kudapa
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Reetu Tuteja
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Ashish Kumar
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Amindala BhanuPrakash
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Neha Gujaria
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Laxmanan Krishnamurthy
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Pooran M Gaur
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Trushar Shah
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Ramamurthy Srinivasan
- National Research Centre on Plant Biotechnology (NRCPB), IARI Campus, New Delhi, India
- Marc Lohse
- Max Planck Institute for Molecular Plant Physiology (MPIMPP), Am Muehlenberg, Potsdam-Golm, Germany
- Yongli Xiao
- J. Craig Venter Institute (JCVI), Rockville, MD, USA
- Gregory D May
- National Centre for Genome Resources (NCGR), Santa Fe, NM, USA
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
- Generation Challenge Program (GCP), c/o CIMMYT, Mexico DF, Mexico
- *Correspondence (Tel +91 40 30713305; fax +91 40 30713074/3075; email )
28
Dubey A, Farmer A, Schlueter J, Cannon SB, Abernathy B, Tuteja R, Woodward J, Shah T, Mulasmanovic B, Kudapa H, Raju NL, Gothalwal R, Pande S, Xiao Y, Town CD, Singh NK, May GD, Jackson S, Varshney RK. Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.). DNA Res 2011; 18:153-64. [PMID: 21565938 PMCID: PMC3111231 DOI: 10.1093/dnares/dsr007] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Indexed: 11/12/2022] Open
Abstract
This study reports the generation of large-scale genomic resources for pigeonpea, a so-called 'orphan crop species' of the semi-arid tropic regions. FLX/454 sequencing carried out on a normalized cDNA pool prepared from 31 tissues produced 494 353 short transcript reads (STRs). Cluster analysis of these STRs, together with 10 817 Sanger ESTs, resulted in a pigeonpea transcriptome assembly (CcTA) comprising 127 754 tentative unique sequences (TUSs). Functional analysis of these TUSs highlights several active pathways and processes in the sampled tissues. Comparison of the CcTA with the soybean genome showed similarity to 10 857 and 16 367 soybean gene models (depending on alignment methods). Additionally, Illumina 1G sequencing was performed on Fusarium wilt (FW)- and sterility mosaic disease (SMD)-challenged root tissues of 10 resistant and susceptible genotypes. More than 160 million sequence tags were used to identify FW- and SMD-responsive genes. Sequence analysis of CcTA and the Illumina tags identified a large new set of markers for use in genetics and breeding, including 8137 simple sequence repeats, 12 141 single-nucleotide polymorphisms and 5845 intron-spanning regions. Genomic resources developed in this study should be useful for basic and applied research, not only for pigeonpea improvement but also for other related, agronomically important legumes.
Affiliation(s)
- Anuja Dubey
- Centre of Excellence in Genomics (CEG), Building #300, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502 324, Greater Hyderabad, India
29
Bohra A, Dubey A, Saxena RK, Penmetsa RV, Poornima KN, Kumar N, Farmer AD, Srivani G, Upadhyaya HD, Gothalwal R, Ramesh S, Singh D, Saxena K, Kishor PBK, Singh NK, Town CD, May GD, Cook DR, Varshney RK. Analysis of BAC-end sequences (BESs) and development of BES-SSR markers for genetic mapping and hybrid purity assessment in pigeonpea (Cajanus spp.). BMC Plant Biol 2011; 11:56. [PMID: 21447154 PMCID: PMC3079640 DOI: 10.1186/1471-2229-11-56] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Received: 12/05/2010] [Accepted: 03/29/2011] [Indexed: 05/20/2023]
Abstract
BACKGROUND Pigeonpea [Cajanus cajan (L.) Millsp.] is an important legume crop of rainfed agriculture. Despite concerted research efforts directed at pigeonpea improvement, the stagnant productivity of pigeonpea over the last several decades may be attributed to the prevalence of various biotic and abiotic constraints, and the situation is exacerbated by the inadequate genomic resources available for any molecular breeding programme aimed at accelerated crop improvement. With the objective of enhancing genomic resources for pigeonpea, this study reports, for the first time, large-scale development of SSR markers from BAC-end sequences and their subsequent use for genetic mapping and hybridity testing in pigeonpea. RESULTS A set of 88,860 BAC (bacterial artificial chromosome)-end sequences (BESs) was generated after constructing two BAC libraries using HindIII (34,560 clones) and BamHI (34,560 clones) restriction enzymes. Clustering based on sequence identity of BESs yielded a set of >52K non-redundant sequences, comprising 35 Mbp or >4% of the pigeonpea genome. These sequences were analyzed to develop annotation lists and subdivide the BESs into genome fractions (e.g., genes, retroelements, transposons and non-annotated sequences). Parallel analysis of BESs for microsatellites or simple sequence repeats (SSRs) identified 18,149 SSRs, from which a set of 6,212 SSRs was selected for further analysis. A total of 3,072 novel SSR primer pairs were synthesized and tested for length polymorphism on a set of 22 parental genotypes of 13 mapping populations segregating for traits of interest. In total, we identified 842 polymorphic SSR markers that will have utility in pigeonpea improvement. Based on these markers, the first SSR-based genetic map comprising 239 loci was developed for this previously uncharacterized genome. The utility of the developed SSR markers was also demonstrated by identifying a set of 42 markers each for two hybrids (ICPH 2671 and ICPH 2438) for genetic purity assessment in a commercial hybrid breeding programme. CONCLUSION In summary, while BAC libraries and BESs should be useful for genomics studies, the BES-SSR markers and the genetic map should be very useful for linking the genetic map with a future physical map as well as for molecular breeding in pigeonpea.
Affiliation(s)
- Abhishek Bohra
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Department of Genetics, Osmania University, Hyderabad 500007, India
- Anuja Dubey
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Department of Biotechnology and Bioinformatics Centre, Barkatullah University, Bhopal 462026, India
- Rachit K Saxena
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Department of Genetics, Osmania University, Hyderabad 500007, India
- R Varma Penmetsa
- Department of Plant Pathology, University of California, Davis, CA 95616, USA
- KN Poornima
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Department of Biotechnology, University of Agricultural Sciences (UAS), Bangalore 560065, India
- Naresh Kumar
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Department of Plant Breeding and Genetics, CCS Haryana Agricultural University (CCSHAU), Hisar 125004, India
- Andrew D Farmer
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Gudipati Srivani
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Hari D Upadhyaya
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Ragini Gothalwal
- Department of Biotechnology and Bioinformatics Centre, Barkatullah University, Bhopal 462026, India
- S Ramesh
- Department of Biotechnology, University of Agricultural Sciences (UAS), Bangalore 560065, India
- Dhiraj Singh
- Department of Plant Breeding and Genetics, CCS Haryana Agricultural University (CCSHAU), Hisar 125004, India
- Kulbhushan Saxena
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- PB Kavi Kishor
- Department of Genetics, Osmania University, Hyderabad 500007, India
- Nagendra K Singh
- National Research Center on Plant Biotechnology (NRCPB), New Delhi 110012, India
- Gregory D May
- National Center for Genome Resources (NCGR), Santa Fe, NM 87505, USA
- Douglas R Cook
- Department of Plant Pathology, University of California, Davis, CA 95616, USA
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Hyderabad, 502324, India
- Generation Challenge Programme (GCP), c/o CIMMYT, 06600 Mexico DF, Mexico
30
Woody JL, Severin AJ, Bolon YT, Joseph B, Diers BW, Farmer AD, Weeks N, Muehlbauer GJ, Nelson RT, Grant D, Specht JE, Graham MA, Cannon SB, May GD, Vance CP, Shoemaker RC. Gene expression patterns are correlated with genomic and genic structure in soybean. Genome 2011; 54:10-8. [PMID: 21217801 DOI: 10.1139/g10-090] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
Studies have indicated that exon and intron size and intergenic distance are correlated with gene expression levels and expression breadth. Previous reports on these correlations in plants and animals have been conflicting. In this study, next-generation sequence data, which have been shown to be more sensitive than previous expression profiling technologies, were generated and analyzed from 14 tissues. Our results revealed a novel dichotomy. At the low expression level, an increase in expression breadth correlated with an increase in transcript size because of an increase in the number of exons and introns. No significant changes in intron or exon sizes were noted. Conversely, genes expressed at the intermediate to high expression levels displayed a decrease in transcript size as their expression breadth increased. This was due to smaller exons, with no significant change in the number of exons. Taking advantage of the known gene space of soybean, we evaluated the positioning of genes and found significant clustering of similarly expressed genes. Identifying the correlations between the physical parameters of individual genes could lead to uncovering the role of regulation owing to nucleotide composition, which might have potential impacts in discerning the role of the noncoding regions.
Affiliation(s)
- Jenna L Woody
- Department of Agronomy, Iowa State University, Ames, 50011, USA.
31
Chen C, Farmer AD, Langley RJ, Mudge J, Crow JA, May GD, Huntley J, Smith AG, Retzel EF. Meiosis-specific gene discovery in plants: RNA-Seq applied to isolated Arabidopsis male meiocytes. BMC Plant Biol 2010; 10:280. [PMID: 21167045 PMCID: PMC3018465 DOI: 10.1186/1471-2229-10-280] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Received: 07/01/2010] [Accepted: 12/17/2010] [Indexed: 05/18/2023]
Abstract
BACKGROUND Meiosis is a critical process in the reproduction and life cycle of flowering plants in which homologous chromosomes pair, synapse, recombine and segregate. Understanding meiosis will not only advance our knowledge of the mechanisms of genetic recombination, but also has substantial applications in crop improvement. Despite the tremendous progress in the past decade in other model organisms (e.g., Saccharomyces cerevisiae and Drosophila melanogaster), the global identification of meiotic genes in flowering plants has remained a challenge due to the lack of efficient methods to collect pure meiocytes for analyzing the temporal and spatial gene expression patterns during meiosis, and for the sensitive identification and quantitation of novel genes. RESULTS A high-throughput approach to identify meiosis-specific genes by combining isolated meiocytes, RNA-Seq, bioinformatic and statistical analysis pipelines was developed. By analyzing the studied genes that have a meiosis function, a pipeline for identifying meiosis-specific genes has been defined. More than 1,000 genes that are specifically or preferentially expressed in meiocytes have been identified as candidate meiosis-specific genes. A group of 55 genes that have mitochondrial genome origins and a significant number of transposable element (TE) genes (1,036) were also found to have up-regulated expression levels in meiocytes. CONCLUSION These findings advance our understanding of meiotic genes, gene expression and regulation, especially the transcript profiles of MGI genes and TE genes, and provide a framework for functional analysis of genes in meiosis.
Affiliation(s)
- Changbin Chen
- Department of Horticultural Science, University of Minnesota, 1970 Folwell Avenue, St. Paul, MN 55108, USA
- Andrew D Farmer
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- Raymond J Langley
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- Immunology, Lovelace Respiratory Research Institute, 2425 Ridgecrest Drive SE, Albuquerque, NM 87108, USA
- Joann Mudge
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- John A Crow
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- Gregory D May
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- James Huntley
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
- Illumina Inc., Hayward, California 94545, USA
- Alan G Smith
- Department of Horticultural Science, University of Minnesota, 1970 Folwell Avenue, St. Paul, MN 55108, USA
- Ernest F Retzel
- National Center for Genome Resources, 2935 Rodeo Park Drive E., Santa Fe, NM 87505, USA
32
Severin AJ, Peiffer GA, Xu WW, Hyten DL, Bucciarelli B, O’Rourke JA, Bolon YT, Grant D, Farmer AD, May GD, Vance CP, Shoemaker RC, Stupar RM. An integrative approach to genomic introgression mapping. Plant Physiol 2010; 154:3-12. [PMID: 20656899 PMCID: PMC2938162 DOI: 10.1104/pp.110.158949] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 05/12/2010] [Accepted: 07/21/2010] [Indexed: 05/20/2023]
Abstract
Near-isogenic lines (NILs) are valuable genetic resources for many crop species, including soybean (Glycine max). The development of new molecular platforms promises to accelerate the mapping of genetic introgressions in these materials. Here, we compare some existing and emerging methodologies for genetic introgression mapping: single-feature polymorphism analysis, Illumina GoldenGate single nucleotide polymorphism (SNP) genotyping, and de novo SNP discovery via RNA-Seq analysis of next-generation sequence data. We used these methods to map the introgressed regions in an iron-inefficient soybean NIL and found that the three mapping approaches are complementary when utilized in combination. The comparative RNA-Seq approach offers several additional advantages, including the greatest mapping resolution, marker depth, and de novo marker utility for downstream fine-mapping analysis. We applied the comparative RNA-Seq method to map genetic introgressions in an additional pair of NILs exhibiting differential seed protein content. Furthermore, we attempted to optimize the comparative RNA-Seq approach by assessing the impact of sequence depth, SNP identification methodology, and post hoc analyses on SNP discovery rates. We conclude that the comparative RNA-Seq approach can be optimized with sufficient sampling and by utilizing a post hoc correction accounting for gene density variation that controls for false discoveries.
Affiliation(s)
- Robert M. Stupar
- Department of Agronomy, Iowa State University, Ames, Iowa 50011 (A.J.S., G.A.P., R.C.S.); Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455 (W.W.X.); Soybean Genomics and Improvement Laboratory, United States Department of Agriculture-Agricultural Research Service, Beltsville, Maryland 20705 (D.L.H.); United States Department of Agriculture-Agricultural Research Service, Plant Research Unit, St. Paul, Minnesota 55108 (B.B., J.A.O., Y.-T.B., C.P.V.); United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, Iowa 50011 (D.G., R.C.S.); National Center for Genome Resources, Santa Fe, New Mexico 87505 (A.D.F., G.D.M.); Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108 (C.P.V., R.M.S.)
33
Severin AJ, Woody JL, Bolon YT, Joseph B, Diers BW, Farmer AD, Muehlbauer GJ, Nelson RT, Grant D, Specht JE, Graham MA, Cannon SB, May GD, Vance CP, Shoemaker RC. RNA-Seq Atlas of Glycine max: a guide to the soybean transcriptome. BMC Plant Biol 2010; 10:160. [PMID: 20687943 PMCID: PMC3017786 DOI: 10.1186/1471-2229-10-160] [Citation(s) in RCA: 438] [Impact Index Per Article: 31.3] [Received: 02/25/2010] [Accepted: 08/05/2010] [Indexed: 05/18/2023]
Abstract
BACKGROUND Next generation sequencing is transforming our understanding of transcriptomes. It can determine the expression level of transcripts with a dynamic range of over six orders of magnitude from multiple tissues, developmental stages or conditions. Patterns of gene expression provide insight into functions of genes with unknown annotation. RESULTS The RNA-Seq Atlas presented here provides a record of high-resolution gene expression in a set of fourteen diverse tissues. Hierarchical clustering of transcriptional profiles for these tissues suggests three clades with similar profiles: aerial, underground and seed tissues. We also investigate the relationship between gene structure and gene expression and find a correlation between gene length and expression. Additionally, we find dramatic tissue-specific gene expression of both the most highly-expressed genes and the genes specific to legumes in seed development and nodule tissues. Analysis of the gene expression profiles of over 2,000 genes with preferential gene expression in seed suggests there are more than 177 genes with functional roles that are involved in the economically important seed filling process. Finally, the RNA-Seq Atlas also provides a means of evaluating existing gene model annotations for the Glycine max genome. CONCLUSIONS This RNA-Seq Atlas extends the analyses of previous gene expression atlases performed using Affymetrix GeneChip technology and provides an example of new methods to accommodate the increase in transcriptome data obtained from next generation sequencing. Data contained within this RNA-Seq Atlas of Glycine max can be explored at http://www.soybase.org/soyseq.
Affiliation(s)
- Andrew J Severin
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Jenna L Woody
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Yung-Tsi Bolon
- United States Department of Agriculture-Agricultural Research Service, Plant Research Unit, St. Paul, MN 55108, USA
- Bindu Joseph
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Brian W Diers
- Department of Crop Sciences, University of Illinois, 1101 West Peabody Dr., Urbana, IL 61801, USA
- Andrew D Farmer
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Gary J Muehlbauer
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Rex T Nelson
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Resources Unit, Ames, IA 50011, USA
- David Grant
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Resources Unit, Ames, IA 50011, USA
- James E Specht
- Department of Agronomy, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Michelle A Graham
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Resources Unit, Ames, IA 50011, USA
- Steven B Cannon
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Resources Unit, Ames, IA 50011, USA
- Gregory D May
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Carroll P Vance
- United States Department of Agriculture-Agricultural Research Service, Plant Research Unit, St. Paul, MN 55108, USA
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Randy C Shoemaker
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Resources Unit, Ames, IA 50011, USA
34
Cannon SB, Ilut D, Farmer AD, Maki SL, May GD, Singer SR, Doyle JJ. Polyploidy did not predate the evolution of nodulation in all legumes. PLoS One 2010; 5:e11630. [PMID: 20661290 PMCID: PMC2905438 DOI: 10.1371/journal.pone.0011630] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Received: 03/13/2010] [Accepted: 06/18/2010] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Several lines of evidence indicate that polyploidy occurred by around 54 million years ago, early in the history of legume evolution, but it has not been known whether this event was confined to the papilionoid subfamily (Papilionoideae; e.g. beans, medics, lupins) or occurred earlier. Determining the timing of the polyploidy event is important for understanding whether polyploidy might have contributed to rapid diversification and radiation of the legumes near the origin of the family; and whether polyploidy might have provided genetic material that enabled the evolution of a novel organ, the nitrogen-fixing nodule. Although symbioses with nitrogen-fixing partners have evolved in several lineages in the rosid I clade, nodules are widespread only in legume taxa, being nearly universal in the papilionoids and in the mimosoid subfamily (e.g., mimosas, acacias), which diverged from the papilionoid legumes around 58 million years ago, soon after the origin of the legumes. METHODOLOGY/PRINCIPAL FINDINGS Using transcriptome sequence data from Chamaecrista fasciculata, a nodulating member of the mimosoid clade, we tested whether this species underwent polyploidy within the timeframe of legume diversification. Analysis of gene family branching orders and synonymous-site divergence data from C. fasciculata, Glycine max (soybean), Medicago truncatula, and Vitis vinifera (grape; an outgroup to the rosid taxa) establishes that the polyploidy event known from soybean and Medicago occurred after the separation of the mimosoid and papilionoid clades, and at or shortly before the Papilionoideae radiation. CONCLUSIONS The ancestral legume genome was not fundamentally polyploid. Moreover, because there has not been an independent instance of polyploidy in the Chamaecrista lineage, there is no necessary connection between polyploidy and nodulation in legumes. Chamaecrista may serve as a useful model in the legumes that lacks a paleopolyploid history, at least relative to the widely studied papilionoid models.
Collapse
Affiliation(s)
- Steven B Cannon
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genomics Research Unit, Iowa State University, Ames, Iowa, United States of America.
35
Nelson RT, Avraham S, Shoemaker RC, May GD, Ware D, Gessler DDG. Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration. BioData Min 2010; 3:3. [PMID: 20525377 PMCID: PMC2894815 DOI: 10.1186/1756-0381-3-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/30/2009] [Accepted: 06/04/2010] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap") offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services. METHODS We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. RESULTS A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services, two of which allow the retrieval of DNA and RNA FASTA sequences with the third service providing nucleic acid sequence comparison capability (BLAST). CONCLUSIONS The need for semantic integration technologies has preceded available solutions. 
We report the feasibility of mapping high priority data from local, independent, idiosyncratic data schemas to common shared concepts as implemented in web-accessible ontologies. These mappings are then amenable for use in semantic web services. Our implementation of approximately two dozen services means that biological data at three large information resources (Gramene, SoyBase, and LIS) is available for programmatic access, semantic searching, and enhanced interaction between the separate missions of these resources.
Affiliation(s)
- Rex T Nelson
- USDA-ARS, CICGR, 100 Osborne Dr. Rm. 1575, Ames, IA, 50011-1010 USA
- Shulamit Avraham
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
- Gregory D May
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM 87505, USA
- Doreen Ware
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
- USDA-ARS, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
36
Buggs RJA, Chamala S, Wu W, Gao L, May GD, Schnable PS, Soltis DE, Soltis PS, Barbazuk WB. Characterization of duplicate gene evolution in the recent natural allopolyploid Tragopogon miscellus by next-generation sequencing and Sequenom iPLEX MassARRAY genotyping. Mol Ecol 2010; 19 Suppl 1:132-46. [PMID: 20331776 DOI: 10.1111/j.1365-294x.2009.04469.x] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.8] [Indexed: 12/11/2022]
Abstract
Tragopogon miscellus (Asteraceae) is an evolutionary model for the study of natural allopolyploidy, but until now has been under-resourced as a genetic model. Using 454 and Illumina expressed sequence tag sequencing of the parental diploid species of T. miscellus, we identified 7782 single nucleotide polymorphisms that differ between the two progenitor genomes present in this allotetraploid. Validation of a sample of 98 of these SNPs in genomic DNA using Sequenom MassARRAY iPlex genotyping confirmed 92 SNP markers at the genomic level that were diagnostic for the two parental genomes. In a transcriptome profile of 2989 SNPs in a single T. miscellus leaf, using Illumina sequencing, 69% of SNPs showed approximately equal expression of both homeologs (duplicate homologous genes derived from different parents), 22% showed apparent differential expression and 8.5% showed apparent silencing of one homeolog in T. miscellus. The majority of cases of homeolog silencing involved the T. dubius SNP homeolog (164/254; 65%) rather than the T. pratensis homeolog (90/254). Sequenom analysis of genomic DNA showed that in a sample of 27 of the homeologs showing apparent silencing, 23 (85%) were because of genomic homeolog loss. These methods could be applied to any organism, allowing efficient and cost-effective generation of genetic markers.
Affiliation(s)
- Richard J A Buggs
- Department of Biology, University of Florida, Gainesville, 32611, USA
37
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Erratum: Genome sequence of the palaeopolyploid soybean. Nature 2010. [DOI: 10.1038/nature08957] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 11/09/2022]
38
Baranzini SE, Mudge J, van Velkinburgh JC, Khankhanian P, Khrebtukova I, Miller NA, Zhang L, Farmer AD, Bell CJ, Kim RW, May GD, Woodward JE, Caillier SJ, McElroy JP, Gomez R, Pando MJ, Clendenen LE, Ganusova EE, Schilkey FD, Ramaraj T, Khan OA, Huntley JJ, Luo S, Kwok PY, Wu TD, Schroth GP, Oksenberg JR, Hauser SL, Kingsmore SF. Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature 2010; 464:1351-6. [PMID: 20428171 PMCID: PMC2862593 DOI: 10.1038/nature08990] [Citation(s) in RCA: 341] [Impact Index Per Article: 24.4] [Received: 07/25/2009] [Accepted: 03/11/2010] [Indexed: 12/15/2022]
Abstract
Identical (or more correctly 'monozygotic') twins are widely used to study the contributions of genetics and environment to human disease. A study that focused on three pairs of monozygotic twins, in which one twin had multiple sclerosis and the other did not, has brought the latest techniques of genome sequencing and analysis to this field, and incidentally published the first female human genome sequences. Full sequences were determined for one pair of twins, and for these and the other two pairs the mRNA transcriptome and epigenome sequences of CD4+ lymphocytes were determined. The striking result is that no genetic, epigenetic or transcriptome differences were found that explained why one twin had the disease and the other did not. Digging deeper into the data, eQTL (expression quantitative trait locus) mapping revealed tantalizing differences within twin pairs that merit closer examination. And some possible causes can be ruled out. Future work might usefully concentrate on studies of other cell types and epigenetic modifications. Studies of identical twins are widely used to dissect the contributions of genes and the environment to human diseases. In multiple sclerosis, an autoimmune demyelinating disease, identical twins often show differences. This might suggest that environmental effects are most significant in this case, but genetic and epigenetic differences between identical twins have been described. Here, however, studies of identical twins show no evidence for genetic, epigenetic or transcriptome differences that could explain disease discordance. Monozygotic or ‘identical’ twins have been widely studied to dissect the relative contributions of genetics and environment in human diseases. 
In multiple sclerosis (MS), an autoimmune demyelinating disease and common cause of neurodegeneration and disability in young adults, disease discordance in monozygotic twins has been interpreted to indicate environmental importance in its pathogenesis [1-8]. However, genetic and epigenetic differences between monozygotic twins have been described, challenging the accepted experimental model in disambiguating the effects of nature and nurture [9-12]. Here we report the genome sequences of one MS-discordant monozygotic twin pair, and messenger RNA transcriptome and epigenome sequences of CD4+ lymphocytes from three MS-discordant, monozygotic twin pairs. No reproducible differences were detected between co-twins among ∼3.6 million single nucleotide polymorphisms (SNPs) or ∼0.2 million insertion-deletion polymorphisms. Nor were any reproducible differences observed between siblings of the three twin pairs in HLA haplotypes, confirmed MS-susceptibility SNPs, copy number variations, mRNA and genomic SNP and insertion-deletion genotypes, or the expression of ∼19,000 genes in CD4+ T cells. Only 2 to 176 differences in the methylation of ∼2 million CpG dinucleotides were detected between siblings of the three twin pairs, in contrast to ∼800 methylation differences between T cells of unrelated individuals and several thousand differences between tissues or between normal and cancerous tissues. In the first systematic effort to estimate sequence variation among monozygotic co-twins, we did not find evidence for genetic, epigenetic or transcriptome differences that explained disease discordance. These are the first, to our knowledge, female, twin and autoimmune disease individual genome sequences reported.
Affiliation(s)
- Sergio E Baranzini
- Department of Neurology, University of California at San Francisco, San Francisco, California 94143, USA.
39
Libault M, Farmer A, Brechenmacher L, May GD, Stacey G. Soybean root hairs: a valuable system to investigate plant biology at the cellular level. Plant Signal Behav 2010; 5:419-21. [PMID: 20339317 PMCID: PMC7080419 DOI: 10.4161/psb.5.4.11283] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/19/2010] [Accepted: 01/19/2010] [Indexed: 05/21/2023]
Abstract
Plant organs and tissues are composed of many differentiated cell types. Most functional genomic studies sample whole tissues, which dilutes the signals that may arise from individual cells within the population. The result is an averaging of the cellular response. In order to overcome these issues of "signal dilution", methods are needed to allow the full application of modern functional genomics tools to the study of a single differentiated plant cell type. In order to address this need, we developed a method for the isolation of soybean root hair cells, a single epidermal cell type, in sufficient quantities and purity to perform a variety of functional genomic analyses. As a first demonstration of the potential of soybean root hair cells to study plant systems biology, we compared the root hair transcriptome and proteome.
Affiliation(s)
- Marc Libault
- Division of Plant Sciences, National Center for Soybean Biotechnology, University of Missouri, Columbia, MO, USA.
40
Bolon YT, Joseph B, Cannon SB, Graham MA, Diers BW, Farmer AD, May GD, Muehlbauer GJ, Specht JE, Tu ZJ, Weeks N, Xu WW, Shoemaker RC, Vance CP. Complementary genetic and genomic approaches help characterize the linkage group I seed protein QTL in soybean. BMC Plant Biol 2010; 10:41. [PMID: 20199683 PMCID: PMC2848761 DOI: 10.1186/1471-2229-10-41] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 10/15/2009] [Accepted: 03/03/2010] [Indexed: 05/19/2023]
Abstract
BACKGROUND The nutritional and economic value of many crops is effectively a function of seed protein and oil content. Insight into the genetic and molecular control mechanisms involved in the deposition of these constituents in the developing seed is needed to guide crop improvement. A quantitative trait locus (QTL) on Linkage Group I (LG I) of soybean (Glycine max (L.) Merrill) has a striking effect on seed protein content. RESULTS A soybean near-isogenic line (NIL) pair contrasting in seed protein and differing in an introgressed genomic segment containing the LG I protein QTL was used as a resource to demarcate the QTL region and to study variation in transcript abundance in developing seed. The LG I QTL region was delineated to less than 8.4 Mbp of genomic sequence on chromosome 20. Using Affymetrix Soy GeneChip and high-throughput Illumina whole transcriptome sequencing platforms, 13 genes displaying significant seed transcript accumulation differences between NILs were identified that mapped to the 8.4 Mbp LG I protein QTL region. CONCLUSIONS This study identifies gene candidates at the LG I protein QTL for potential involvement in the regulation of protein content in the soybean seed. The results demonstrate the power of complementary approaches to characterize contrasting NILs and provide genome-wide transcriptome insight towards understanding seed biology and the soybean genome.
Affiliation(s)
- Yung-Tsi Bolon
- United States Department of Agriculture-Agricultural Research Service, Plant Research Unit, St Paul, MN 55108, USA
- Bindu Joseph
- Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Steven B Cannon
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Michelle A Graham
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Brian W Diers
- Department of Crop Sciences, University of Illinois, 1101 West Peabody Dr, Urbana, IL 61801, USA
- Andrew D Farmer
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Gregory D May
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Gary J Muehlbauer
- Department of Agronomy and Plant Genetics, University of Minnesota, St Paul, MN 55108, USA
- James E Specht
- Department of Agronomy, University of Nebraska, Lincoln, NE 68583, USA
- Zheng Jin Tu
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Nathan Weeks
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Wayne W Xu
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Randy C Shoemaker
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Carroll P Vance
- United States Department of Agriculture-Agricultural Research Service, Plant Research Unit, St Paul, MN 55108, USA
- Department of Agronomy and Plant Genetics, University of Minnesota, St Paul, MN 55108, USA
41
Libault M, Farmer A, Brechenmacher L, Drnevich J, Langley RJ, Bilgin DD, Radwan O, Neece DJ, Clough SJ, May GD, Stacey G. Complete transcriptome of the soybean root hair cell, a single-cell model, and its alteration in response to Bradyrhizobium japonicum infection. Plant Physiol 2010; 152:541-52. [PMID: 19933387 PMCID: PMC2815892 DOI: 10.1104/pp.109.148379] [Citation(s) in RCA: 184] [Impact Index Per Article: 13.1] [Received: 09/30/2009] [Accepted: 11/16/2009] [Indexed: 05/10/2023]
Abstract
Nodulation is the result of a mutualistic interaction between legumes and symbiotic soil bacteria (e.g. soybean [Glycine max] and Bradyrhizobium japonicum) initiated by the infection of plant root hair cells by the symbiont. Fewer than 20 plant genes involved in the nodulation process have been functionally characterized. Considering the complexity of the symbiosis, significantly more genes are likely involved. To identify genes involved in root hair cell infection, we performed a large-scale transcriptome analysis of B. japonicum-inoculated and mock-inoculated soybean root hairs using three different technologies: microarray hybridization, Illumina sequencing, and quantitative real-time reverse transcription-polymerase chain reaction. Together, a total of 1,973 soybean genes were differentially expressed with high significance during root hair infection, including orthologs of previously characterized root hair infection-related genes such as NFR5 and NIN. The regulation of 60 genes was confirmed by quantitative real-time reverse transcription-polymerase chain reaction. Our analysis also highlighted changes in the expression pattern of some homeologous and tandemly duplicated soybean genes, supporting their rapid specialization.
Affiliation(s)
- Gary Stacey
- Division of Plant Sciences, National Center for Soybean Biotechnology, C.S. Bond Life Sciences Center (M.L., L.B., G.S.), and Division of Biochemistry, Department of Molecular Microbiology and Immunology, Center for Sustainable Energy (G.S.), University of Missouri, Columbia, Missouri 65211; National Center for Genome Resources, Santa Fe, New Mexico 87505 (A.F., R.J.L., G.D.M.); W.M. Keck Center for Comparative and Functional Genomics, Roy J. Carver Biotechnology Center (J.D.), Institute for Genomic Biology (D.D.B.), and Department of Crop Sciences (S.J.C.), University of Illinois, Urbana, Illinois 61801; and United States Department of Agriculture-Agricultural Research Service, Urbana, Illinois 61801 (O.R., D.J.N., S.J.C.)
42
Hyten DL, Cannon SB, Song Q, Weeks N, Fickus EW, Shoemaker RC, Specht JE, Farmer AD, May GD, Cregan PB. High-throughput SNP discovery through deep resequencing of a reduced representation library to anchor and orient scaffolds in the soybean whole genome sequence. BMC Genomics 2010; 11:38. [PMID: 20078886 PMCID: PMC2817691 DOI: 10.1186/1471-2164-11-38] [Citation(s) in RCA: 221] [Impact Index Per Article: 15.8] [Received: 09/04/2009] [Accepted: 01/15/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Soybean Consensus Map 4.0 facilitated the anchoring of 95.6% of the soybean whole genome sequence developed by the Joint Genome Institute, Department of Energy, but its marker density was only sufficient to properly orient 66% of the sequence scaffolds. The discovery and genetic mapping of more single nucleotide polymorphism (SNP) markers were needed to anchor and orient the remaining genome sequence. To that end, next generation sequencing and high-throughput genotyping were combined to obtain a much higher resolution genetic map that could be used to anchor and orient most of the remaining sequence and to help validate the integrity of the existing scaffold builds. RESULTS A total of 7,108 to 25,047 predicted SNPs were discovered using a reduced representation library that was subsequently sequenced by the Illumina sequence-by-synthesis method on the clonal single molecule array platform. Using multiple SNP prediction methods, the validation rate of these SNPs ranged from 79% to 92.5%. A high resolution genetic map using 444 recombinant inbred lines was created with 1,790 SNP markers. Of the 1,790 mapped SNP markers, 1,240 markers had been selectively chosen to target existing unanchored or un-oriented sequence scaffolds, thereby increasing the amount of anchored sequence to 97%. CONCLUSION We have demonstrated how next generation sequencing was combined with high-throughput SNP detection assays to quickly discover large numbers of SNPs. Those SNPs were then used to create a high resolution genetic map that assisted in the assembly of scaffolds from the 8x whole genome shotgun sequences into pseudomolecules corresponding to chromosomes of the organism.
Affiliation(s)
- David L Hyten
- Soybean Genomics and Improvement Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
- Steven B Cannon
- Department of Agronomy, U.S. Department of Agriculture, Agricultural Research Service, Iowa State University, Ames, IA 50011, USA
- Qijian Song
- Soybean Genomics and Improvement Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
- Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD 20742, USA
- Nathan Weeks
- Department of Agronomy, U.S. Department of Agriculture, Agricultural Research Service, Iowa State University, Ames, IA 50011, USA
- Edward W Fickus
- Soybean Genomics and Improvement Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
- Randy C Shoemaker
- Department of Agronomy, U.S. Department of Agriculture, Agricultural Research Service, Iowa State University, Ames, IA 50011, USA
- James E Specht
- Department of Agronomy and Horticulture, University of Nebraska Lincoln, Lincoln, Nebraska, NE 68583, USA
- Andrew D Farmer
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Gregory D May
- National Center for Genome Resources, Santa Fe, NM 87505, USA
- Perry B Cregan
- Soybean Genomics and Improvement Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
43
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature 2010; 463:178-83. [PMID: 20075913 DOI: 10.1038/nature08670] [Citation(s) in RCA: 2569] [Impact Index Per Article: 183.5] [Received: 08/19/2009] [Accepted: 11/12/2009] [Indexed: 12/27/2022]
Abstract
Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
Affiliation(s)
- Jeremy Schmutz
- HudsonAlpha Genome Sequencing Center, 601 Genome Way, Huntsville, Alabama 35806, USA
44
Singer SR, Maki SL, Farmer AD, Ilut D, May GD, Cannon SB, Doyle JJ. Venturing beyond beans and peas: what can we learn from Chamaecrista? Plant Physiol 2009; 151:1041-7. [PMID: 19755538 PMCID: PMC2773047 DOI: 10.1104/pp.109.144774] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 07/14/2009] [Accepted: 09/06/2009] [Indexed: 05/20/2023]
Affiliation(s)
- Susan R Singer
- Department of Biology, Carleton College, Northfield, Minnesota 55057, USA.
45
Cannon SB, May GD, Jackson SA. Three sequenced legume genomes and many crop species: rich opportunities for translational genomics. Plant Physiol 2009; 151:970-7. [PMID: 19759344 PMCID: PMC2773077 DOI: 10.1104/pp.109.144659] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Received: 07/11/2009] [Accepted: 09/14/2009] [Indexed: 05/20/2023]
Affiliation(s)
- Steven B Cannon
- United States Department of Agriculture-Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, Iowa 50011, USA.
46
Varshney RK, Nayak SN, May GD, Jackson SA. Next-generation sequencing technologies and their implications for crop genetics and breeding. Trends Biotechnol 2009; 27:522-30. [PMID: 19679362 DOI: 10.1016/j.tibtech.2009.05.006] [Citation(s) in RCA: 396] [Impact Index Per Article: 26.4] [Received: 03/05/2009] [Revised: 05/21/2009] [Accepted: 05/27/2009] [Indexed: 10/20/2022]
Abstract
Using next-generation sequencing technologies it is possible to resequence entire plant genomes or sample entire transcriptomes more efficiently and economically and in greater depth than ever before. Rather than sequencing individual genomes, we envision the sequencing of hundreds or even thousands of related genomes to sample genetic diversity within and between germplasm pools. Identification and tracking of genetic variation are now so efficient and precise that thousands of variants can be tracked within large populations. In this review, we outline some important areas such as the large-scale development of molecular markers for linkage mapping, association mapping, wide crosses and alien introgression, epigenetic modifications, transcript profiling, population genetics and de novo genome/organellar genome assembly for which these technologies are expected to advance crop genetics and breeding, leading to crop improvement.
Affiliation(s)
- Rajeev K Varshney
- Centre of Excellence in Genomics (CEG), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, A.P., India.
47
Mudge J, Miller NA, Khrebtukova I, Lindquist IE, May GD, Huntley JJ, Luo S, Zhang L, van Velkinburgh JC, Farmer AD, Lewis S, Beavis WD, Schilkey FD, Virk SM, Black CF, Myers MK, Mader LC, Langley RJ, Utsey JP, Kim RW, Roberts RC, Khalsa SK, Garcia M, Ambriz-Griffith V, Harlan R, Czika W, Martin S, Wolfinger RD, Perrone-Bizzozero NI, Schroth GP, Kingsmore SF. Genomic convergence analysis of schizophrenia: mRNA sequencing reveals altered synaptic vesicular transport in post-mortem cerebellum. PLoS One 2008; 3:e3625. [PMID: 18985160 PMCID: PMC2576459 DOI: 10.1371/journal.pone.0003625] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Received: 07/21/2008] [Accepted: 10/10/2008] [Indexed: 02/06/2023] Open
Abstract
Schizophrenia (SCZ) is a common, disabling mental illness with high heritability but complex, poorly understood genetic etiology. As the first phase of a genomic convergence analysis of SCZ, we generated 16.7 billion nucleotides of short-read shotgun sequences of cDNA from post-mortem cerebellar cortices of 14 patients and six matched controls. A rigorous analysis pipeline was developed for analysis of digital gene expression studies. Sequences aligned to approximately 33,200 transcripts in each sample, with average coverage of 450 reads per gene. Following adjustments for confounding clinical, sample and experimental sources of variation, 215 genes differed significantly in expression between cases and controls. Golgi apparatus, vesicular transport, membrane association, zinc binding and regulation of transcription were over-represented among differentially expressed genes. Twenty-three genes with altered expression and involvement in presynaptic vesicular transport, Golgi function and GABAergic neurotransmission define a unifying molecular hypothesis for dysfunction in cerebellar cortex in SCZ.
Affiliation(s)
- Joann Mudge
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Neil A. Miller
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Ingrid E. Lindquist
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Gregory D. May
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Jim J. Huntley
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Shujun Luo
- Illumina Inc., Hayward, California, United States of America
- Lu Zhang
- Illumina Inc., Hayward, California, United States of America
- Andrew D. Farmer
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Sharon Lewis
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- William D. Beavis
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Faye D. Schilkey
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Selene M. Virk
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- C. Forrest Black
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- M. Kathy Myers
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Lar C. Mader
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Ray J. Langley
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- John P. Utsey
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Ryan W. Kim
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- Rosalinda C. Roberts
- Department of Psychiatry, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
- Sat Kirpal Khalsa
- Northern New Mexico College, Española, New Mexico, United States of America
- Meredith Garcia
- Northern New Mexico College, Española, New Mexico, United States of America
- Richard Harlan
- Northern New Mexico College, Española, New Mexico, United States of America
- Wendy Czika
- SAS Institute, Cary, North Carolina, United States of America
- Stanton Martin
- SAS Institute, Cary, North Carolina, United States of America
- Nora I. Perrone-Bizzozero
- Department of Neurosciences, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Gary P. Schroth
- Illumina Inc., Hayward, California, United States of America
| | - Stephen F. Kingsmore
- National Center for Genome Resources, Santa Fe, New Mexico, United States of America
- * E-mail:
| |
|
48
|
Mian MAR, Zhang Y, Wang ZY, Zhang JY, Cheng X, Chen L, Chekhovskiy K, Dai X, Mao C, Cheung F, Zhao X, He J, Scott AD, Town CD, May GD. Analysis of tall fescue ESTs representing different abiotic stresses, tissue types and developmental stages. BMC Plant Biol 2008; 8:27. [PMID: 18318913 PMCID: PMC2323379 DOI: 10.1186/1471-2229-8-27] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 09/22/2007] [Accepted: 03/04/2008] [Indexed: 05/02/2023]
Abstract
BACKGROUND Tall fescue (Festuca arundinacea Schreb.) is a major cool-season forage and turf grass species grown in the temperate regions of the world. In this paper we report the generation of a tall fescue expressed sequence tag (EST) database developed from nine cDNA libraries representing tissues from different plant organs, developmental stages, and abiotic stress factors. The results of inter-library and library-specific in silico expression analyses of these ESTs are also reported. RESULTS A total of 41,516 ESTs were generated from nine cDNA libraries of tall fescue representing tissues from different plant organs, developmental stages, and abiotic stress conditions. The Festuca Gene Index (FaGI) has been established. To date, this represents the first publicly available tall fescue EST database. In silico gene expression studies using these ESTs were performed to understand stress responses in tall fescue. A large number of ESTs of known stress response genes were identified from stressed tissue libraries. These ESTs represent gene homologues of heat-shock and oxidative stress proteins, and various transcription factor protein families. Highly expressed ESTs representing genes of unknown function were also identified in the stressed tissue libraries. CONCLUSION FaGI provides a useful resource for genomics studies of tall fescue and other closely related forage and turf grass species. Comparative genomic analyses between tall fescue and other grass species, including ryegrasses (Lolium sp.), meadow fescue (F. pratensis) and tetraploid fescue (F. arundinacea var. glaucescens), will benefit from this database. These ESTs are an excellent resource for the development of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) PCR-based molecular markers.
Affiliation(s)
- MA Rouf Mian
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- USDA-ARS, The Ohio State University & OARDC, 1680 Madison Avenue, Wooster, OH 44691, USA
- Yan Zhang
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Zeng-Yu Wang
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Ji-Yi Zhang
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Xiaofei Cheng
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Lei Chen
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Konstantin Chekhovskiy
- Forage Improvement Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Xinbin Dai
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Chunhong Mao
- Virginia Bioinformatics Institute, 1750 Kraft Drive Suite 1400, Virginia Tech, Blacksburg, VA 24061, USA
- Foo Cheung
- The J. Craig Venter Institute, 9712 Medical Center Drive, Rockville, MD 20850, USA
- Xuechun Zhao
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Ji He
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Angela D Scott
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- Christopher D Town
- The J. Craig Venter Institute, 9712 Medical Center Drive, Rockville, MD 20850, USA
- Gregory D May
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73402, USA
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM 87505, USA
|
49
|
Miller NA, Kingsmore SF, Farmer A, Langley RJ, Mudge J, Crow JA, Gonzalez AJ, Schilkey FD, Kim RJ, van Velkinburgh J, May GD, Black CF, Myers MK, Utsey JP, Frost NS, Sugarbaker DJ, Bueno R, Gullans SR, Baxter SM, Day SW, Retzel EF. Management of High-Throughput DNA Sequencing Projects: Alpheus. J Comput Sci Syst Biol 2008; 1:132. [PMID: 20151039 DOI: 10.4172/jcsb.1000013] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Indexed: 02/04/2023]
Abstract
High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystems' SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.
Affiliation(s)
- Neil A Miller
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, NM 87505, USA
|
50
|
Naoumkina M, Torres-Jerez I, Allen S, He J, Zhao PX, Dixon RA, May GD. Analysis of cDNA libraries from developing seeds of guar (Cyamopsis tetragonoloba (L.) Taub). BMC Plant Biol 2007; 7:62. [PMID: 18034910 PMCID: PMC2241620 DOI: 10.1186/1471-2229-7-62] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 04/03/2007] [Accepted: 11/23/2007] [Indexed: 05/25/2023]
Abstract
BACKGROUND Guar, Cyamopsis tetragonoloba (L.) Taub, is a member of the Leguminosae (Fabaceae) family and is economically the most important of the four species in the genus. The endosperm of guar seed is a rich source of mucilage or gum, which forms a viscous gel in cold water, and is used as an emulsifier, thickener and stabilizer in a wide range of foods and industrial applications. Guar gum is a galactomannan, consisting of a linear (1→4)-β-linked D-mannan backbone with single-unit, (1→6)-linked, α-D-galactopyranosyl side chains. To better understand regulation of guar seed development and galactomannan metabolism we created cDNA libraries and a resulting EST dataset from different developmental stages of guar seeds. RESULTS A database of 16,476 guar seed ESTs was constructed, with 8,163 and 8,313 ESTs derived from cDNA libraries I and II, respectively. Library I was constructed from seeds at an early developmental stage (15-25 days after flowering, DAF), and library II from seeds at 30-40 DAF. Quite different sets of genes were represented in these two libraries. Approximately 27% of the clones were not similar to known sequences, suggesting that these ESTs represent novel genes or may represent non-coding RNA. The high flux of energy into carbohydrate and storage protein synthesis in guar seeds was reflected by a high representation of genes annotated as involved in signal transduction, carbohydrate metabolism, chaperone and proteolytic processes, and translation and ribosome structure. Guar unigenes involved in galactomannan metabolism were identified. Among the seed storage proteins, the most abundant contig represented a conglutin accounting for 3.7% of the total ESTs from both libraries. CONCLUSION The present EST collection and its annotation provide a resource for understanding guar seed biology and galactomannan metabolism.
Affiliation(s)
- Marina Naoumkina
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Ivone Torres-Jerez
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Stacy Allen
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Ji He
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Patrick X Zhao
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Richard A Dixon
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- Gregory D May
- Plant Biology Division, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, Oklahoma 73401, USA
- National Center for Genome Resources, 2935 Rodeo Park Drive East, Santa Fe, New Mexico 87505, USA
|