501
Mascher M, Wu S, Amand PS, Stein N, Poland J. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley. PLoS One 2013; 8:e76925. [PMID: 24098570 PMCID: PMC3789676 DOI: 10.1371/journal.pone.0076925] [Citation(s) in RCA: 119] [Impact Index Per Article: 10.8] [Received: 05/27/2013] [Accepted: 09/04/2013] [Indexed: 12/17/2022] Open
Abstract
The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new sequencing technologies, analysis tools and genomic resources develop.
Affiliation(s)
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
- Shuangye Wu
  - Department of Agronomy, Kansas State University, Manhattan, Kansas, United States of America
- Paul St. Amand
  - United States Department of Agriculture, Agricultural Research Service, Hard Winter Wheat Genetics Research Unit, Manhattan, Kansas, United States of America
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany
- Jesse Poland
  - Department of Agronomy, Kansas State University, Manhattan, Kansas, United States of America
  - United States Department of Agriculture, Agricultural Research Service, Hard Winter Wheat Genetics Research Unit, Manhattan, Kansas, United States of America
503
Zhang W, Fraiture M, Kolb D, Löffelhardt B, Desaki Y, Boutrot FF, Tör M, Zipfel C, Gust AA, Brunner F. Arabidopsis receptor-like protein30 and receptor-like kinase suppressor of BIR1-1/EVERSHED mediate innate immunity to necrotrophic fungi. The Plant Cell 2013; 25:4227-41. [PMID: 24104566 PMCID: PMC3877809 DOI: 10.1105/tpc.113.117010] [Citation(s) in RCA: 187] [Impact Index Per Article: 17.0] [Received: 08/03/2013] [Revised: 08/03/2013] [Accepted: 09/20/2013] [Indexed: 05/18/2023]
Abstract
Effective plant defense strategies rely in part on the perception of non-self determinants, so-called microbe-associated molecular patterns (MAMPs), by transmembrane pattern recognition receptors leading to MAMP-triggered immunity. Plant resistance against necrotrophic pathogens with a broad host range is complex and yet not well understood. Particularly, it is unclear if resistance to necrotrophs involves pattern recognition receptors. Here, we partially purified a novel proteinaceous elicitor called sclerotinia culture filtrate elicitor1 (SCFE1) from the necrotrophic fungal pathogen Sclerotinia sclerotiorum that induces typical MAMP-triggered immune responses in Arabidopsis thaliana. Analysis of natural genetic variation revealed five Arabidopsis accessions (Mt-0, Lov-1, Lov-5, Br-0, and Sq-1) that are fully insensitive to the SCFE1-containing fraction. We used a forward genetics approach and mapped the locus determining SCFE1 sensitivity to receptor-like protein30 (RLP30). We also show that SCFE1-triggered immune responses engage a signaling pathway dependent on the regulatory receptor-like kinases brassinosteroid insensitive1-associated receptor kinase1 (BAK1) and Suppressor of BIR1-1/evershed (SOBIR1/EVR). Mutants of RLP30, BAK1, and SOBIR1 are more susceptible to S. sclerotiorum and the related fungus Botrytis cinerea. The presence of an elicitor in S. sclerotiorum evoking MAMP-triggered immune responses and sensed by RLP30/SOBIR1/BAK1 demonstrates the relevance of MAMP-triggered immunity in resistance to necrotrophic fungi.
Affiliation(s)
- Weiguo Zhang
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Malou Fraiture
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Dagmar Kolb
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Birgit Löffelhardt
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Yoshitake Desaki
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Mahmut Tör
  - National Pollen and Aerobiology Research Unit, Institute of Science and the Environment, University of Worcester, Worcester WR2 6AJ, United Kingdom
- Cyril Zipfel
  - The Sainsbury Laboratory, Norwich NR4 7UH, United Kingdom
- Andrea A. Gust
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
- Frédéric Brunner
  - Department of Biochemistry, Center for Plant Molecular Biology, Eberhard Karls University, D-72076 Tuebingen, Germany
504
Schmitz RJ, He Y, Valdés-López O, Khan SM, Joshi T, Urich MA, Nery JR, Diers B, Xu D, Stacey G, Ecker JR. Epigenome-wide inheritance of cytosine methylation variants in a recombinant inbred population. Genome Res 2013; 23:1663-74. [PMID: 23739894 PMCID: PMC3787263 DOI: 10.1101/gr.152538.112] [Citation(s) in RCA: 169] [Impact Index Per Article: 15.4] [Received: 11/22/2012] [Accepted: 06/05/2013] [Indexed: 01/22/2023]
Abstract
Cytosine DNA methylation is one avenue for passing information through cell divisions. Here, we present epigenomic analyses of soybean recombinant inbred lines (RILs) and their parents. Identification of differentially methylated regions (DMRs) revealed that DMRs mostly cosegregated with the genotype from which they were derived, but examples of the uncoupling of genotype and epigenotype were identified. Linkage mapping of methylation states assessed from whole-genome bisulfite sequencing of 83 RILs uncovered widespread evidence for local methylQTL. This epigenomics approach provides a comprehensive study of the patterns and heritability of methylation variants in a complex genetic population over multiple generations, paving the way for understanding how methylation variants contribute to phenotypic variation.
Affiliation(s)
- Robert J. Schmitz
  - Plant Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Yupeng He
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
  - Bioinformatics Program, University of California at San Diego, La Jolla, California 92093, USA
- Oswaldo Valdés-López
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
- Saad M. Khan
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
  - Informatics Institute, University of Missouri, Columbia, Missouri 65211, USA
- Trupti Joshi
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
  - Informatics Institute, University of Missouri, Columbia, Missouri 65211, USA
  - Department of Computer Science, University of Missouri, Columbia, Missouri 65211, USA
  - National Center for Soybean Biotechnology, University of Missouri, Columbia, Missouri 65211, USA
- Mark A. Urich
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Joseph R. Nery
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Brian Diers
  - Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801, USA
- Dong Xu
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
  - Informatics Institute, University of Missouri, Columbia, Missouri 65211, USA
  - Department of Computer Science, University of Missouri, Columbia, Missouri 65211, USA
  - National Center for Soybean Biotechnology, University of Missouri, Columbia, Missouri 65211, USA
- Gary Stacey
  - Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
  - National Center for Soybean Biotechnology, University of Missouri, Columbia, Missouri 65211, USA
  - Divisions of Plant Science and Biochemistry, University of Missouri, Columbia, Missouri 65211, USA
- Joseph R. Ecker
  - Plant Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, California 92037, USA
505
Epigenomic programming contributes to the genomic drift evolution of the F-Box protein superfamily in Arabidopsis. Proc Natl Acad Sci U S A 2013; 110:16927-32. [PMID: 24082131 DOI: 10.1073/pnas.1316009110] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022] Open
Abstract
Comparisons within expanding sequence databases have revealed a dynamic interplay among genomic and epigenomic forces in driving plant evolution. Such forces are especially obvious within the F-Box (FBX) superfamily, one of the largest and most polymorphic gene families in land plants, where its frequent lineage-specific expansions and contractions provide an excellent model to assess how genetic variation impacted gene function before and after speciation. Previous phylogenetic comparisons based on orthology, diversity, and expression patterns identified three plant FBX groups--Common, Lineage-Specific, and Pseudo(genized)--whose emergences are consistent with genomic drift evolution. Here, we examined this variance within Arabidopsis thaliana by evaluating SNPs for all 877 FBX loci from 432 naturally occurring accessions and their relationships to variations in natural selection, expression, and DNA/histone methylation. In line with their phenotypic importance, Common FBX loci have low polymorphism but high deleterious mutation rates indicative of stringent functional constraints. In contrast, the Lineage-Specific and Pseudo groups are enriched in genes with basal expression and higher SNP density and more correlated with methylation marks (RNA-directed DNA methylation and histone H3K27 trimethylation) that promote transcriptional silencing. Taken together, we propose that reversible epigenomic modifications helped shape FBX gene evolution by transcriptionally suppressing the adverse effects of gene dosage imbalance and harmful FBX alleles that arise during genomic drift, while simultaneously allowing innovations to emerge through epigenomic reprogramming.
506
Lin H, Miller ML, Granas DM, Dutcher SK. Whole genome sequencing identifies a deletion in protein phosphatase 2A that affects its stability and localization in Chlamydomonas reinhardtii. PLoS Genet 2013; 9:e1003841. [PMID: 24086163 PMCID: PMC3784568 DOI: 10.1371/journal.pgen.1003841] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 05/31/2013] [Accepted: 08/13/2013] [Indexed: 11/19/2022] Open
Abstract
Whole genome sequencing is a powerful tool in the discovery of single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) among mutant strains, which simplifies forward genetics approaches. However, identification of the causative mutation among a large number of non-causative SNPs in a mutant strain remains a big challenge. In the unicellular biflagellate green alga Chlamydomonas reinhardtii, we generated a SNP/indel library that contains over 2 million polymorphisms from four wild-type strains, one highly polymorphic strain that is frequently used in meiotic mapping, ten mutant strains that have flagellar assembly or motility defects, and one mutant strain, imp3, which has a mating defect. A comparison of polymorphisms in the imp3 strain and the other 15 strains allowed us to identify a deletion of the last three amino acids, Y313F314L315, in a protein phosphatase 2A catalytic subunit (PP2A3) in the imp3 strain. Introduction of a wild-type HA-tagged PP2A3 rescues the mutant phenotype, but mutant HA-PP2A3 at Y313 or L315 fail to rescue. Our immunoprecipitation results indicate that the Y313, L315, or YFLΔ mutations do not affect the binding of PP2A3 to the scaffold subunit, PP2A-2r. In contrast, the Y313, L315, or YFLΔ mutations affect both the stability and the localization of PP2A3. The PP2A3 protein is less abundant in these mutants and fails to accumulate in the basal body area as observed in transformants with either wild-type HA-PP2A3 or a HA-PP2A3 with a V310T change. The accumulation of HA-PP2A3 in the basal body region disappears in mated dikaryons, which suggests that the localization of PP2A3 may be essential to the mating process. Overall, our results demonstrate that the terminal YFL tail of PP2A3 is important in the regulation of Chlamydomonas mating.
Affiliation(s)
- Huawen Lin
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Michelle L. Miller
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- David M. Granas
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Center for Genomic Sciences and System Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Susan K. Dutcher
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
507
Bush SJ, Castillo-Morales A, Tovar-Corona JM, Chen L, Kover PX, Urrutia AO. Presence-absence variation in A. thaliana is primarily associated with genomic signatures consistent with relaxed selective constraints. Mol Biol Evol 2013; 31:59-69. [PMID: 24072814 PMCID: PMC3879440 DOI: 10.1093/molbev/mst166] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/28/2022] Open
Abstract
The sequencing of multiple genomes of the same plant species has revealed polymorphic gene and exon loss. Genes associated with disease resistance are overrepresented among those showing structural variations, suggesting an adaptive role for gene and exon presence–absence variation (PAV). To shed light on the possible functional relevance of polymorphic coding region loss and the mechanisms driving this process, we characterized genes that have lost entire exons or their whole coding regions in 17 fully sequenced Arabidopsis thaliana accessions. We found that although a significant enrichment in genes associated with certain functional categories is observed, PAV events are largely restricted to genes with signatures of reduced essentiality: PAV genes tend to be newer additions to the genome, tissue specific, and lowly expressed. In addition, PAV genes are located in regions of lower gene density and higher transposable element density. Partial coding region PAV events were associated with only a marginal reduction in gene expression level in the affected accession and occurred in genes with higher levels of alternative splicing in the Col-0 accession. Together, these results suggest that although adaptive scenarios cannot be ruled out, PAV events can be explained without invoking them.
Affiliation(s)
- Stephen J Bush
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
508
Abstract
The sequencing of the complete genome of the nematode Caenorhabditis elegans was a landmark achievement and ushered in a new era of whole-organism, systems analyses of the biology of this powerful model organism. The success of the C. elegans genome sequencing project also inspired communities working on other organisms to approach genome sequencing of their species. The phylum Nematoda is rich and diverse and of interest to a wide range of research fields from basic biology through ecology and parasitic disease. For all these communities, it is now clear that access to genome scale data will be key to advancing understanding, and in the case of parasites, developing new ways to control or cure diseases. The advent of second-generation sequencing technologies, improvements in computing algorithms and infrastructure and growth in bioinformatics and genomics literacy is making the addition of genome sequencing to the research goals of any nematode research program a less daunting prospect. To inspire, promote and coordinate genomic sequencing across the diversity of the phylum, we have launched a community wiki and the 959 Nematode Genomes initiative (www.nematodegenomes.org/). Just as the deciphering of the developmental lineage of the 959 cells of the adult hermaphrodite C. elegans was the gateway to broad advances in biomedical science, we hope that a nematode phylogeny with (at least) 959 sequenced species will underpin further advances in understanding the origins of parasitism, the dynamics of genomic change and the adaptations that have made Nematoda one of the most successful animal phyla.
Affiliation(s)
- Sujai Kumar
  - Institute of Evolutionary Biology; University of Edinburgh; Edinburgh, UK
509
Livaja M, Wang Y, Wieckhorst S, Haseneyer G, Seidel M, Hahn V, Knapp SJ, Taudien S, Schön CC, Bauer E. BSTA: a targeted approach combines bulked segregant analysis with next-generation sequencing and de novo transcriptome assembly for SNP discovery in sunflower. BMC Genomics 2013; 14:628. [PMID: 24330545 PMCID: PMC3848877 DOI: 10.1186/1471-2164-14-628] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 02/26/2013] [Accepted: 09/16/2013] [Indexed: 01/31/2023] Open
Abstract
Background: Sunflower belongs to the largest plant family on earth, the genomically poorly explored Compositae. Downy mildew Plasmopara halstedii (Farlow) Berlese & de Toni is one of the major diseases of cultivated sunflower (Helianthus annuus L.). In the search for new sources of downy mildew resistance, the locus PlARG on linkage group 1 (LG1) originating from H. argophyllus is promising since it confers resistance against all known races of the pathogen. However, the mapping resolution in the PlARG region is hampered by significantly suppressed recombination and by limited availability of polymorphic markers. Here we examined a strategy developed for the enrichment of molecular markers linked to this specific genomic region. We combined bulked segregant analysis (BSA) with next-generation sequencing (NGS) and de novo assembly of the sunflower transcriptome for single nucleotide polymorphism (SNP) discovery in a sequence resource combining reads originating from two sunflower species, H. annuus and H. argophyllus. Results: A computational pipeline developed for SNP calling and pattern detection identified 219 candidate genes. For a proof of concept, 42 resistance gene-like sequences were subjected to experimental SNP validation. Using a high-resolution mapping population, 12 SNP markers were mapped to LG1. We successfully verified candidate sequences either co-segregating with or closely flanking PlARG. Conclusions: This study is the first successful example to improve bulked segregant analysis with de novo transcriptome assembly using next generation sequencing. The BSTA pipeline we developed provides a useful guide for similar studies in other non-model organisms. Our results demonstrate this method is an efficient way to enrich molecular markers and to identify candidate genes in a specific mapping interval.
510
Sakurai T, Mochida K, Yoshida T, Akiyama K, Ishitani M, Seki M, Shinozaki K. Genome-wide discovery and information resource development of DNA polymorphisms in cassava. PLoS One 2013; 8:e74056. [PMID: 24040164 PMCID: PMC3770675 DOI: 10.1371/journal.pone.0074056] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/16/2013] [Accepted: 07/29/2013] [Indexed: 01/06/2023] Open
Abstract
Cassava (Manihot esculenta Crantz) is an important crop that provides food security and income generation in many tropical countries, and is known for its adaptability to various environmental conditions. Its draft genome sequence and many expressed sequence tags are now publicly available, allowing the development of cassava polymorphism information. Here, we describe the genome-wide discovery of cassava DNA polymorphisms. Using the alignment of predicted transcribed sequences from the cassava draft genome sequence and ESTs from GenBank, we discovered 10,546 single-nucleotide polymorphisms and 647 insertions and deletions. To facilitate molecular marker development for cassava, we designed 9,316 PCR primer pairs to amplify the genomic region around each DNA polymorphism. Of the discovered SNPs, 62.7% occurred in protein-coding regions. Disease-resistance genes were found to have a significantly higher ratio of nonsynonymous-to-synonymous substitutions. We identified 24 read-through (changes of a stop codon to a coding codon) and 38 premature stop (changes of a coding codon to a stop codon) single-nucleotide polymorphisms, and found that the 5 gene ontology terms in biological process were significantly different in genes with read-through single-nucleotide polymorphisms compared with all cassava genes. All data on the discovered DNA polymorphisms were organized into the Cassava Online Archive database, which is available at http://cassava.psc.riken.jp/.
Affiliation(s)
- Tetsuya Sakurai
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Keiichi Mochida
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
  - RIKEN Biomass Engineering Program, Tsurumi-ku, Yokohama, Kanagawa, Japan
  - Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama, Kanagawa, Japan
- Takuhiro Yoshida
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Kenji Akiyama
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
- Manabu Ishitani
  - Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali, Colombia
- Motoaki Seki
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
  - Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama, Kanagawa, Japan
- Kazuo Shinozaki
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Kanagawa, Japan
  - RIKEN Biomass Engineering Program, Tsurumi-ku, Yokohama, Kanagawa, Japan
511
Arabidopsis semidwarfs evolved from independent mutations in GA20ox1, ortholog to green revolution dwarf alleles in rice and barley. Proc Natl Acad Sci U S A 2013; 110:15818-23. [PMID: 24023067 DOI: 10.1073/pnas.1314979110] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Indexed: 01/11/2023] Open
Abstract
Understanding the genetic bases of natural variation for developmental and stress-related traits is a major goal of current plant biology. Variation in plant hormone levels and signaling might underlie such phenotypic variation occurring even within the same species. Here we report the genetic and molecular basis of semidwarf individuals found in natural Arabidopsis thaliana populations. Allelism tests demonstrate that independent loss-of-function mutations at GA locus 5 (GA5), which encodes gibberellin 20-oxidase 1 (GA20ox1) involved in the last steps of gibberellin biosynthesis, are found in different populations from southern, western, and northern Europe; central Asia; and Japan. Sequencing of GA5 identified 21 different loss-of-function alleles causing semidwarfness without any obvious general tradeoff affecting plant performance traits. GA5 shows signatures of purifying selection, whereas GA5 loss-of-function alleles can also exhibit patterns of positive selection in specific populations as shown by Fay and Wu's H statistics. These results suggest that antagonistic pleiotropy might underlie the occurrence of GA5 loss-of-function mutations in nature. Furthermore, because GA5 is the ortholog of rice SD1 and barley Sdw1/Denso green revolution genes, this study illustrates the occurrence of conserved adaptive evolution between wild A. thaliana and domesticated plants.
512
Sabarinathan R, Tafer H, Seemann SE, Hofacker IL, Stadler PF, Gorodkin J. RNAsnp: efficient detection of local RNA secondary structure changes induced by SNPs. Hum Mutat 2013; 34:546-56. [PMID: 23315997 PMCID: PMC3708107 DOI: 10.1002/humu.22273] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.0] [Received: 08/17/2012] [Accepted: 12/18/2012] [Indexed: 02/05/2023]
Abstract
Structural characteristics are essential for the functioning of many noncoding RNAs and cis-regulatory elements of mRNAs. SNPs may disrupt these structures, interfere with their molecular function, and hence cause a phenotypic effect. RNA folding algorithms can provide detailed insights into structural effects of SNPs. The global measures employed so far suffer from limited accuracy of folding programs on large RNAs and are computationally too demanding for genome-wide applications. Here, we present a strategy that focuses on the local regions of maximal structural change between mutant and wild-type. These local regions are approximated in a “screening mode” that is intended for genome-wide applications. Furthermore, localized regions are identified as those with maximal discrepancy. The mutation effects are quantified in terms of empirical P values. To this end, the RNAsnp software uses extensive precomputed tables of the distribution of SNP effects as function of length and GC content. RNAsnp thus achieves both a noise reduction and speed-up of several orders of magnitude over shuffling-based approaches. On a data set comprising 501 SNPs associated with human-inherited diseases, we predict 54 to have significant local structural effect in the untranslated region of mRNAs. RNAsnp is available at http://rth.dk/resources/rnasnp.
|
513
|
Mathew LA, Staab PR, Rose LE, Metzler D. Why to account for finite sites in population genetic studies and how to do this with Jaatha 2.0. Ecol Evol 2013; 3:3647-62. [PMID: 24198930 PMCID: PMC3810865 DOI: 10.1002/ece3.722] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 04/19/2013] [Revised: 06/20/2013] [Accepted: 06/23/2013] [Indexed: 11/06/2022] Open
Abstract
With the advent of next-generation sequencing technologies, large data sets of several thousand loci from multiple conspecific individuals are available. Such data sets should make it possible to obtain accurate estimates of population genetic parameters, even for complex models of population history. In the analyses of large data sets, it is difficult to consider finite-sites mutation models (FSMs). Here, we use extensive simulations to demonstrate that the inclusion of FSMs is necessary to avoid severe biases in the estimation of the population mutation rate θ, population divergence times, and migration rates. We present a new version of Jaatha, an efficient composite-likelihood method for estimating demographic parameters from population genetic data and evaluate the usefulness of Jaatha in two biological examples. For the first application, we infer the speciation process of two wild tomato species, Solanum chilense and Solanum peruvianum. In our second application example, we demonstrate that Jaatha is readily applicable to NGS data by analyzing genome-wide data from two southern European populations of Arabidopsis thaliana. Jaatha is now freely available as an R package from the Comprehensive R Archive Network (CRAN).
Affiliation(s)
- Lisha A Mathew
- Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
|
514
|
Jiao WB, Huang D, Xing F, Hu Y, Deng XX, Xu Q, Chen LL. Genome-wide characterization and expression analysis of genetic variants in sweet orange. THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2013; 75:954-64. [PMID: 23738603 DOI: 10.1111/tpj.12254] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 03/25/2013] [Revised: 05/20/2013] [Accepted: 05/31/2013] [Indexed: 05/11/2023]
Abstract
Heterozygosity is an important feature of many plant genomes, and is related to heterosis. Sweet orange, a highly heterozygous species, is thought to have originated from an inter-species hybrid between pummelo and mandarin. To investigate the heterozygosity of the sweet orange genome and examine how this heterozygosity affects gene expression, we characterized the genome of Valencia orange for single nucleotide variations (SNVs), small insertions and deletions (InDels) and structural variations (SVs), and determined their functional effects on protein-coding genes and non-coding sequences. Almost half of the genes containing large-effect SNVs and InDels were expressed in a tissue-specific manner. We identified 3542 large SVs (>50 bp), including deletions, insertions and inversions. Most of the 296 genes located in large-deletion regions showed low expression levels. RNA-Seq reads and DNA sequencing reads revealed that the alleles of 1062 genes were differentially expressed. In addition, we detected approximately 42 Mb of contigs that were not found in the reference genome of a haploid sweet orange by de novo assembly of unmapped reads, and annotated 134 protein-coding genes within these contigs. We discuss how this heterozygosity affects the quality of genome assembly. This study advances our understanding of the genome architecture of sweet orange, and provides a global view of gene expression at heterozygous loci.
Affiliation(s)
- Wen-Biao Jiao
- State Key Laboratory of Agricultural Microbiology, Center for Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
|
515
|
Günther T, Coop G. Robust identification of local adaptation from allele frequencies. Genetics 2013; 195:205-20. [PMID: 23821598 PMCID: PMC3761302 DOI: 10.1534/genetics.113.152462] [Citation(s) in RCA: 364] [Impact Index Per Article: 33.1] [Received: 04/23/2013] [Accepted: 06/17/2013] [Indexed: 12/15/2022] Open
Abstract
Comparing allele frequencies among populations that differ in environment has long been a tool for detecting loci involved in local adaptation. However, such analyses are complicated by an imperfect knowledge of population allele frequencies and neutral correlations of allele frequencies among populations due to shared population history and gene flow. Here we develop a set of methods to robustly test for unusual allele frequency patterns and correlations between environmental variables and allele frequencies while accounting for these complications based on a Bayesian model previously implemented in the software Bayenv. Using this model, we calculate a set of "standardized allele frequencies" that allows investigators to apply tests of their choice to multiple populations while accounting for sampling and covariance due to population history. We illustrate this first by showing that these standardized frequencies can be used to detect nonparametric correlations with environmental variables; these correlations are also less prone to spurious results due to outlier populations. We then demonstrate how these standardized allele frequencies can be used to construct a test to detect SNPs that deviate strongly from neutral population structure. This test is conceptually related to FST and is shown to be more powerful, as we account for population history. We also extend the model to next-generation sequencing of population pools, a cost-efficient way to estimate population allele frequencies, but one that introduces an additional level of sampling noise. The utility of these methods is demonstrated in simulations and by reanalyzing human SNP data from the Human Genome Diversity Panel populations and pooled next-generation sequencing data from Atlantic herring. An implementation of our method is available from http://gcbias.org.
Affiliation(s)
- Torsten Günther
- Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
- Graham Coop
- Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, California 95616
|
516
|
Lowry DB, Logan TL, Santuari L, Hardtke CS, Richards JH, DeRose-Wilson LJ, McKay JK, Sen S, Juenger TE. Expression quantitative trait locus mapping across water availability environments reveals contrasting associations with genomic features in Arabidopsis. THE PLANT CELL 2013; 25:3266-79. [PMID: 24045022 PMCID: PMC3809531 DOI: 10.1105/tpc.113.115352] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 06/26/2013] [Revised: 08/09/2013] [Accepted: 08/26/2013] [Indexed: 05/18/2023]
Abstract
The regulation of gene expression is crucial for an organism's development and response to stress, and an understanding of the evolution of gene expression is of fundamental importance to basic and applied biology. To improve this understanding, we conducted expression quantitative trait locus (eQTL) mapping in the Tsu-1 (Tsushima, Japan) × Kas-1 (Kashmir, India) recombinant inbred line population of Arabidopsis thaliana across soil drying treatments. We then used genome resequencing data to evaluate whether genomic features (promoter polymorphism, recombination rate, gene length, and gene density) are associated with genes responding to the environment (E) or with genes with genetic variation (G) in gene expression in the form of eQTLs. We identified thousands of genes that responded to soil drying and hundreds of main-effect eQTLs. However, we identified very few statistically significant eQTLs that interacted with the soil drying treatment (GxE eQTL). Analysis of genome resequencing data revealed associations of several genomic features with G and E genes. In general, E genes had lower promoter diversity and local recombination rates. By contrast, genes with eQTLs (G) had significantly greater promoter diversity and were located in genomic regions with higher recombination. These results suggest that genomic architecture may play an important role in the evolution of gene expression.
Affiliation(s)
- David B Lowry
- Department of Integrative Biology and Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712
|
517
|
Abstract
Background: While many bioinformatics tools currently exist for assembling and discovering variants from next-generation sequence data, there are very few tools available for performing evolutionary analyses from these data. Evolutionary and population genomics studies hold great promise for providing valuable insights into natural selection, the effect of mutations on phenotypes, and the origin of species. Thus, there is a need for an extensible and flexible computational tool that can fit into a growing number of evolutionary bioinformatics pipelines. Results: This paper describes the POPBAM software, which is a comprehensive set of computational tools for evolutionary analysis of whole-genome alignments consisting of multiple individuals, from multiple populations or species. POPBAM works directly from BAM-formatted assembly files, calls variant sites, and calculates a variety of commonly used evolutionary sequence statistics. POPBAM is designed primarily to perform analyses in sliding windows across chromosomes or scaffolds. POPBAM accurately measures nucleotide diversity, population divergence, linkage disequilibrium, and the frequency spectrum of mutations from two or more populations. POPBAM can also produce phylogenetic trees of all samples in a BAM file. Finally, I demonstrate that the implementation of POPBAM is both fast and memory-efficient, and can feasibly scale to the analysis of large BAM files with many individuals and populations. Software: The POPBAM program is written in C/C++ and is available from http://dgarriga.github.io/POPBAM. The program has few dependencies and can be built on a variety of Linux platforms. The program is open-source and users are encouraged to participate in the development of this resource.
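As a rough illustration of the sliding-window statistics this abstract mentions, the sketch below computes nucleotide diversity (the mean number of pairwise differences per site) in windows over a toy alignment. It is not POPBAM's implementation: POPBAM reads BAM files and is written in C/C++, whereas this example assumes the samples are already available as equal-length aligned strings, and the function names `nucleotide_diversity` and `sliding_pi` are made up for this example.

```python
from itertools import combinations
from typing import List, Tuple


def nucleotide_diversity(seqs: List[str]) -> float:
    """Mean pairwise differences per site across all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    # Count mismatching sites for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_pi(seqs: List[str], window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Toy alignment of three samples (hypothetical data).
    alignment = ["AAAAAAAA", "AAAATAAA", "AAAATAAC"]
    for start, pi in sliding_pi(alignment, window=4, step=4):
        print(start, round(pi, 4))
```

A real analysis would also need to handle missing data, per-site coverage, and genotype uncertainty, none of which this toy version attempts.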
Affiliation(s)
- Daniel Garrigan
- Department of Biology, University of Rochester, Rochester, New York 14627 USA
|
518
|
Wandelt S, Leser U. FRESCO: Referential compression of highly similar sequences. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:1275-1288. [PMID: 24524158 DOI: 10.1109/tcbb.2013.122] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 06/03/2023]
Abstract
In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios far beyond the state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
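The general idea of referential compression described above (storing only the differences between an input and a known reference) can be sketched as follows. This is a generic greedy scheme written for illustration, not the FRESCO algorithm itself; the k-mer index, the entry format, and the names `build_index`, `compress`, and `decompress` are all assumptions of this example.

```python
from typing import Dict, List, Tuple, Union

# An entry is either a (reference_start, length) match or a single literal character.
Entry = Union[Tuple[int, int], str]


def build_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer of the reference to the positions where it occurs."""
    index: Dict[str, List[int]] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def compress(seq: str, reference: str, k: int = 4) -> List[Entry]:
    """Greedily encode seq as reference matches plus literal characters."""
    index = build_index(reference, k)
    out: List[Entry] = []
    i = 0
    while i < len(seq):
        best_start, best_len = -1, 0
        # Try every reference position that shares the next k-mer and extend it.
        for start in index.get(seq[i:i + k], []):
            length = 0
            while (i + length < len(seq)
                   and start + length < len(reference)
                   and seq[i + length] == reference[start + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len >= k:
            out.append((best_start, best_len))
            i += best_len
        else:
            out.append(seq[i])  # no usable match: store the raw character
            i += 1
    return out


def decompress(entries: List[Entry], reference: str) -> str:
    """Rebuild the original sequence from the entries and the reference."""
    parts = []
    for entry in entries:
        if isinstance(entry, str):
            parts.append(entry)
        else:
            start, length = entry
            parts.append(reference[start:start + length])
    return "".join(parts)


if __name__ == "__main__":
    ref = "ACGTACGTTTGACCA"
    sample = "ACGTACGTATGACCA"  # differs from ref by one substitution
    encoded = compress(sample, ref)
    print(encoded)
    print(decompress(encoded, ref) == sample)
```

When the input is nearly identical to the reference, most of it collapses into a few long matches, which is why the choice of reference matters so much for the achievable ratio.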
|
519
|
Perry G, DiNatale C, Xie W, Navabi A, Reinprecht Y, Crosby W, Yu K, Shi C, Pauls KP. A comparison of the molecular organization of genomic regions associated with resistance to common bacterial blight in two Phaseolus vulgaris genotypes. FRONTIERS IN PLANT SCIENCE 2013; 4:318. [PMID: 24009615 PMCID: PMC3756299 DOI: 10.3389/fpls.2013.00318] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/25/2013] [Accepted: 07/29/2013] [Indexed: 05/28/2023]
Abstract
Resistance to common bacterial blight (CBB), caused by Xanthomonas axonopodis pv. phaseoli, in Phaseolus vulgaris is conditioned by several loci on different chromosomes. Previous studies with OAC-Rex, a CBB-resistant, white bean variety of Mesoamerican origin, identified two resistance loci associated with the molecular markers Pv-CTT001 and SU91, on chromosomes 4 and 8, respectively. Resistance to CBB is assumed to be derived from an interspecific cross with Phaseolus acutifolius in the pedigree of OAC-Rex. Our current whole genome sequencing effort with OAC-Rex provided the opportunity to compare its genome in the regions associated with CBB resistance with the v1.0 release of the P. vulgaris line G19833, which is a large-seeded bean of Andean origin, and (assumed to be) CBB susceptible. In addition, the genomic regions containing SAP6, a marker associated with P. vulgaris-derived CBB-resistance on chromosome 10, were compared. These analyses indicated that gene content was highly conserved between G19833 and OAC-Rex across the regions examined (>80%). However, fifty-nine genes unique to OAC-Rex were identified, with resistance gene homologues making up the largest category (10 genes identified). Two unique genes in OAC-Rex located within the SU91 resistance QTL have homology to P. acutifolius ESTs and may be potential sources of CBB resistance. As the genomic sequence assembly of OAC-Rex is completed, we expect that further comparisons between it and the G19833 genome will lead to a greater understanding of CBB resistance in bean.
Affiliation(s)
- Gregory Perry
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- Claudia DiNatale
- Department of Biological Sciences, University of Windsor, Windsor, ON, Canada
- Weilong Xie
- Agriculture and Agri-Food Canada, c/o Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- Alireza Navabi
- Agriculture and Agri-Food Canada, c/o Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- William Crosby
- Department of Biological Sciences, University of Windsor, Windsor, ON, Canada
- Kangfu Yu
- Greenhouse and Processing Crops Research Centre, Agriculture and Agri-Food Canada, Harrow, ON, Canada
- Chun Shi
- Greenhouse and Processing Crops Research Centre, Agriculture and Agri-Food Canada, Harrow, ON, Canada
- K. Peter Pauls
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
|
520
|
Divergent evolutionary and expression patterns between lineage specific new duplicate genes and their parental paralogs in Arabidopsis thaliana. PLoS One 2013; 8:e72362. [PMID: 24009676 PMCID: PMC3756979 DOI: 10.1371/journal.pone.0072362] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 04/30/2013] [Accepted: 07/11/2013] [Indexed: 12/14/2022] Open
Abstract
Gene duplication is an important mechanism for the origination of functional novelties in organisms. We performed a comparative genome analysis to systematically estimate recent lineage specific gene duplication events in Arabidopsis thaliana and further investigate whether and how these new duplicate genes (NDGs) play a functional role in the evolution and adaptation of A. thaliana. We accomplished this using syntenic relationships among four closely related species, A. thaliana, A. lyrata, Capsella rubella and Brassica rapa. We identified 100 NDGs, showing clear origination patterns, whose parental genes are located in syntenic regions and/or have clear orthologs in at least one of three outgroup species. All 100 NDGs were transcribed and under functional constraints, while 24% of the NDGs have differential expression patterns compared to their parental genes. We explored the underlying evolutionary forces of these paralogous pairs by conducting neutrality tests with sequence divergence and polymorphism data. Evolution of about 15% of NDGs appeared to be driven by natural selection. Moreover, we found that 3 NDGs not only altered their expression patterns when compared with parental genes, but also evolved under positive selection. We investigated the underlying mechanisms driving the differential expression of NDGs and their parents, and found a number of NDGs had different cis-elements and methylation patterns from their parental genes. Overall, we demonstrated that NDGs acquired divergent cis-elements and methylation patterns and may experience sub-functionalization or neo-functionalization influencing the evolution and adaptation of A. thaliana.
|
521
|
An alternative polyadenylation mechanism coopted to the Arabidopsis RPP7 gene through intronic retrotransposon domestication. Proc Natl Acad Sci U S A 2013; 110:E3535-43. [PMID: 23940361 DOI: 10.1073/pnas.1312545110] [Citation(s) in RCA: 117] [Impact Index Per Article: 10.6] [Indexed: 11/18/2022] Open
Abstract
Transposable elements (TEs) can drive evolution by creating genetic and epigenetic variation. Although examples of adaptive TE insertions are accumulating, proof that epigenetic information carried by such "domesticated" TEs has been coopted to control host gene function is still limited. We show that COPIA-R7, a TE inserted into the Arabidopsis thaliana disease resistance gene RPP7, recruited the histone mark H3K9me2 to this locus. H3K9me2 levels at COPIA-R7 affect the choice between two alternative RPP7 polyadenylation sites in the pre-mRNA and, thereby, influence the critical balance between RPP7-coding and non-RPP7-coding transcript isoforms. Function of RPP7 is fully dependent on high levels of H3K9me2 at COPIA-R7. We present a direct in vivo demonstration for cooption of a TE-associated histone mark to the epigenetic control of pre-mRNA processing and establish a unique mechanism for regulation of plant immune surveillance gene expression. Our results functionally link a histone mark to alternative polyadenylation and the balance between distinct transcript isoforms from a single gene.
|
522
|
Gloss AD, Dittrich ACN, Goldman-Huertas B, Whiteman NK. Maintenance of genetic diversity through plant-herbivore interactions. CURRENT OPINION IN PLANT BIOLOGY 2013; 16:443-50. [PMID: 23834766 PMCID: PMC4059408 DOI: 10.1016/j.pbi.2013.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/24/2013] [Revised: 06/01/2013] [Accepted: 06/07/2013] [Indexed: 05/10/2023]
Abstract
Identifying the factors governing the maintenance of genetic variation is a central challenge in evolutionary biology. New genomic data, methods and conceptual advances provide increasing evidence that balancing selection, mediated by antagonistic species interactions, maintains genome-wide functionally important genetic variation within species and natural populations. Because diverse interactions between plants and herbivorous insects dominate terrestrial communities, they provide excellent systems to address this hypothesis. Population genomic studies of Arabidopsis thaliana and its relatives suggest spatial variation in herbivory maintains adaptive genetic variation controlling defense phenotypes, both within and among populations. Conversely, inter-species variation in plant defenses promotes adaptive genetic variation in herbivores. Emerging genomic model herbivores of Arabidopsis could illuminate how genetic variation in herbivores and plants interact simultaneously.
Affiliation(s)
- Andrew D. Gloss
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Anna C. Nelson Dittrich
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Noah K. Whiteman
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
|
523
|
Shirasawa K, Fukuoka H, Matsunaga H, Kobayashi Y, Kobayashi I, Hirakawa H, Isobe S, Tabata S. Genome-wide association studies using single nucleotide polymorphism markers developed by re-sequencing of the genomes of cultivated tomato. DNA Res 2013; 20:593-603. [PMID: 23903436 PMCID: PMC3859326 DOI: 10.1093/dnares/dst033] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 11/16/2022] Open
Abstract
With the aim of understanding the relationship between genetic and phenotypic variations in cultivated tomato, single nucleotide polymorphism (SNP) markers covering the whole genome of cultivated tomato were developed and genome-wide association studies (GWAS) were performed. The whole genomes of six tomato lines were sequenced with the ABI-5500xl SOLiD sequencer. Sequence reads covering ∼13.7× of the genome for each line were obtained, and mapped onto the tomato reference genome (SL2.40) to detect ∼1.5 million SNP candidates. Of the identified SNPs, 1.5% were considered to confer gene functions. In the subsequent Illumina GoldenGate assay for 1536 SNPs, 1293 SNPs were successfully genotyped, and 1248 showed polymorphisms among 663 tomato accessions. The whole-genome linkage disequilibrium (LD) analysis detected highly biased LD decays between euchromatic (58 kb) and heterochromatic regions (13.8 Mb). Subsequent GWAS identified SNPs that were significantly associated with agronomical traits, with SNP loci located near genes that were previously reported as candidates for these traits. This study demonstrates that attractive loci can be identified by performing GWAS with a large number of SNPs obtained from re-sequencing analysis.
Affiliation(s)
- Kenta Shirasawa
- Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
|
524
|
Staats M, Erkens RHJ, van de Vossenberg B, Wieringa JJ, Kraaijeveld K, Stielow B, Geml J, Richardson JE, Bakker FT. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens. PLoS One 2013; 8:e69189. [PMID: 23922691 PMCID: PMC3726723 DOI: 10.1371/journal.pone.0069189] [Citation(s) in RCA: 124] [Impact Index Per Article: 11.3] [Received: 04/11/2013] [Accepted: 06/03/2013] [Indexed: 12/03/2022] Open
Abstract
Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well.
Affiliation(s)
- Martijn Staats
- Biosystematics Group, Wageningen University, Wageningen, The Netherlands
- Roy H. J. Erkens
- Maastricht Science Program, Maastricht University, Maastricht, The Netherlands
- Ecology and Biodiversity Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- Bart van de Vossenberg
- Dutch National Plant Protection Organization, National Reference Centre, Wageningen, The Netherlands
- Jan J. Wieringa
- Biosystematics Group, Wageningen University, Wageningen, The Netherlands
- Netherlands Centre for Biodiversity Naturalis (section NHN), Herbarium Vadense (WAG), Wageningen University, Wageningen, The Netherlands
- Ken Kraaijeveld
- Department of Human Genetics/Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
- Benjamin Stielow
- Centraalbureau voor Schimmelcultures Fungal Biodiversity Centre (CBS-KNAW), Utrecht, The Netherlands
- József Geml
- Naturalis Biodiversity Center, Section National Herbarium of the Netherlands, Leiden, The Netherlands
- James E. Richardson
- Royal Botanic Garden Edinburgh, Inverleith Row, Edinburgh, United Kingdom
- Laboratorio de Botánica y Sistemática, Universidad de Los Andes, Apartado Aéreo 4976, Bogotá, Colombia
- Freek T. Bakker
- Biosystematics Group, Wageningen University, Wageningen, The Netherlands
|
525
|
Wahler D, Schauser L, Bendiek J, Grohmann L. Next-Generation Sequencing as a Tool for Detailed Molecular Characterisation of Genomic Insertions and Flanking Regions in Genetically Modified Plants: a Pilot Study Using a Rice Event Unauthorised in the EU. FOOD ANAL METHOD 2013. [DOI: 10.1007/s12161-013-9673-x] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 11/24/2022]
|
526
|
Paritosh K, Yadava SK, Gupta V, Panjabi-Massand P, Sodhi YS, Pradhan AK, Pental D. RNA-seq based SNPs in some agronomically important oleiferous lines of Brassica rapa and their use for genome-wide linkage mapping and specific-region fine mapping. BMC Genomics 2013; 14:463. [PMID: 23837684 PMCID: PMC3711843 DOI: 10.1186/1471-2164-14-463] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/30/2012] [Accepted: 07/01/2013] [Indexed: 01/01/2023] Open
Abstract
Background: Brassica rapa (AA) contains very diverse forms, which include oleiferous types and many vegetable types. The genome sequence of B. rapa line Chiifu (ssp. pekinensis), a leafy vegetable type, was published in 2011. Using this knowledge, it is important to develop genomic resources for the oleiferous types of B. rapa. This will allow more involved molecular mapping, in-depth study of molecular mechanisms underlying important agronomic traits and introgression of traits from B. rapa to the major oilseed crops B. juncea (AABB) and B. napus (AACC). The study explores the availability of SNPs in RNA-seq generated contigs of three oleiferous lines of B. rapa, namely Candle (ssp. oleifera, turnip rape), YSPB-24 and Tetra (ssp. trilocularis, Yellow sarson), and their use in genome-wide linkage mapping and specific-region fine mapping using a RIL population between Chiifu and Tetra. Results: RNA-seq was carried out on the RNA isolated from young inflorescences containing unopened floral buds, floral axis and small leaves, using Illumina paired-end sequencing technology. Sequence assembly was carried out using the Velvet de novo programme and the assembled contigs were organised against Chiifu gene models, available in the BRAD-CDS database. RNA-seq confirmed the presence of more than 17,000 single-copy gene models described in the BRAD database. The assembled contigs and the BRAD gene models were analyzed for the presence of SSRs and SNPs. While the number of SSRs was limited, more than 0.2 million SNPs were observed between Chiifu and the three oleiferous lines. Assays for SNPs were designed using KASPar technology and tested on an F7-RIL population derived from a Chiifu x Tetra cross. The design of the SNP assays was based on three considerations: the 50 bp flanking region of the SNPs should be strictly similar, the SNP should have a read-depth of ≥7 and no exon/intron junction should be present within the 101 bp target region. Using these criteria, a total of 640 markers (580 for genome-wide mapping and 60 for specific-region mapping) marking as many genes were tested for mapping. Out of 640 markers that were tested, 594 markers could be mapped unambiguously, which included 542 markers for genome-wide mapping and 42 markers for fine mapping of the tet-o locus that is involved with the trait tetralocular ovary in the line Tetra. Conclusion: A large number of SNPs and PSVs are present in the transcriptome of B. rapa lines for genome-wide linkage mapping and specific-region fine mapping. Criteria used for SNP identification delivered markers, more than 93% of which could be successfully mapped to the F7-RIL population of the Chiifu x Tetra cross.
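The three assay-design considerations stated in this abstract (identical 50 bp flanks, read depth of at least 7, and no exon/intron junction inside the 101 bp target region) can be expressed as a simple filter. The sketch below is one possible reading of those criteria and not the authors' pipeline: the `SnpCandidate` record, its field names, and the coordinate convention for junctions are hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import List

FLANK = 50       # bases required on each side of the SNP (from the abstract)
MIN_DEPTH = 7    # minimum read depth at the SNP (from the abstract)


@dataclass
class SnpCandidate:
    """Hypothetical record for one candidate SNP between two lines."""
    position: int                 # 0-based SNP position in both sequences
    read_depth: int               # read depth at the SNP
    seq_a: str                    # sequence from the first line
    seq_b: str                    # sequence from the second line
    junctions: List[int] = field(default_factory=list)  # exon/intron junction positions


def passes_design_criteria(snp: SnpCandidate) -> bool:
    """Apply the three criteria to the 101 bp window centred on the SNP."""
    lo, hi = snp.position - FLANK, snp.position + FLANK
    # The full 101 bp target region must exist in both sequences.
    if lo < 0 or hi >= min(len(snp.seq_a), len(snp.seq_b)):
        return False
    # Criterion 1: both 50 bp flanks must be identical between the lines.
    left_same = snp.seq_a[lo:snp.position] == snp.seq_b[lo:snp.position]
    right_same = snp.seq_a[snp.position + 1:hi + 1] == snp.seq_b[snp.position + 1:hi + 1]
    if not (left_same and right_same):
        return False
    # Criterion 2: sufficient read depth.
    if snp.read_depth < MIN_DEPTH:
        return False
    # Criterion 3: no exon/intron junction inside the target region.
    return not any(lo <= j <= hi for j in snp.junctions)


if __name__ == "__main__":
    a = "A" * 50 + "C" + "G" * 50
    b = "A" * 50 + "T" + "G" * 50
    print(passes_design_criteria(SnpCandidate(50, 9, a, b)))
```

Here "strictly similar" is interpreted as exact identity of the flanks; a looser interpretation would need a different comparison.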
|
527
|
Iovene M, Zhang T, Lou Q, Buell CR, Jiang J. Copy number variation in potato - an asexually propagated autotetraploid species. Plant J 2013; 75:80-89. [PMID: 23573982 DOI: 10.1111/tpj.12200] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 01/05/2013] [Revised: 03/29/2013] [Accepted: 04/07/2013] [Indexed: 05/23/2023]
Abstract
Copy number variation (CNV) has been revealed as a significant contributor to the genetic variation in humans. Although CNV has been reported in several model animal and plant species, the presence of CNV and its biological impact in polyploid species has not yet been documented. We conducted a fluorescence in situ hybridization (FISH)-based CNV survey in potato, a vegetatively propagated autotetraploid species (2n = 4x = 48). We conducted FISH analysis using 18 randomly selected potato bacterial artificial chromosome (BAC) clones in a set of 16 potato cultivars with diverse breeding backgrounds. Six BACs (33%) with insert sizes of 137-145 kb were found to be associated with large CNV events detectable at the cytological level. We demonstrate that the large CNVs associated with two specific BACs (RH102I10 and RH83C08) were widespread among potato cultivars developed in North America and Europe. We measured the transcript abundance of four genes associated with the CNV spanned by BAC RH102I10. All four genes displayed a dosage effect in transcription. Although potato is vegetatively propagated, we observed that female gametes lacking the RH102I10-associated CNV were inferior to those with at least one copy of this CNV, indicating that the RH102I10-associated CNV can impact on the growth and development of the potato plants. Our results show that CNV is highly abundant in the potato genome and may play a significant role in genetic variation of this important food crop.
Affiliation(s)
- Marina Iovene
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- CNR-Institute of Plant Genetics, Bari, 70126, Italy
- Tao Zhang
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Qunfeng Lou
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Jiangsu, Nanjing, 210095, People's Republic of China
- C Robin Buell
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Jiming Jiang
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
|
528
|
Haudry A, Platts AE, Vello E, Hoen DR, Leclercq M, Williamson RJ, Forczek E, Joly-Lopez Z, Steffen JG, Hazzouri KM, Dewar K, Stinchcombe JR, Schoen DJ, Wang X, Schmutz J, Town CD, Edger PP, Pires JC, Schumaker KS, Jarvis DE, Mandáková T, Lysak MA, van den Bergh E, Schranz ME, Harrison PM, Moses AM, Bureau TE, Wright SI, Blanchette M. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat Genet 2013; 45:891-8. [PMID: 23817568 DOI: 10.1038/ng.2684] [Citation(s) in RCA: 220] [Impact Index Per Article: 20.0] [Received: 10/13/2012] [Accepted: 06/04/2013] [Indexed: 12/17/2022]
Abstract
Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species.
Affiliation(s)
- Annabelle Haudry
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
529
|
Long Q, Rabanal FA, Meng D, Huber CD, Farlow A, Platzer A, Zhang Q, Vilhjálmsson BJ, Korte A, Nizhynska V, Voronin V, Korte P, Sedman L, Mandáková T, Lysak MA, Seren Ü, Hellmann I, Nordborg M. Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden. Nat Genet 2013; 45:884-890. [PMID: 23793030 DOI: 10.1038/ng.2678] [Citation(s) in RCA: 264] [Impact Index Per Article: 24.0] [Received: 11/19/2012] [Accepted: 05/31/2013] [Indexed: 12/16/2022]
Abstract
Despite advances in sequencing, the goal of obtaining a comprehensive view of genetic variation in populations is still far from reached. We sequenced 180 lines of A. thaliana from Sweden to obtain as complete a picture as possible of variation in a single region. Whereas simple polymorphisms in the unique portion of the genome are readily identified, other polymorphisms are not. The massive variation in genome size identified by flow cytometry seems largely to be due to 45S rDNA copy number variation, with lines from northern Sweden having particularly large numbers of copies. Strong selection is evident in the form of long-range linkage disequilibrium (LD), as well as in LD between nearby compensatory mutations. Many footprints of selective sweeps were found in lines from northern Sweden, and a massive global sweep was shown to have involved a 700-kb transposition.
Affiliation(s)
- Quan Long
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Dazhe Meng
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA
- Ashley Farlow
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Alexander Platzer
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Qingrun Zhang
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Bjarni J Vilhjálmsson
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA
- Arthur Korte
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Viktor Voronin
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Pamela Korte
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Laura Sedman
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Terezie Mandáková
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Martin A Lysak
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Ümit Seren
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Ines Hellmann
- Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
- Magnus Nordborg
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria
- Molecular and Computational Biology, University of Southern California, Los Angeles, California, USA
|
530
|
Paape T, Bataillon T, Zhou P, Kono TJY, Briskine R, Young ND, Tiffin P. Selection, genome-wide fitness effects and evolutionary rates in the model legume Medicago truncatula. Mol Ecol 2013; 22:3525-38. [PMID: 23773281 DOI: 10.1111/mec.12329] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 11/21/2012] [Revised: 02/22/2013] [Accepted: 03/12/2013] [Indexed: 12/15/2022]
Abstract
Sequence data for >20 000 annotated genes from 56 accessions of Medicago truncatula were used to identify potential targets of positive selection, the determinants of evolutionary rate variation and the relative importance of positive and purifying selection in shaping nucleotide diversity. Based upon patterns of intraspecific diversity and interspecific divergence, c. 50-75% of nonsynonymous polymorphisms are subject to strong purifying selection and 1% of the sampled genes harbour a signature of positive selection. Combining polymorphism with expression data, we estimated the distribution of fitness effects and found that the proportion of deleterious mutations is significantly greater for expressed genes than for genes with undetected transcripts (nonexpressed) in a previous RNA-seq experiment and greater for broadly expressed genes than those expressed in only a single tissue. Expression level is the strongest correlate of evolutionary rates at nonsynonymous sites, and despite multiple genomic features being significantly correlated with evolutionary rates, they explain less than 20% of the variation in nonsynonymous rates (dN) and <15% of the variation in either synonymous rates (dS) or dN:dS. Among putative targets of selection were genes involved in defence against pathogens and herbivores, genes with roles in mediating the relationship with rhizobial symbionts and one-third of annotated histone-lysine methyltransferases. Adaptive evolution of the methyltransferases suggests that positive selection in gene expression may have occurred through evolution of enzymes involved in epigenetic modification.
Affiliation(s)
- Timothy Paape
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, 8057, Switzerland
|
531
|
Muñoz-Amatriaín M, Eichten SR, Wicker T, Richmond TA, Mascher M, Steuernagel B, Scholz U, Ariyadasa R, Spannagl M, Nussbaumer T, Mayer KFX, Taudien S, Platzer M, Jeddeloh JA, Springer NM, Muehlbauer GJ, Stein N. Distribution, functional impact, and origin mechanisms of copy number variation in the barley genome. Genome Biol 2013; 14:R58. [PMID: 23758725 PMCID: PMC3706897 DOI: 10.1186/gb-2013-14-6-r58] [Citation(s) in RCA: 93] [Impact Index Per Article: 8.5] [Received: 02/12/2013] [Accepted: 06/12/2013] [Indexed: 12/20/2022] Open
Abstract
Background: There is growing evidence for the prevalence of copy number variation (CNV) and its role in phenotypic variation in many eukaryotic species. Here we use array comparative genomic hybridization to explore the extent of this type of structural variation in domesticated barley cultivars and wild barleys. Results: A collection of 14 barley genotypes including eight cultivars and six wild barleys were used for comparative genomic hybridization. CNV affects 14.9% of all the sequences that were assessed. Higher levels of CNV diversity are present in the wild accessions relative to cultivated barley. CNVs are enriched near the ends of all chromosomes except 4H, which exhibits the lowest frequency of CNVs. CNV affects 9.5% of the coding sequences represented on the array and the genes affected by CNV are enriched for sequences annotated as disease-resistance proteins and protein kinases. Sequence-based comparisons of CNV between cultivars Barke and Morex provided evidence that DNA repair mechanisms of double-strand breaks via single-stranded annealing and synthesis-dependent strand annealing play an important role in the origin of CNV in barley. Conclusions: We present the first catalog of CNVs in a diploid Triticeae species, which opens the door for future genome diversity research in a tribe that comprises the economically important cereal species wheat, barley, and rye. Our findings constitute a valuable resource for the identification of CNV affecting genes of agronomic importance. We also identify potential mechanisms that can generate variation in copy number in plant genomes.
|
532
|
Sousa V, Hey J. Understanding the origin of species with genome-scale data: modelling gene flow. Nat Rev Genet 2013; 14:404-14. [PMID: 23657479 DOI: 10.1038/nrg3446] [Citation(s) in RCA: 181] [Impact Index Per Article: 16.5] [Indexed: 12/16/2022]
Abstract
As it becomes easier to sequence multiple genomes from closely related species, evolutionary biologists working on speciation are struggling to get the most out of very large population genomic data sets. Such data hold the potential to resolve long-standing questions in evolutionary biology about the role of gene exchange in species formation. In principle, the new population genomic data can be used to disentangle the conflicting roles of natural selection and gene flow during the divergence process. However, there are great challenges in taking full advantage of such data, especially with regard to including recombination in genetic models of the divergence process. Current data, models, methods and the potential pitfalls in using them will be considered here.
Affiliation(s)
- Vitor Sousa
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, New Jersey 08854, USA
|
533
|
Tisné S, Serrand Y, Bach L, Gilbault E, Ben Ameur R, Balasse H, Voisin R, Bouchez D, Durand-Tardif M, Guerche P, Chareyron G, Da Rugna J, Camilleri C, Loudet O. Phenoscope: an automated large-scale phenotyping platform offering high spatial homogeneity. Plant J 2013; 74:534-44. [PMID: 23452317 DOI: 10.1111/tpj.12131] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 12/03/2012] [Revised: 01/24/2013] [Accepted: 01/25/2013] [Indexed: 05/20/2023]
Abstract
Increased phenotyping accuracy and throughput are necessary to improve our understanding of quantitative variation and to be able to deconstruct complex traits such as those involved in growth responses to the environment. Still, only a few facilities are known to handle individual plants of small stature for non-destructive, real-time phenotype acquisition from plants grown in precisely adjusted and variable experimental conditions. Here, we describe Phenoscope, a high-throughput phenotyping platform that has the unique feature of continuously rotating 735 individual pots over a table. It automatically adjusts watering and is equipped with a zenithal imaging system to monitor rosette size and expansion rate during the vegetative stage, with automatic image analysis allowing manual correction. When applied to Arabidopsis thaliana, we show that rotating the pots strongly reduced micro-environmental disparity: heterogeneity in evaporation was cut by a factor of 2.5 and the number of replicates needed to detect a specific mild genotypic effect was reduced by a factor of 3. In addition, by controlling a large proportion of the micro-environmental variance, other tangible sources of variance become noticeable. Overall, Phenoscope makes it possible to perform large-scale experiments that would not be possible or reproducible by hand. When applied to a typical quantitative trait loci (QTL) mapping experiment, we show that mapping power is more limited by genetic complexity than phenotyping accuracy. This will help to draw a more general picture as to how genetic diversity shapes phenotypic variation.
Affiliation(s)
- Sébastien Tisné
- INRA-Institut National de la Recherche Agronomique, UMR 1318, Institut Jean-Pierre Bourgin, RD10, F-78000, Versailles, France
|
534
|
Gobron N, Waszczak C, Simon M, Hiard S, Boivin S, Charif D, Ducamp A, Wenes E, Budar F. A cryptic cytoplasmic male sterility unveils a possible gynodioecious past for Arabidopsis thaliana. PLoS One 2013; 8:e62450. [PMID: 23658632 PMCID: PMC3639211 DOI: 10.1371/journal.pone.0062450] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 01/25/2013] [Accepted: 03/21/2013] [Indexed: 01/25/2023] Open
Abstract
Gynodioecy, the coexistence of hermaphrodites and females (i.e. male-sterile plants) in natural plant populations, most often results from polymorphism at genetic loci involved in a particular interaction between the nuclear and cytoplasmic genetic compartments (cytonuclear epistasis): cytoplasmic male sterility (CMS). Although CMS clearly contributes to the coevolution of involved nuclear loci and cytoplasmic genomes in gynodioecious species, the occurrence of CMS genetic factors in the absence of sexual polymorphism (cryptic CMS) is not easily detected and rarely taken into consideration. We found cryptic CMS in the model plant Arabidopsis thaliana after crossing distantly related accessions, Sha and Mr-0. Male sterility resulted from an interaction between the Sha cytoplasm and two Mr-0 genomic regions located on chromosome 1 and chromosome 3. Additional accessions with either nuclear sterility maintainers or sterilizing cytoplasms were identified from crosses with either Sha or Mr-0. By comparing two very closely related cytoplasms with different male-sterility inducing abilities, we identified a novel mitochondrial ORF, named orf117Sha, that is most likely the sterilizing factor of the Sha cytoplasm. The presence of orf117Sha was investigated in worldwide natural accessions. It was found mainly associated with a single chlorotype in accessions belonging to a clade predominantly originating from Central Asia. More than one-third of accessions from this clade carried orf117Sha, indicating that the sterility-inducing cytoplasm had spread in this lineage. We also report the coexistence of the sterilizing cytoplasm with a non-sterilizing cytoplasm at a small, local scale in a natural population; in addition, a correlation between cytotype and nuclear haplotype was detected in this population. Our results suggest that this CMS system induced sexual polymorphism in A. thaliana populations at the time when the species was mainly outcrossing.
Affiliation(s)
- Nicolas Gobron
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Cezary Waszczak
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Matthieu Simon
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Sophie Hiard
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Stéphane Boivin
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Delphine Charif
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Aloïse Ducamp
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Estelle Wenes
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- Françoise Budar
- INRA Institut National de la Recherche Agronomique, UMR1318, IJPB Institut Jean-Pierre Bourgin, Versailles, France
- AgroParisTech, IJPB Institut Jean-Pierre Bourgin, Versailles, France
|
535
|
Lucas Lledó JI, Cáceres M. On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing. PLoS One 2013; 8:e61292. [PMID: 23637806 PMCID: PMC3634047 DOI: 10.1371/journal.pone.0061292] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 12/21/2012] [Accepted: 03/07/2013] [Indexed: 12/15/2022] Open
Abstract
One of the most used techniques to study structural variation at a genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base-qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is very appropriate to detect a wide range of inversions, even with low coverage data. However, ≥% of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increments of the coverage depth or the read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects.
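The basic signal that paired-end mapping uses for inversions can be sketched in a few lines. In a standard inward-facing paired-end library, a concordant pair maps with the leftmost read on the forward strand and its mate on the reverse strand, within the expected library span; a pair whose two reads map to the same strand is the usual hint of an inversion breakpoint. The function below is a generic illustration of that rule only. It is not the simulation framework of this study, nor the logic of SVDetect, GRIAL, or VariationHunter, and the function name, arguments, and category labels are hypothetical.

```python
def classify_pair(pos1, strand1, pos2, strand2, min_span, max_span):
    """Classify one mapped read pair from an inward-facing (forward/reverse) library.

    pos1, pos2       -- leftmost mapping coordinates of the two reads
    strand1, strand2 -- "+" or "-"
    min_span, max_span -- accepted distance range between the two reads
    """
    # Both reads on the same strand: the classic inversion signature.
    if strand1 == strand2:
        return "inversion_candidate"
    # Order the reads by coordinate so orientation can be checked left to right.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )
    span = right_pos - left_pos
    if left_strand == "+" and right_strand == "-" and min_span <= span <= max_span:
        return "concordant"
    # Wrong span or outward-facing reads: some other kind of discordance.
    return "other_discordant"
```

In practice a caller would require several independent same-strand pairs clustering around one locus before reporting an inversion, which is one reason library fragment length matters for detectability.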
Affiliation(s)
- José Ignacio Lucas Lledó
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Mario Cáceres
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain
|
536
|
Pavlopoulos GA, Kumar P, Sifrim A, Sakai R, Lin ML, Voet T, Moreau Y, Aerts J. Meander: visually exploring the structural variome using space-filling curves. Nucleic Acids Res 2013; 41:e118. [PMID: 23605045 PMCID: PMC3675473 DOI: 10.1093/nar/gkt254] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023] Open
Abstract
The introduction of next generation sequencing methods in genome studies has made it possible to shift research from a gene-centric approach to a genome wide view. Although methods and tools to detect single nucleotide polymorphisms are becoming more mature, methods to identify and visualize structural variation (SV) are still in their infancy. Most genome browsers can only compare a given sequence to a reference genome; therefore, direct comparison of multiple individuals still remains a challenge. Therefore, the implementation of efficient approaches to explore and visualize SVs and directly compare two or more individuals is desirable. In this article, we present a visualization approach that uses space-filling Hilbert curves to explore SVs based on both read-depth and pair-end information. An interactive open-source Java application, called Meander, implements the proposed methodology, and its functionality is demonstrated using two cases. With Meander, users can explore variations at different levels of resolution and simultaneously compare up to four different individuals against a common reference. The application was developed using Java version 1.6 and Processing.org and can be run on any platform. It can be found at http://homes.esat.kuleuven.be/~bioiuser/meander.
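The idea of laying a linear genome out along a Hilbert curve can be shown with the standard index-to-coordinate conversion. The sketch below is a generic Python illustration and not Meander's code (Meander itself is a Java application): the function names are hypothetical, and it assumes the genome has already been cut into equal-sized bins with one value per bin, for example a read-depth summary.

```python
def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two and 0 <= d < side * side.
    """
    if side < 1 or side & (side - 1):
        raise ValueError("side must be a power of two")
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_grid(bin_values, side):
    """Place consecutive genome-bin values on a 2D grid in Hilbert order."""
    grid = [[None] * side for _ in range(side)]
    for d, value in enumerate(bin_values[: side * side]):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = value
    return grid
```

Because consecutive indices always land in adjacent cells, bins that are neighbours on the chromosome stay neighbours in the image, which is what lets a localized read-depth change show up as a compact patch rather than a scattered pattern.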
Affiliation(s)
- Georgios A Pavlopoulos
- Department of Electrical Engineering (ESAT/SCD), University of Leuven, Kasteelpark Arenberg 10, Box 2446, 3001 Leuven, Belgium.
|
537
|
Chen G, Wang C, Shi L, Tong W, Qu X, Chen J, Yang J, Shi C, Chen L, Zhou P, Lu B, Shi T. Comprehensively identifying and characterizing the missing gene sequences in human reference genome with integrated analytic approaches. Hum Genet 2013; 132:899-911. [PMID: 23572138 DOI: 10.1007/s00439-013-1300-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 08/19/2012] [Accepted: 03/25/2013] [Indexed: 11/25/2022]
Abstract
The human reference genome is still incomplete and a number of gene sequences are missing from it. The approaches to uncover them, the reasons for their absence and their functions remain little explored. Here, we comprehensively identified and characterized the missing genes of the human reference genome with RNA-Seq data from 16 different human tissues. By using a combined approach of genome-guided transcriptome reconstruction coupled with genome-wide comparison, we uncovered 3.78 and 2.37 Mb transcribed regions in the human genome assemblies of Celera and HuRef either missed from their homologous chromosomes of NCBI human reference genome build 37.2 or partially or entirely absent from the reference. We further identified a significant number of novel transcript contigs in each tissue from de novo transcriptome assembly that are unalignable to NCBI build 37.2 but can be aligned to at least one of the genomes from Celera, HuRef, chimpanzee, macaque or mouse. Our analyses indicate that the missing genes could result from genome misassembly, transposition, copy number variation, translocation and other structural variations. Moreover, our results further suggest that a large portion of these missing genes are conserved between human and other mammals, implying their important biological functions. In total, 1,233 functional protein domains were detected in these missing genes. Collectively, our study not only provides approaches for uncovering the missing genes of a genome, but also proposes potential reasons why genes are missing from the genome and highlights the importance of uncovering the missing genes of incomplete genomes.
Affiliation(s)
- Geng Chen
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, China
|
538
|
Challis RJ, Hepworth J, Mouchel C, Waites R, Leyser O. A role for more axillary growth1 (MAX1) in evolutionary diversity in strigolactone signaling upstream of MAX2. Plant Physiol 2013; 161:1885-902. [PMID: 23424248 PMCID: PMC3613463 DOI: 10.1104/pp.112.211383] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 05/08/2023]
Abstract
Strigolactones (SLs) are carotenoid-derived phytohormones with diverse roles. They are secreted from roots as attractants for arbuscular mycorrhizal fungi and have a wide range of endogenous functions, such as regulation of root and shoot system architecture. To date, six genes associated with SL synthesis and signaling have been molecularly identified using the shoot-branching mutants more axillary growth (max) of Arabidopsis (Arabidopsis thaliana) and dwarf (d) of rice (Oryza sativa). Here, we present a phylogenetic analysis of the MAX/D genes to clarify the relationships of each gene with its wider family and to allow the correlation of events in the evolution of the genes with the evolution of SL function. Our analysis suggests that the notion of a distinct SL pathway is inappropriate. Instead, there may be a diversity of SL-like compounds, the response to which requires a D14/D14-like protein. This ancestral system could have been refined toward distinct ligand-specific pathways channeled through MAX2, the most downstream known component of SL signaling. MAX2 is tightly conserved among land plants and is more diverged from its nearest sister clade than any other SL-related gene, suggesting a pivotal role in the evolution of SL signaling. By contrast, the evidence suggests much greater flexibility upstream of MAX2. The MAX1 gene is a particularly strong candidate for contributing to diversification of inputs upstream of MAX2. Our functional analysis of the MAX1 family demonstrates the early origin of its catalytic function and both redundancy and functional diversification associated with its duplication in angiosperm lineages.
|
539
|
Verde I, Abbott AG, Scalabrin S, Jung S, Shu S, Marroni F, Zhebentyayeva T, Dettori MT, Grimwood J, Cattonaro F, Zuccolo A, Rossini L, Jenkins J, Vendramin E, Meisel LA, Decroocq V, Sosinski B, Prochnik S, Mitros T, Policriti A, Cipriani G, Dondini L, Ficklin S, Goodstein DM, Xuan P, Del Fabbro C, Aramini V, Copetti D, Gonzalez S, Horner DS, Falchi R, Lucas S, Mica E, Maldonado J, Lazzari B, Bielenberg D, Pirona R, Miculan M, Barakat A, Testolin R, Stella A, Tartarini S, Tonutti P, Arús P, Orellana A, Wells C, Main D, Vizzotto G, Silva H, Salamini F, Schmutz J, Morgante M, Rokhsar DS. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet 2013; 45:487-94. [PMID: 23525075 DOI: 10.1038/ng.2586] [Citation(s) in RCA: 587] [Impact Index Per Article: 53.4] [Received: 06/12/2012] [Accepted: 02/22/2013] [Indexed: 11/09/2022]
Abstract
Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.
Affiliation(s)
- Consiglio per la Ricerca e la Sperimentazione in Agricoltura (CRA)-Centro di Ricerca per la Frutticoltura, Rome, Italy.
|
540
|
Genomic signatures of selection at linked sites: unifying the disparity among species. Nat Rev Genet 2013; 14:262-74. [PMID: 23478346 DOI: 10.1038/nrg3425] [Citation(s) in RCA: 315] [Impact Index Per Article: 28.6] [Indexed: 12/17/2022]
Abstract
Population genetics theory supplies powerful predictions about how natural selection interacts with genetic linkage to sculpt the genomic landscape of nucleotide polymorphism. Both the spread of beneficial mutations and the removal of deleterious mutations act to depress polymorphism levels, especially in low-recombination regions. However, empiricists have documented extreme disparities among species. Here we characterize the dominant features that could drive differences in linked selection among species (including roles for selective sweeps being 'hard' or 'soft') and the concealing effects of demography and confounding genomic variables. We advocate targeted studies of closely related species to unify our understanding of how selection and linkage interact to shape genome evolution.
|
541
|
Hirakawa H, Shirasawa K, Ohyama A, Fukuoka H, Aoki K, Rothan C, Sato S, Isobe S, Tabata S. Genome-wide SNP genotyping to infer the effects on gene functions in tomato. DNA Res 2013; 20:221-33. [PMID: 23482505 PMCID: PMC3686429 DOI: 10.1093/dnares/dst005] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022] Open
Abstract
The genotype data of 7054 single nucleotide polymorphism (SNP) loci in 40 tomato lines, including inbred lines, F1 hybrids, and wild relatives, were collected using Illumina's Infinium and GoldenGate assay platforms, the latter of which was utilized in our previous study. The dendrogram based on the genotype data corresponded well to the breeding types of tomato and wild relatives. The SNPs were classified into six categories according to their positions in the genes predicted on the tomato genome sequence. The genes with SNPs were annotated by homology searches against the nucleotide and protein databases, as well as by domain searches, and they were classified into the functional categories defined by the NCBI's eukaryotic orthologous groups (KOG). To infer the SNPs' effects on the gene functions, the three-dimensional structures of the 843 proteins that were encoded by the genes with SNPs causing missense mutations were constructed by homology modelling, and 200 of these proteins were considered to carry non-synonymous amino acid substitutions in the predicted functional sites. The SNP information obtained in this study is available at the Kazusa Tomato Genomics Database (http://plant1.kazusa.or.jp/tomato/).
Affiliation(s)
- Hideki Hirakawa
- Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan.
|
542
|
Intraspecific sequence variation and differential expression in starch synthase genes of Arabidopsis thaliana. BMC Res Notes 2013; 6:84. [PMID: 23497496 PMCID: PMC3608163 DOI: 10.1186/1756-0500-6-84] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Accepted: 02/28/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Natural accessions of Arabidopsis thaliana are a well-known system to measure levels of intraspecific genetic variation. Leaf starch content correlates negatively with biomass. Starch is synthesized by the coordinated action of many (iso)enzymes. Quantitatively dominant is the repetitive transfer of glucosyl residues to the non-reducing ends of α-glucans as mediated by starch synthases. In the genome of A. thaliana, there are five classes of starch synthases, designated as soluble starch synthases (SSI, SSII, SSIII, and SSIV) and granule-bound synthase (GBSS). Each class is represented by a single gene. The five genes are homologous in functional domains due to their common origin, but have evolved individual features as well. Here, we analyze the extent of genetic variation in these fundamental protein classes as well as possible functional implications on transcript and protein levels. FINDINGS Intraspecific sequence variation of the five starch synthases was determined by sequencing the entire loci including promoter regions from 30 worldwide distributed accessions of A. thaliana. In all genes, a considerable number of nucleotide polymorphisms was observed, both in non-coding and coding regions, and several amino acid substitutions were identified in functional domains. Furthermore, promoters possess numerous polymorphisms in potentially regulatory cis-acting regions. By realtime experiments performed with selected accessions, we demonstrate that DNA sequence divergence correlates with significant differences in transcript levels. CONCLUSIONS Except for AtSSII, all starch synthase classes clustered into two or three groups of haplotypes, respectively. Significant difference in transcript levels among haplotype clusters in AtSSIV provides evidence for cis-regulation. By contrast, no such correlation was found for AtSSI, AtSSII, AtSSIII, and AtGBSS, suggesting trans-regulation. 
The expression data presented here point to a regulation by common trans-regulatory transcription factors which ensures a coordinated action of the products of these four genes during starch granule biosynthesis. The apparent cis-regulation of AtSSIV might be related to its role in the initiation of de novo biosynthesis of granules.
|
543
|
Abstract
Natural epigenetic variation provides a source for the generation of phenotypic diversity, but to understand its contribution to phenotypic diversity, its interaction with genetic variation requires further investigation. Here, we report population-wide DNA sequencing of genomes, transcriptomes, and methylomes of wild Arabidopsis thaliana accessions. Single cytosine methylation polymorphisms are unlinked to genotype. However, the rate of linkage disequilibrium decay amongst differentially methylated regions targeted by RNA-directed DNA methylation is similar to the rate for single nucleotide polymorphisms. Association analyses of these RNA-directed DNA methylation regions with genetic variants identified thousands of methylQTL, which revealed the first population estimate of genetically dependent methylation variation. Analysis of invariably methylated transposons and genes across this population indicates that loci targeted by RNA-directed DNA methylation are epigenetically activated in pollen and seeds, which facilitates proper development of these structures.
|
544
|
Grimm D, Hagmann J, Koenig D, Weigel D, Borgwardt K. Accurate indel prediction using paired-end short reads. BMC Genomics 2013; 14:132. [PMID: 23442375 PMCID: PMC3614465 DOI: 10.1186/1471-2164-14-132] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 10/22/2012] [Accepted: 02/06/2013] [Indexed: 11/12/2022] Open
Abstract
Background One of the major open challenges in next generation sequencing (NGS) is the accurate identification of structural variants such as insertions and deletions (indels). Current methods for indel calling assign scores to different types of evidence or counter-evidence for the presence of an indel, such as the number of split read alignments spanning the boundaries of a deletion candidate or reads that map within a putative deletion. Candidates with a score above a manually defined threshold are then predicted to be true indels. As a consequence, structural variants detected in this manner contain many false positives. Results Here, we present a machine learning based method which is able to discover and distinguish true from false indel candidates in order to reduce the false positive rate. Our method identifies indel candidates using a discriminative classifier based on features of split read alignment profiles and trained on true and false indel candidates that were validated by Sanger sequencing. We demonstrate the usefulness of our method with paired-end Illumina reads from 80 genomes of the first phase of the 1001 Genomes Project (http://www.1001genomes.org) in Arabidopsis thaliana. Conclusion In this work we show that indel classification is a necessary step to reduce the number of false positive candidates. We demonstrate that missing classification may lead to spurious biological interpretations. The software is available at: http://agkb.is.tuebingen.mpg.de/Forschung/SV-M/.
Affiliation(s)
- Dominik Grimm
- Machine Learning and Computational Biology Research Group, Max Planck Institute for Developmental Biology and Max Planck Institute for Intelligent Systems, Tübingen, Germany.
|
545
|
Abraham MC, Metheetrairut C, Irish VF. Natural variation identifies multiple loci controlling petal shape and size in Arabidopsis thaliana. PLoS One 2013; 8:e56743. [PMID: 23418598 PMCID: PMC3572026 DOI: 10.1371/journal.pone.0056743] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 10/31/2012] [Accepted: 01/14/2013] [Indexed: 12/14/2022] Open
Abstract
Natural variation in organ morphologies can have adaptive significance and contribute to speciation. However, the underlying allelic differences responsible for variation in organ size and shape remain poorly understood. We have utilized natural phenotypic variation in three Arabidopsis thaliana ecotypes to examine the genetic basis for quantitative variation in petal length, width, area, and shape. We identified 23 loci responsible for such variation, many of which appear to correspond to genes not previously implicated in controlling organ morphology. These analyses also demonstrated that allelic differences at distinct loci can independently affect petal length, width, area or shape, suggesting that these traits behave as independent modules. We also showed that ERECTA (ER), encoding a leucine-rich repeat (LRR) receptor-like serine-threonine kinase, is a major effect locus determining petal shape. Allelic variation at the ER locus was associated with differences in petal cell proliferation and concomitant effects on petal shape. ER has been previously shown to be required for regulating cell division and expansion in other contexts; the ER receptor-like kinase functioning to also control organ-specific proliferation patterns suggests that allelic variation in common signaling components may nonetheless have been a key factor in morphological diversification.
Affiliation(s)
- Mary C. Abraham
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, United States of America
- Chanatip Metheetrairut
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, United States of America
- Vivian F. Irish
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, United States of America
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
|
546
|
Wang X, Weigel D, Smith LM. Transposon variants and their effects on gene expression in Arabidopsis. PLoS Genet 2013; 9:e1003255. [PMID: 23408902 PMCID: PMC3567156 DOI: 10.1371/journal.pgen.1003255] [Citation(s) in RCA: 107] [Impact Index Per Article: 9.7] [Received: 08/22/2012] [Accepted: 12/03/2012] [Indexed: 02/01/2023] Open
Abstract
Transposable elements (TEs) make up the majority of many plant genomes. Their transcription and transposition is controlled through siRNAs and epigenetic marks including DNA methylation. To dissect the interplay of siRNA–mediated regulation and TE evolution, and to examine how TE differences affect nearby gene expression, we investigated genome-wide differences in TEs, siRNAs, and gene expression among three Arabidopsis thaliana accessions. Both TE sequence polymorphisms and presence of linked TEs are positively correlated with intraspecific variation in gene expression. The expression of genes within 2 kb of conserved TEs is more stable than that of genes next to variant TEs harboring sequence polymorphisms. Polymorphism levels of TEs and closely linked adjacent genes are positively correlated as well. We also investigated the distribution of 24-nt-long siRNAs, which mediate TE repression. TEs targeted by uniquely mapping siRNAs are on average farther from coding genes, apparently because they more strongly suppress expression of adjacent genes. Furthermore, siRNAs, and especially uniquely mapping siRNAs, are enriched in TE regions missing in other accessions. Thus, targeting by uniquely mapping siRNAs appears to promote sequence deletions in TEs. Overall, our work indicates that siRNA–targeting of TEs may influence removal of sequences from the genome and hence evolution of gene expression in plants.
Transposable elements (TEs) are selfish DNA sequences. Together with their immobilized derivatives, they account for a large fraction of eukaryotic genomes. TEs can affect nearby gene activity, either directly by disrupting regulatory sequences or indirectly through the host mechanisms used to prevent TE proliferation. A comparison of Arabidopsis thaliana genomes reveals rapid TE degeneration. We asked what drives TE degeneration and how often TE variation affects nearby gene expression. To answer these questions, we studied the interplay between TEs, DNA sequence variation, and short interfering RNAs (siRNAs) in three A. thaliana strains. We find sequence variation in genes and adjacent TEs to be correlated, from which we conclude either that TEs insert more often near polymorphic genes or that TEs next to polymorphic genes are less efficiently purged from the genome. We also noticed that processes that cause deletions within TEs and ones that silence TEs appear to be linked, because siRNA targeting is a predictor of sequence loss in accessions. Our work provides insight into the contribution of TEs to gene expression plasticity, and it links TE silencing mechanisms to the evolution of TE variation between genomes, thereby linking TE silencing mechanisms to expression plasticity.
Affiliation(s)
- Xi Wang
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- * Corresponding author
- Lisa M. Smith
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
- * Corresponding author
|
547
|
Puerma E, Aguadé M. Polymorphism at genes involved in salt tolerance in Arabidopsis thaliana (Brassicaceae). Am J Bot 2013; 100:384-390. [PMID: 23345415 DOI: 10.3732/ajb.1200332] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/01/2023]
Abstract
PREMISE OF THE STUDY Genes involved in relevant functions for environmental adaptation can be considered primary candidates for their variation having been shaped by natural selection. Detecting recent selective events through their footprint on nucleotide variation constitutes a challenging task in species with a complex demographic history such as Arabidopsis thaliana. We have surveyed nucleotide variation in this species at nine genes involved in salt tolerance. The available genomewide information for this species has allowed us to contrast the levels and patterns of variation detected at the candidate genes with empirical distributions obtained from noncandidate regions. METHODS We sequenced nine genes involved in salt tolerance (~32 kb) in 20 ecotypes of A. thaliana and analyzed polymorphism and divergence at the individual gene and multilocus levels. KEY RESULTS Variation at the nine genes studied was characterized by a generalized skew toward polymorphisms with low-frequency variants. Except for genes RCD1 and NHX8, this pattern was similar to that generally detected in the A. thaliana genome and could thus be primarily explained by the species demographic history. The more extreme deviation at the NHX8 gene and its excess of polymorphism relative to divergence points to the recent action of selection on this gene. CONCLUSIONS The analysis of nucleotide polymorphism and divergence at nine genes involved in salt tolerance provided little evidence for the recent action of positive selection. Only the signals detected at NHX8 from both polymorphism and divergence were suggestive of the putative contribution of this gene to local adaptation.
Affiliation(s)
- Eva Puerma
- Departament de Genètica, Facultat de Biologia, i Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Diagonal 643, 08028 Barcelona, Spain
|
548
|
Balsera M, Uberegui E, Susanti D, Schmitz RA, Mukhopadhyay B, Schürmann P, Buchanan BB. Ferredoxin:thioredoxin reductase (FTR) links the regulation of oxygenic photosynthesis to deeply rooted bacteria. Planta 2013; 237:619-635. [PMID: 23223880 DOI: 10.1007/s00425-012-1803-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 08/22/2012] [Accepted: 10/26/2012] [Indexed: 06/01/2023]
Abstract
Uncovered in studies on photosynthesis 35 years ago, redox regulation has been extended to all types of living cells. We understand a great deal about the occurrence, function, and mechanism of action of this mode of regulation, but we know little about its origin and its evolution. To help fill this gap, we have taken advantage of available genome sequences that make it possible to trace the phylogenetic roots of members of the system that was originally described for chloroplasts: ferredoxin, ferredoxin:thioredoxin reductase (FTR), and thioredoxin, as well as target enzymes. The results suggest that: (1) the catalytic subunit, FTRc, originated in deeply rooted microaerophilic, chemoautotrophic bacteria where it appears to function in regulating CO2 fixation by the reverse citric acid cycle; (2) FTRc was incorporated into oxygenic photosynthetic organisms without significant structural change except for addition of a variable subunit (FTRv) seemingly to protect the Fe-S cluster against oxygen; (3) new Trxs and target enzymes were systematically added as evolution proceeded from bacteria through the different types of oxygenic photosynthetic organisms; (4) an oxygenic type of regulation preceded classical light-dark regulation in the regulation of enzymes of CO2 fixation by the Calvin-Benson cycle; (5) FTR is not universally present in oxygenic photosynthetic organisms, and in certain early representatives is seemingly functionally replaced by NADP-thioredoxin reductase; and (6) FTRc underwent structural diversification to meet the ecological needs of a variety of bacteria and archaea.
Affiliation(s)
- Monica Balsera
- Instituto de Recursos Naturales y Agrobiología de Salamanca, Salamanca, Spain.
|
549
|
Su CL, Chao YT, Yen SH, Chen CY, Chen WC, Chang YCA, Shih MC. Orchidstra: an integrated orchid functional genomics database. Plant Cell Physiol 2013; 54:e11. [PMID: 23324169 PMCID: PMC3583029 DOI: 10.1093/pcp/pct004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/05/2023]
Abstract
A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.
Affiliation(s)
- Chun-lin Su
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- These authors contributed equally to this work
- Ya-Ting Chao
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- These authors contributed equally to this work
- Shao-Hua Yen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- Chun-Yi Chen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- Wan-Chieh Chen
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- Yao-Chien Alex Chang
- Department of Horticulture and Landscape Architecture, National Taiwan University, Taipei 10617, Taiwan
- Ming-Che Shih
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei 11529, Taiwan
- *Corresponding author: Fax, +886-2-26515693
|
550
|
Manavella PA, Koenig D, Rubio-Somoza I, Burbano HA, Becker C, Weigel D. Tissue-specific silencing of Arabidopsis SU(VAR)3-9 HOMOLOG8 by miR171a. Plant Physiol 2013; 161:805-12. [PMID: 23204429 PMCID: PMC3561020 DOI: 10.1104/pp.112.207068] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 09/10/2012] [Accepted: 11/27/2012] [Indexed: 05/02/2023]
Abstract
MicroRNAs (miRNAs) are produced from double-stranded precursors, from which a short duplex is excised. The strand of the duplex that remains more abundant is usually the active form, the miRNA, while steady-state levels of the other strand, the miRNA*, are generally lower. The executive engines of miRNA-directed gene silencing are RNA-induced silencing complexes (RISCs). During RISC maturation, the miRNA/miRNA* duplex associates with the catalytic subunit, an ARGONAUTE (AGO) protein. Subsequently, the guide strand, which directs gene silencing, is retained, while the passenger strand is degraded. Under certain circumstances, the miRNA*s can be retained as guide strands. miR170 and miR171 are prototypical miRNAs in Arabidopsis (Arabidopsis thaliana) with well-defined targets. We found that the corresponding star molecules, the sequence-identical miR170* and miR171a*, have several features of active miRNAs, such as sequence conservation and AGO1 association. We confirmed that active AGO1-miR171a* complexes are common in Arabidopsis and that they trigger silencing of SU(VAR)3-9 HOMOLOG8, a new miR171a* target that was acquired very recently in the Arabidopsis lineage. Our study demonstrates that each miR171a strand can be loaded onto RISC with separate regulatory outcomes.
Affiliation(s)
- Pablo A. Manavella
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
- Daniel Koenig
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
- Ignacio Rubio-Somoza
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
- Hernán A. Burbano
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
- Claude Becker
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
- Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, D–72076 Tuebingen, Germany
|