1
Morales N, Ogbonna AC, Ellerbrock BJ, Bauchet GJ, Tantikanjana T, Tecle IY, Powell AF, Lyon D, Menda N, Simoes CC, Saha S, Hosmani P, Flores M, Panitz N, Preble RS, Agbona A, Rabbi I, Kulakow P, Peteti P, Kawuki R, Esuma W, Kanaabi M, Chelangat DM, Uba E, Olojede A, Onyeka J, Shah T, Karanja M, Egesi C, Tufan H, Paterne A, Asfaw A, Jannink JL, Wolfe M, Birkett CL, Waring DJ, Hershberger JM, Gore MA, Robbins KR, Rife T, Courtney C, Poland J, Arnaud E, Laporte MA, Kulembeka H, Salum K, Mrema E, Brown A, Bayo S, Uwimana B, Akech V, Yencho C, de Boeck B, Campos H, Swennen R, Edwards JD, Mueller LA. Breedbase: a digital ecosystem for modern plant breeding. G3 Genes|Genomes|Genetics 2022; 12:6564228. [PMID: 35385099 PMCID: PMC9258556 DOI: 10.1093/g3journal/jkac078] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/19/2021] [Accepted: 02/14/2022] [Indexed: 01/17/2023]
Abstract
Modern breeding methods integrate next-generation sequencing and phenomics to identify plants with the best characteristics and greatest genetic merit for use as parents in subsequent breeding cycles to ultimately create improved cultivars able to sustain high adoption rates by farmers. This data-driven approach hinges on strong foundations in data management, quality control, and analytics. Of crucial importance is a central database able to (1) track breeding materials, (2) store experimental evaluations, (3) record phenotypic measurements using consistent ontologies, (4) store genotypic information, and (5) implement algorithms for analysis, prediction, and selection decisions. Because of the complexity of the breeding process, breeding databases also tend to be complex, difficult, and expensive to implement and maintain. Here, we present a breeding database system, Breedbase (https://breedbase.org/, last accessed 4/18/2022). Originally initiated as Cassavabase (https://cassavabase.org/, last accessed 4/18/2022) with the NextGen Cassava project (https://www.nextgencassava.org/, last accessed 4/18/2022), and later developed into a crop-agnostic system, it is presently used by dozens of different crops and projects. The system is web based and is available as open source software. It is available on GitHub (https://github.com/solgenomics/, last accessed 4/18/2022) and packaged in a Docker image for deployment (https://hub.docker.com/u/breedbase, last accessed 4/18/2022). The Breedbase system enables breeding programs to better manage and leverage their data for decision making within a fully integrated digital ecosystem.
Affiliation(s)
- Nicolas Morales
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- Cornell University, Ithaca, NY 14853, USA
- Alex C Ogbonna
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- Cornell University, Ithaca, NY 14853, USA
- David Lyon
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- Naama Menda
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- Surya Saha
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- Ezenwanyi Uba
- National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Adeyemi Olojede
- National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Joseph Onyeka
- National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Chiedozie Egesi
- Boyce Thompson Institute, Ithaca, NY 14853, USA
- IITA Ibadan, 200001 Ibadan, Nigeria
- National Root Crops Research Institute (NRCRI), 463109 Umudike, Nigeria
- Hale Tufan
- Cornell University, Ithaca, NY 14853, USA
- Jean-Luc Jannink
- Cornell University, Ithaca, NY 14853, USA
- USDA-ARS, Ithaca, NY 14853, USA
- Clay L Birkett
- Cornell University, Ithaca, NY 14853, USA
- USDA-ARS, Ithaca, NY 14853, USA
- David J Waring
- Cornell University, Ithaca, NY 14853, USA
- USDA-ARS, Ithaca, NY 14853, USA
- Trevor Rife
- Kansas State University, Manhattan, KS 66506, USA
- Jesse Poland
- Kansas State University, Manhattan, KS 66506, USA
- Craig Yencho
- North Carolina State University (NCSU), Raleigh, NC 27695, USA
2
Pan Q, Wei J, Guo F, Huang S, Gong Y, Liu H, Liu J, Li L. Trait ontology analysis based on association mapping studies bridges the gap between crop genomics and Phenomics. BMC Genomics 2019; 20:443. [PMID: 31159731 PMCID: PMC6547493 DOI: 10.1186/s12864-019-5812-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/04/2018] [Accepted: 05/20/2019] [Indexed: 01/08/2023]
Abstract
BACKGROUND: Trait ontology (TO) analysis is a powerful system for functional annotation and enrichment analysis of genes. However, given the complexity of the molecular mechanisms underlying phenomes, only a few hundred gene-to-TO relationships in plants have been elucidated to date, limiting the pace of research in this "big data" era.
RESULTS: Here, we curated all the available trait-associated sites (TAS) information from 79 association mapping studies of maize (Zea mays L.) and rice (Oryza sativa L.) lines with diverse genetic backgrounds and built a large-scale TAS-derived TO system for functional annotation of genes in various crops. Our TO system contains information for up to 18,042 genes (6345 in maize at the 25 k level and 11,697 in rice at the 50 k level), including gene-to-TO relationships, which covers over one fifth of the annotated gene sets for maize and rice. A comparison of Gene Ontology (GO) vs. TO analysis demonstrated that the TAS-derived TO system is an efficient alternative tool for gene functional annotation and enrichment analysis. We therefore combined information from the TO, GO, metabolic pathway, and co-expression network databases and constructed the TAS system, which is publicly available at http://tas.hzau.edu.cn. TAS provides a user-friendly interface for functional annotation of genes, enrichment analysis, genome-wide extraction of trait-associated genes, and crosschecking of different functional annotation databases.
CONCLUSIONS: TAS bridges the gap between genomic and phenomic information in crops. This easy-to-use tool will be useful for geneticists, biologists, and breeders in the agricultural community, as it facilitates the dissection of molecular mechanisms conferring agronomic traits in an easy, genome-wide manner.
Affiliation(s)
- Qingchun Pan
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Junfeng Wei
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Feng Guo
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Suiyong Huang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Yong Gong
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Hao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Jianxiao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Lin Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
3
Salhi A, Negrão S, Essack M, Morton MJL, Bougouffa S, Razali R, Radovanovic A, Marchand B, Kulmanov M, Hoehndorf R, Tester M, Bajic VB. DES-TOMATO: A Knowledge Exploration System Focused On Tomato Species. Sci Rep 2017; 7:5968. [PMID: 28729549 PMCID: PMC5519719 DOI: 10.1038/s41598-017-05448-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/16/2017] [Accepted: 05/25/2017] [Indexed: 12/29/2022]
Abstract
Tomato is the most economically important horticultural crop used as a model to study plant biology and particularly fruit development. Knowledge obtained from tomato research initiated improvements in tomato and, being transferrable to other such economically important crops, has led to a surge of tomato-related research and published literature. We developed the DES-TOMATO knowledgebase (KB) for exploration of information related to tomato. Information exploration is enabled through terms from 26 dictionaries and combinations of these terms. To illustrate the utility of DES-TOMATO, we provide several examples of how one can efficiently use this KB to retrieve known or potentially novel information. DES-TOMATO is free for academic and nonprofit users and can be accessed at http://cbrc.kaust.edu.sa/des_tomato/, using any of the mainstream web browsers, including Firefox, Safari and Chrome.
Affiliation(s)
- Adil Salhi
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Sónia Negrão
- King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Magbubah Essack
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Mitchell J L Morton
- King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Salim Bougouffa
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Rozaimi Razali
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Aleksandar Radovanovic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Maxat Kulmanov
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Robert Hoehndorf
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, 23955-6900, Saudi Arabia
- Mark Tester
- King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, 23955-6900, Saudi Arabia
4
Abstract
Collaborations between the scientific community and members of the Gene Ontology (GO) Consortium have led to an increase in the number and specificity of GO terms, as well as increasing the number of GO annotations. A variety of approaches have been taken to encourage research scientists to contribute to the GO, but the success of these approaches has been variable. This chapter reviews both the successes and failures of engaging the scientific community in GO development and annotation, and provides motivation and advice to encourage individual researchers to contribute to GO.
Affiliation(s)
- Ruth C Lovering
- Functional Gene Annotation Initiative, Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, 5 University Street, London, WC1E 6JF, UK.
5
Tecle IY, Edwards JD, Menda N, Egesi C, Rabbi IY, Kulakow P, Kawuki R, Jannink JL, Mueller LA. solGS: a web-based tool for genomic selection. BMC Bioinformatics 2014; 15:398. [PMID: 25495537 PMCID: PMC4269960 DOI: 10.1186/s12859-014-0398-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 07/12/2014] [Accepted: 11/26/2014] [Indexed: 11/18/2022]
Abstract
Background: Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and statistical complexity, however, is a serious challenge in data management, analysis, and sharing. A bioinformatics infrastructure for data storage and access, and a user-friendly web-based tool for analysis and sharing output, are needed to make GS more practical for breeders.
Results: We have developed a web-based tool, called solGS, for predicting genomic estimated breeding values (GEBVs) of individuals, using a Ridge-Regression Best Linear Unbiased Predictor (RR-BLUP) model. It has an intuitive web-interface for selecting a training population for modeling and estimating genomic estimated breeding values of selection candidates. It estimates phenotypic correlation and heritability of traits and selection indices of individuals. Raw data is stored in a generic database schema, Chado Natural Diversity, co-developed by multiple database groups. Analysis output is graphically visualized and can be interactively explored online or downloaded in text format. An instance of its implementation can be accessed at the NEXTGEN Cassava breeding database, http://cassavabase.org/solgs.
Conclusions: solGS enables breeders to store raw data and estimate GEBVs of individuals online, in an intuitive and interactive workflow. It can be adapted to any breeding program.
Affiliation(s)
- Isaak Y Tecle
- Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, NY, USA.
6
Fernandez-Pozo N, Menda N, Edwards JD, Saha S, Tecle IY, Strickler SR, Bombarely A, Fisher-York T, Pujar A, Foerster H, Yan A, Mueller LA. The Sol Genomics Network (SGN)--from genotype to phenotype to breeding. Nucleic Acids Res 2014; 43:D1036-41. [PMID: 25428362 PMCID: PMC4383978 DOI: 10.1093/nar/gku1195] [Citation(s) in RCA: 396] [Impact Index Per Article: 39.6] [Indexed: 02/06/2023]
Abstract
The Sol Genomics Network (SGN, http://solgenomics.net) is a web portal with genomic and phenotypic data, and analysis tools for the Solanaceae family and close relatives. SGN hosts whole genome data for an increasing number of Solanaceae family members including tomato, potato, pepper, eggplant, tobacco and Nicotiana benthamiana. The database also stores loci and phenotype data, which researchers can upload and edit with user-friendly web interfaces. Tools such as BLAST, GBrowse and JBrowse for browsing genomes, expression and map data viewers, a locus community annotation system and QTL analysis tools are available. A new tool, the SGN VIGS tool, was recently implemented to improve Virus-Induced Gene Silencing (VIGS) constructs. With the growing genomic and phenotypic data in the database, SGN is now advancing to develop new web-based breeding tools and implement the code and database structure for other species or clade-specific databases.
Affiliation(s)
- Naama Menda
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Jeremy D Edwards
- Dale Bumpers National Rice Research Center, Stuttgart, AR 72160, USA
- Surya Saha
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Department of Plant Pathology and Plant-Microbe Biology, Cornell University, Ithaca, NY 14853, USA
- Isaak Y Tecle
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Aureliano Bombarely
- Department of Horticulture, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0002, USA
- Anuradha Pujar
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Hartmut Foerster
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Aimin Yan
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Lukas A Mueller
- Boyce Thompson Institute for Plant Research, Ithaca, NY 14853, USA
- Department of Plant Breeding, Cornell University, Ithaca, NY 14853, USA
7
Pujar A, Menda N, Bombarely A, Edwards JD, Strickler SR, Mueller LA. From manual curation to visualization of gene families and networks across Solanaceae plant species. Database: The Journal of Biological Databases and Curation 2013; 2013:bat028. [PMID: 23681907 PMCID: PMC3655285 DOI: 10.1093/database/bat028] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
Abstract
High-quality manual annotation methods and practices need to be scaled to the increased rate of genomic data production. Curation based on gene families and gene networks is one approach that can significantly increase both curation efficiency and quality. The Sol Genomics Network (SGN; http://solgenomics.net) is a comparative genomics platform, with genetic, genomic and phenotypic information of the Solanaceae family and its closely related species that incorporates a community-based gene and phenotype curation system. In this article, we describe a manual curation system for gene families aimed at facilitating curation, querying and visualization of gene interaction patterns underlying complex biological processes, including an interface for efficiently capturing information from experiments with large data sets reported in the literature. Well-annotated multigene families are useful for further exploration of genome organization and gene evolution across species. As an example, we illustrate the system with the multigene transcription factor families, WRKY and Small Auxin Up-regulated RNA (SAUR), which both play important roles in responding to abiotic stresses in plants. Database URL: http://solgenomics.net/
Affiliation(s)
- Anuradha Pujar
- Boyce Thompson Institute for Plant Research, 533, Tower Road, Ithaca, NY 14853, USA
8
Uitdewilligen JGAML, Wolters AMA, D’hoop BB, Borm TJA, Visser RGF, van Eck HJ. A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS One 2013; 8:e62355. [PMID: 23667470 PMCID: PMC3648547 DOI: 10.1371/journal.pone.0062355] [Citation(s) in RCA: 238] [Impact Index Per Article: 21.6] [Received: 11/27/2012] [Accepted: 03/20/2013] [Indexed: 11/23/2022]
Abstract
Assessment of genomic DNA sequence variation and genotype calling in autotetraploids implies the ability to distinguish among five possible alternative allele copy number states. This study demonstrates the accuracy of genotyping-by-sequencing (GBS) of a large collection of autotetraploid potato cultivars using next-generation sequencing. It is still costly to reach sufficient read depths on a genome wide scale, across the cultivated gene pool. Therefore, we enriched cultivar-specific DNA sequencing libraries using an in-solution hybridisation method (SureSelect). This complexity reduction allowed us to confine our study to 807 target genes distributed across the genomes of 83 tetraploid cultivars and one reference (DM 1–3 511). Indexed sequencing libraries were paired-end sequenced in 7 pools of 12 samples using Illumina HiSeq2000. After filtering and processing the raw sequence data, 12.4 Gigabases of high-quality sequence data was obtained, which mapped to 2.1 Mb of the potato reference genome, with a median average read depth of 63× per cultivar. We detected 129,156 sequence variants and genotyped the allele copy number of each variant for every cultivar. In this cultivar panel a variant density of 1 SNP/24 bp in exons and 1 SNP/15 bp in introns was obtained. The average minor allele frequency (MAF) of a variant was 0.14. Potato germplasm displayed a large number of relatively rare variants and/or haplotypes, with 61% of the variants having a MAF below 0.05. A very high average nucleotide diversity (π = 0.0107) was observed. Nucleotide diversity varied among potato chromosomes. Several genes under selection were identified. Genotyping-by-sequencing results, with allele copy number estimates, were validated with a KASP genotyping assay. This validation showed that read depths of ∼60–80× can be used as a lower boundary for reliable assessment of allele copy number of sequence variants in autotetraploids. Genotypic data were associated with traits, and alleles strongly influencing maturity and flesh colour were identified.
Affiliation(s)
- Jan G. A. M. L. Uitdewilligen
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Anne-Marie A. Wolters
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Bjorn B. D’hoop
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- Theo J. A. Borm
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Richard G. F. Visser
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Centre for BioSystems Genomics, Wageningen, The Netherlands
- Herman J. van Eck
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Centre for BioSystems Genomics, Wageningen, The Netherlands
9
Kour A, Greer K, Valent B, Orbach MJ, Soderlund C. MGOS: development of a community annotation database for Magnaporthe oryzae. Molecular Plant-Microbe Interactions: MPMI 2012; 25:271-278. [PMID: 22074346 DOI: 10.1094/mpmi-07-11-0183] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
Magnaporthe oryzae causes rice blast disease, which is the most serious disease of cultivated rice worldwide. We previously developed the Magnaporthe grisea-Oryza sativa (MGOS) database as a repository for the M. oryzae and rice genome sequences together with a comprehensive set of functional interaction data generated by a major consortium of U.S. researchers. The MGOS database has now undergone a major redesign to include data from the international blast research community, accessible with a new intuitive, easy-to-use interface. Registered database users can manually annotate gene sequences and features as well as add mutant data and literature on individual gene pages. Over 900 genes have been manually curated based on various biological databases and the scientific literature. Gene names and descriptions, gene ontology annotations, published and unpublished information on mutants and their phenotypes, responses in diverse microarray analyses, and related literature have been incorporated. Thus far, 362 M. oryzae genes have associated information on mutants. MGOS is now poised to become a one-stop repository for all structural and functional data available on all genes of this critically important rice pathogen.
Affiliation(s)
- Anupreet Kour
- School of Plant Sciences, Division of Plant Pathology and Microbiology, The University of Arizona, Tucson 85721, USA
10
Kong L, Wang J, Zhao S, Gu X, Luo J, Gao G. ABrowse--a customizable next-generation genome browser framework. BMC Bioinformatics 2012; 13:2. [PMID: 22222089 PMCID: PMC3265404 DOI: 10.1186/1471-2105-13-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 08/10/2011] [Accepted: 01/05/2012] [Indexed: 11/14/2022]
Abstract
Background: With the rapid growth of genome sequencing projects, the genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects.
Results: Based on next-generation web technologies, we have developed a general-purpose genome browser framework, ABrowse, which provides an interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users a highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers can specify SQL statements according to the database schema, and customized pages for detailed information display of annotation entries can be easily plugged in. For developers, new drawing strategies can be integrated into ABrowse for new types of annotation data. In addition, a standard web service is provided for remote data retrieval, providing an underlying machine-oriented programming interface for open data access.
Conclusions: The ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for the Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.
Affiliation(s)
- Lei Kong
- College of Life Sciences, State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, Peking University, Beijing, 100871, P.R. China
11
Jung S, Menda N, Redmond S, Buels RM, Friesen M, Bendana Y, Sanderson LA, Lapp H, Lee T, MacCallum B, Bett KE, Cain S, Clements D, Mueller LA, Main D. The Chado Natural Diversity module: a new generic database schema for large-scale phenotyping and genotyping data. Database: The Journal of Biological Databases and Curation 2011; 2011:bar051. [PMID: 22120662 PMCID: PMC3225077 DOI: 10.1093/database/bar051] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]
Abstract
Linking phenotypic with genotypic diversity has become a major requirement for basic and applied genome-centric biological research. To meet this need, a comprehensive database backend for efficiently storing, querying and analyzing large experimental data sets is necessary. Chado, a generic, modular, community-based database schema, is widely used in the biological community to store information associated with genome sequence data. To meet the need to also accommodate large-scale phenotyping and genotyping projects, a new Chado module called Natural Diversity has been developed. The module strictly adheres to the Chado remit of being generic and ontology driven. The flexibility of the new module is demonstrated in its capacity to store any type of experiment that either uses or generates specimens or stock organisms. Experiments may be grouped or structured hierarchically, whereas any kind of biological entity can be stored as the observed unit, from a specimen to be used in genotyping or phenotyping experiments, to a group of species collected in the field that will undergo further lab analysis. We describe details of the Natural Diversity module, including the design approach, the relational schema and use cases implemented in several databases.
Affiliation(s)
- Sook Jung
- Department of Horticulture and Landscape, Washington State University, Pullman, WA 99164, USA.
12
Harnsomburana J, Green JM, Barb AS, Schaeffer M, Vincent L, Shyu CR. Computable visually observed phenotype ontological framework for plants. BMC Bioinformatics 2011; 12:260. [PMID: 21702966 PMCID: PMC3149582 DOI: 10.1186/1471-2105-12-260] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/29/2010] [Accepted: 06/24/2011] [Indexed: 01/31/2023]
Abstract
Background The ability to search for and precisely compare similar phenotypic appearances within and across species has vast potential in plant science and genetic research. The difficulty in doing so lies in the fact that many visual phenotypic data, especially visually observed phenotypes that often cannot be directly measured quantitatively, are in the form of text annotations, and these descriptions are plagued by semantic ambiguity, heterogeneity, and low granularity. Though several bio-ontologies have been developed to standardize phenotypic (and genotypic) information and permit comparisons across species, these semantic issues persist and prevent precise analysis and retrieval of information. A framework suitable for the modeling and analysis of precise computable representations of such phenotypic appearances is needed. Results We have developed a new framework called the Computable Visually Observed Phenotype Ontological Framework for plants. This work provides a novel quantitative view of descriptions of plant phenotypes that leverages existing bio-ontologies and utilizes a computational approach to capture and represent domain knowledge in a machine-interpretable form. This is accomplished by means of a robust and accurate semantic mapping module that automatically maps high-level semantics to low-level measurements computed from phenotype imagery. The framework was applied to two different plant species with semantic rules mined and an ontology constructed. Rule quality was evaluated and showed high quality rules for most semantics. This framework also facilitates automatic annotation of phenotype images and can be adopted by different plant communities to aid in their research. Conclusions The Computable Visually Observed Phenotype Ontological Framework for plants has been developed for more efficient and accurate management of visually observed phenotypes, which play a significant role in plant genomics research. The uniqueness of this framework is its ability to bridge the knowledge of informaticians and plant science researchers by translating descriptions of visually observed phenotypes into standardized, machine-understandable representations, thus enabling the development of advanced information retrieval and phenotype annotation analysis tools for the plant science community.
Affiliation(s)
- Jaturon Harnsomburana
- Department of Computer Science, University of Missouri, 201 EBW, Columbia, MO 65211, USA
|
13
|
Defoin-Platel M, Hassani-Pak K, Rawlings C. Gaining confidence in cross-species annotation transfer: from simple molecular function to complex phenotypic traits. ASPECTS OF APPLIED BIOLOGY 2011; 107:79-87. [PMID: 22319070 PMCID: PMC3272443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Cross-species annotation transfer is a widely used approach for transferring information about simple molecular functions or pathways from one protein in one species to its ortholog in another species. In crop species, the phenotypic traits of interest, such as grain yield, are very complex and are often related to multiple biological processes and systems. It is still unclear to what extent the high level annotations describing phenotypic traits can also be reliably transferred across species. In this work, we have developed a procedure to measure precisely the transferability of these functional annotations from one species to another and demonstrate its application to Arabidopsis and several crop species. This comparative analysis is a step towards assigning higher level biological function to genes and gene networks as part of the wider genotype to phenotype challenge.
|
14
|
Bazzini AA, Asís R, González V, Bassi S, Conte M, Soria M, Fernie AR, Asurmendi S, Carrari F. miSolRNA: A tomato micro RNA relational database. BMC PLANT BIOLOGY 2010; 10:240. [PMID: 21059227 PMCID: PMC3095322 DOI: 10.1186/1471-2229-10-240] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 08/09/2010] [Accepted: 11/08/2010] [Indexed: 05/22/2023]
Abstract
BACKGROUND The economic importance of Solanaceae plant species is well documented and tomato has become a model for functional genomics studies. In plants, important processes are regulated by microRNAs (miRNA). DESCRIPTION We describe here a database integrating genetic map positions of miRNA-targeted genes, their expression profiles and their relations with quantitative fruit metabolic loci and yield associated traits. miSolRNA provides a metadata source to facilitate the construction of hypotheses aimed at defining physiological modes of action of regulatory processes underlying the metabolism of the tomato fruit. CONCLUSIONS The miSolRNA database allows the simple extraction of metadata for the proposal of new hypotheses concerning possible roles of miRNAs in the regulation of tomato fruit metabolism. It allows users i) to map miRNAs and their predicted target sites both on expressed (SGN-UNIGENES) and newly annotated sequences (BAC sequences released), ii) to co-locate any predicted miRNA-target interaction with metabolic QTL found in tomato fruits, iii) to retrieve expression data of target genes in tomato fruit along their developmental period and iv) to design further experiments for unresolved questions in complex trait biology based on the use of genetic materials that have proven to be useful tools for map-based cloning experiments in Solanaceae plant species.
Affiliation(s)
- Ariel A Bazzini
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (IB-INTA) (Partner group of Institution 5), P.O. BOX 25, B1712WAA Castelar, Argentina
- Ramón Asís
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (IB-INTA) (Partner group of Institution 5), P.O. BOX 25, B1712WAA Castelar, Argentina
- CIBICI, Facultad de Ciencias Químicas Universidad Nacional de Córdoba, CC 5000, Haya de la Torre y Medina Allende, Córdoba, Argentina
- Mariana Conte
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (IB-INTA) (Partner group of Institution 5), P.O. BOX 25, B1712WAA Castelar, Argentina
- Marcelo Soria
- Facultad de Agronomía. Universidad de Buenos Aires, Buenos Aires, Argentina
- Alisdair R Fernie
- Max Planck Institute for Molecular Plant Physiology, Wissenschaftspark Golm, Am Mühlenberg 1, Potsdam-Golm, D-14476, Germany
- Sebastián Asurmendi
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (IB-INTA) (Partner group of Institution 5), P.O. BOX 25, B1712WAA Castelar, Argentina
- Fernando Carrari
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (IB-INTA) (Partner group of Institution 5), P.O. BOX 25, B1712WAA Castelar, Argentina
|
15
|
Tecle IY, Menda N, Buels RM, van der Knaap E, Mueller LA. solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database. BMC Bioinformatics 2010; 11:525. [PMID: 20964836 PMCID: PMC2984588 DOI: 10.1186/1471-2105-11-525] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 07/21/2010] [Accepted: 10/21/2010] [Indexed: 11/23/2022]
Abstract
Background A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases. Description The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, http://solgenomics.net/qtl/, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application. Conclusions solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
Affiliation(s)
- Isaak Y Tecle
- Boyce Thompson Institute for Plant Research, Tower Rd, Ithaca, NY 14853, USA
|
16
|
Bombarely A, Menda N, Tecle IY, Buels RM, Strickler S, Fischer-York T, Pujar A, Leto J, Gosselin J, Mueller LA. The Sol Genomics Network (solgenomics.net): growing tomatoes using Perl. Nucleic Acids Res 2010; 39:D1149-55. [PMID: 20935049 PMCID: PMC3013765 DOI: 10.1093/nar/gkq866] [Citation(s) in RCA: 174] [Impact Index Per Article: 12.4] [Indexed: 12/04/2022]
Abstract
The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub (http://github.com). The database architecture combines SGN-specific schemas and the community-developed Chado schema (http://gmod.org/wiki/Chado) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/.
Affiliation(s)
- Aureliano Bombarely
- Boyce Thompson Institute for Plant Research, Tower Road, Ithaca, NY 14853, USA
|
17
|
Stehr H, Duarte JM, Lappe M, Bhak J, Bolser DM. PDBWiki: added value through community annotation of the Protein Data Bank. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2010; 2010:baq009. [PMID: 20624717 PMCID: PMC2911844 DOI: 10.1093/database/baq009] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022]
Abstract
The success of community projects such as Wikipedia has recently prompted a discussion about the applicability of such tools in the life sciences. Currently, there are several such ‘science-wikis’ that aim to collect specialist knowledge from the community into centralized resources. However, there is no consensus about how to achieve this goal. For example, it is not clear how to best integrate data from established, centralized databases with that provided by ‘community annotation’. We created PDBWiki, a scientific wiki for the community annotation of protein structures. The wiki consists of one structured page for each entry in the Protein Data Bank (PDB) and allows the user to attach categorized comments to the entries. Additionally, each page includes a user editable list of cross-references to external resources. As in a database, it is possible to produce tabular reports and ‘structure galleries’ based on user-defined queries or lists of entries. PDBWiki runs in parallel to the PDB, separating original database content from user annotations. PDBWiki demonstrates how collaboration features can be integrated with primary data from a biological database. It can be used as a system for better understanding how to capture community knowledge in the biological sciences. For users of the PDB, PDBWiki provides a bug-tracker, discussion forum and community annotation system. To date, user participation has been modest, but is increasing. The user editable cross-references section has proven popular, with the number of linked resources more than doubling from 17 originally to 39 today. Database URL: http://www.pdbwiki.org
Affiliation(s)
- Henning Stehr
- Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin, Germany.
|
18
|
Hirschman J, Berardini TZ, Drabkin HJ, Howe D. A MOD(ern) perspective on literature curation. Mol Genet Genomics 2010; 283:415-25. [PMID: 20221640 PMCID: PMC2854346 DOI: 10.1007/s00438-010-0525-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 11/04/2009] [Accepted: 02/06/2010] [Indexed: 12/13/2022]
Abstract
Curation of biological data is a multi-faceted task whose goal is to create a structured, comprehensive, integrated, and accurate resource of current biological knowledge. These structured data facilitate the work of the scientific community by providing knowledge about genes or genomes and by generating validated connections between the data that yield new information and stimulate new research approaches. For the model organism databases (MODs), an important source of data is research publications. Every published paper containing experimental information about a particular model organism is a candidate for curation. All such papers are examined carefully by curators for relevant information. Here, four curators from different MODs describe the literature curation process and highlight approaches taken by the four MODs to address: (1) the decision process by which papers are selected, and (2) the identification and prioritization of the data contained in the paper. We will highlight some of the challenges that MOD biocurators face, and point to ways in which researchers and publishers can support the work of biocurators and the value of such support.
Affiliation(s)
- Jodi Hirschman
- Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA.
|
19
|
Shrestha R, Arnaud E, Mauleon R, Senger M, Davenport GF, Hancock D, Morrison N, Bruskiewich R, McLaren G. Multifunctional crop trait ontology for breeders' data: field book, annotation, data discovery and semantic enrichment of the literature. AOB PLANTS 2010; 2010:plq008. [PMID: 22476066 PMCID: PMC3000699 DOI: 10.1093/aobpla/plq008] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 02/26/2010] [Revised: 04/19/2010] [Accepted: 05/21/2010] [Indexed: 05/21/2023]
Abstract
BACKGROUND AND AIMS Agricultural crop databases maintained in gene banks of the Consultative Group on International Agricultural Research (CGIAR) are valuable sources of information for breeders. These databases provide comparative phenotypic and genotypic information that can help elucidate functional aspects of plant and agricultural biology. To facilitate data sharing within and between these databases and the retrieval of information, the crop ontology (CO) database was designed to provide controlled vocabulary sets for several economically important plant species. METHODOLOGY Existing public ontologies and equivalent catalogues of concepts covering the range of crop science information and descriptors for crops and crop-related traits were collected from breeders, physiologists, agronomists, and researchers in the CGIAR consortium. For each crop, relationships between terms were identified and crop-specific trait ontologies were constructed following the Open Biomedical Ontologies (OBO) format standard using the OBO-Edit tool. All terms within an ontology were assigned a globally unique CO term identifier. PRINCIPAL RESULTS The CO currently comprises crop-specific traits for chickpea (Cicer arietinum), maize (Zea mays), potato (Solanum tuberosum), rice (Oryza sativa), sorghum (Sorghum spp.) and wheat (Triticum spp.). Several plant-structure and anatomy-related terms for banana (Musa spp.), wheat and maize are also included. In addition, multi-crop passport terms are included as controlled vocabularies for sharing information on germplasm. Two web-based online resources were built to make these COs available to the scientific community: the 'CO Lookup Service' for browsing the CO; and the 'Crops Terminizer', an ontology text mark-up tool. CONCLUSIONS The controlled vocabularies of the CO are being used to curate several CGIAR centres' agronomic databases. The use of ontology terms to describe agronomic phenotypes and the accurate mapping of these descriptions into databases will be important steps in comparative phenotypic and genotypic studies across species and gene-discovery experiments.
Affiliation(s)
- Rosemary Shrestha
- IRRI-CIMMYT Crop Research Informatics Laboratory (CRIL), Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT), Apdo. Postal 6-641, 06600 Mexico, D.F., Mexico
- Corresponding author
- Elizabeth Arnaud
- Bioversity International, via dei Tre Denari, 472/a, 00057 Maccarese, Rome, Italy
- Corresponding author
- Ramil Mauleon
- IRRI-CIMMYT Crop Research Informatics Laboratory (CRIL), International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines
- Martin Senger
- IRRI-CIMMYT Crop Research Informatics Laboratory (CRIL), International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines
- Guy F. Davenport
- IRRI-CIMMYT Crop Research Informatics Laboratory (CRIL), Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT), Apdo. Postal 6-641, 06600 Mexico, D.F., Mexico
- David Hancock
- Department of Computer Science, University of Manchester, Oxford Road, Manchester, UK
- Norman Morrison
- Department of Computer Science, University of Manchester, Oxford Road, Manchester, UK
- Richard Bruskiewich
- IRRI-CIMMYT Crop Research Informatics Laboratory (CRIL), International Rice Research Institute (IRRI), DAPO Box 7777, Metro Manila, Philippines
- Graham McLaren
- Generation Challenge Programme (GCP), c/o Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT), Apdo. Postal 6-641, 06600 Mexico, D.F., Mexico
|
20
|
Mazourek M, Pujar A, Borovsky Y, Paran I, Mueller L, Jahn MM. A dynamic interface for capsaicinoid systems biology. PLANT PHYSIOLOGY 2009; 150:1806-21. [PMID: 19553373 PMCID: PMC2719146 DOI: 10.1104/pp.109.136549] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.0] [Received: 02/02/2009] [Accepted: 06/13/2009] [Indexed: 05/19/2023]
Abstract
Capsaicinoids are the pungent alkaloids that give hot peppers (Capsicum spp.) their spiciness. While capsaicinoids are relatively simple molecules, much is unknown about their biosynthesis, which spans diverse metabolisms of essential amino acids, phenylpropanoids, benzenoids, and fatty acids. Pepper is not a model organism, but it has access to the resources developed in model plants through comparative approaches. To aid research in this system, we have implemented a comprehensive model of capsaicinoid biosynthesis and made it publicly available within the SolCyc database at the SOL Genomics Network (http://www.sgn.cornell.edu). As a preliminary test of this model, and to build its value as a resource, targeted transcripts were cloned as candidates for nearly all of the structural genes for capsaicinoid biosynthesis. In support of the role of these transcripts in capsaicinoid biosynthesis beyond correct spatial and temporal expression, their predicted subcellular localizations were compared against the biosynthetic model and experimentally determined compartmentalization in Arabidopsis (Arabidopsis thaliana). To enable their use in a positional candidate gene approach in the Solanaceae, these genes were genetically mapped in pepper. These data were integrated into the SOL Genomics Network, a clade-oriented database that incorporates community annotation of genes, enzymes, phenotypes, mutants, and genomic loci. Here, we describe the creation and integration of these resources as a holistic and dynamic model of the characteristic specialized metabolism of pepper.
Affiliation(s)
- Michael Mazourek
- Department of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA.
|
21
|
Kuromori T, Takahashi S, Kondou Y, Shinozaki K, Matsui M. Phenome analysis in plant species using loss-of-function and gain-of-function mutants. PLANT & CELL PHYSIOLOGY 2009; 50:1215-31. [PMID: 19502383 PMCID: PMC2709550 DOI: 10.1093/pcp/pcp078] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 03/31/2009] [Accepted: 05/29/2009] [Indexed: 05/20/2023]
Abstract
Analysis of genetic mutations is one of the most effective ways to investigate gene function. We now have methods that allow for mass production of mutant lines and cells in a variety of model species. Recently, large numbers of mutant lines have been generated by both 'loss-of-function' and 'gain-of-function' techniques. In parallel, phenotypic information covering various mutant resources has been acquired and released in web-based databases. As a result, significant progress in comprehensive phenotype analysis is being made through the use of these tools. Arabidopsis and rice are two major model plant species in which genome sequencing projects have been completed. Arabidopsis is the most widely used experimental plant, with a large number of mutant resources and several examples of systematic phenotype analysis. Rice is a major crop species and is used as a model plant, with an increasing number of mutant resources. Other plant species are also being employed in functional genetics research. In this review, the present status of mutant resources for large-scale studies of gene function in plant research and the current perspective on using loss-of-function and gain-of-function mutants in phenome research will be discussed.
Affiliation(s)
- Takashi Kuromori
- Gene Discovery Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, 230-0045 Japan
- Shinya Takahashi
- Plant Functional Genomics Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, 230-0045 Japan
- Department of Applied Biological Science, Faculty of Science and Technology, Tokyo University of Science, Noda, Chiba, 278-8510 Japan
- Youichi Kondou
- Plant Functional Genomics Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, 230-0045 Japan
- Kazuo Shinozaki
- Gene Discovery Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, 230-0045 Japan
- Minami Matsui
- Plant Functional Genomics Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, 230-0045 Japan
- *Corresponding author: Fax, +81-45-503-9584
|