1. Eloe-Fadrosh EA, Mungall CJ, Miller MA, Smith M, Patil SS, Kelliher JM, Johnson LYD, Rodriguez FE, Chain PSG, Hu B, Thornton MB, McCue LA, McHardy AC, Harris NL, Reddy TBK, Mukherjee S, Hunter CI, Walls R, Schriml LM. A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics. Methods Mol Biol 2024; 2802:587-609. [PMID: 38819573 DOI: 10.1007/978-1-0716-3838-5_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.
Affiliation(s)
- Emiley A Eloe-Fadrosh
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christopher J Mungall
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Mark Andrew Miller
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Montana Smith
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Sujay Sanjeev Patil
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Julia M Kelliher
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Leah Y D Johnson
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Patrick S G Chain
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Bin Hu
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Michael B Thornton
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Lee Ann McCue
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Alice Carolyn McHardy
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Nomi L Harris
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- T B K Reddy
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Supratim Mukherjee
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christopher I Hunter
  - GigaScience Press, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong
- Lynn M Schriml
  - University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA

2. Deng CH, Naithani S, Kumari S, Cobo-Simón I, Quezada-Rodríguez EH, Skrabisova M, Gladman N, Correll MJ, Sikiru AB, Afuwape OO, Marrano A, Rebollo I, Zhang W, Jung S. Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences. Database (Oxford) 2023; 2023:baad088. [PMID: 38079567 PMCID: PMC10712715 DOI: 10.1093/database/baad088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 10/17/2023] [Accepted: 11/28/2023] [Indexed: 12/18/2023]
Abstract
Large-scale genotype and phenotype data have been increasingly generated to identify genetic markers, understand gene function and evolution and facilitate genomic selection. These datasets hold immense value for both current and future studies, as they are vital for crop breeding, yield improvement and overall agricultural sustainability. However, integrating these datasets from heterogeneous sources presents significant challenges and hinders their effective utilization. We established the Genotype-Phenotype Working Group in November 2021 as a part of the AgBioData Consortium (https://www.agbiodata.org) to review current data types and resources that support archiving, analysis and visualization of genotype and phenotype data to understand the needs and challenges of the plant genomic research community. For 2021-22, we identified different types of datasets and examined metadata annotations related to experimental design/methods/sample collection, etc. Furthermore, we thoroughly reviewed publicly funded repositories for raw and processed data as well as secondary databases and knowledgebases that enable the integration of heterogeneous data in the context of the genome browser, pathway networks and tissue-specific gene expression. Based on our survey, we identify a need for (i) additional infrastructural support for archiving many new data types, (ii) development of community standards for data annotation and formatting, (iii) resources for biocuration and (iv) analysis and visualization tools to connect genotype data with phenotype data to enhance knowledge synthesis and to foster translational research. Although this paper only covers the data and resources relevant to the plant research community, we expect that similar issues and needs are shared by researchers working on animals. Database URL: https://www.agbiodata.org.
Affiliation(s)
- Cecilia H Deng
  - Molecular and Digital Breeding, New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, 120 Mt Albert Road, Auckland 1025, New Zealand
- Sushma Naithani
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Sunita Kumari
  - Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, New York, NY 11724, USA
- Irene Cobo-Simón
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
  - Institute of Forest Science (ICIFOR-INIA, CSIC), Madrid, Spain
- Elsa H Quezada-Rodríguez
  - Departamento de Producción Agrícola y Animal, Universidad Autónoma Metropolitana-Xochimilco, Ciudad de México, México
  - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Ciudad de México, México
- Maria Skrabisova
  - Department of Biochemistry, Faculty of Science, Palacky University, Olomouc, Czech Republic
- Nick Gladman
  - Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, New York, NY 11724, USA
  - U.S. Department of Agriculture-Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY 14853, USA
- Melanie J Correll
  - Agricultural and Biological Engineering Department, University of Florida, 1741 Museum Rd, Gainesville, FL 32611, USA
- Annarita Marrano
  - Phoenix Bioinformatics, 39899 Balentine Drive, Suite 200, Newark, CA 94560, USA
- Wentao Zhang
  - National Research Council Canada, 110 Gymnasium Pl, Saskatoon, Saskatchewan S7N 0W9, Canada
- Sook Jung
  - Department of Horticulture, Washington State University, 303c Plant Sciences Building, Pullman, WA 99164-6414, USA

3. Naithani S, Deng CH, Sahu SK, Jaiswal P. Exploring Pan-Genomes: An Overview of Resources and Tools for Unraveling Structure, Function, and Evolution of Crop Genes and Genomes. Biomolecules 2023; 13:1403. [PMID: 37759803 PMCID: PMC10527062 DOI: 10.3390/biom13091403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 08/29/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023]
Abstract
The availability of multiple sequenced genomes from a single species made it possible to explore intra- and inter-specific genomic comparisons at higher resolution and build clade-specific pan-genomes of several crops. The pan-genomes of crops constructed from various cultivars, accessions, landraces, and wild ancestral species represent a compendium of genes and structural variations and allow researchers to search for novel genes and alleles that were inadvertently lost in domesticated crops during the historical process of crop domestication or in the process of extensive plant breeding. Fortunately, many valuable genes and alleles associated with desirable traits like disease resistance, abiotic stress tolerance, plant architecture, and nutrition qualities exist in landraces, ancestral species, and crop wild relatives. The novel genes from the wild ancestors and landraces can be introduced back to high-yielding varieties of modern crops by implementing classical plant breeding, genomic selection, and transgenic/gene editing approaches. Thus, pan-genomics represents a great leap in plant research and offers new avenues for targeted breeding to mitigate the impact of global climate change. Here, we summarize the tools used for pan-genome assembly and annotations, web-portals hosting plant pan-genomes, etc. Furthermore, we highlight a few discoveries made in crops using the pan-genomic approach and the future potential of this emerging field of study.
Affiliation(s)
- Sushma Naithani
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Cecilia H. Deng
  - Molecular & Digital Breeding Group, New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, Private Bag 92169, Auckland 1142, New Zealand
- Sunil Kumar Sahu
  - State Key Laboratory of Agricultural Genomics, Key Laboratory of Genomics, Ministry of Agriculture, BGI Research, Shenzhen 518083, China
- Pankaj Jaiswal
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA

4. Karabulut E, Erkoç K, Acı M, Aydın M, Barriball S, Braley J, Cassetta E, Craine EB, Diaz-Garcia L, Hershberger J, Meyering B, Miller AJ, Rubin MJ, Tesdell O, Schlautman B, Şakiroğlu M. Sainfoin (Onobrychis spp.) crop ontology: supporting germplasm characterization and international research collaborations. Front Plant Sci 2023; 14:1177406. [PMID: 37255566 PMCID: PMC10225502 DOI: 10.3389/fpls.2023.1177406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 04/18/2023] [Indexed: 06/01/2023]
Abstract
Sainfoin (Onobrychis spp.) is a perennial forage legume that is also attracting attention as a perennial pulse with potential for human consumption. The dual use of sainfoin underpins diverse research and breeding programs focused on improving sainfoin lines for forage and pulses, which is driving the generation of complex datasets describing high dimensional phenotypes in the post-omics era. To ensure that multiple user groups, for example, breeders selecting for forage and those selecting for edible seed, can utilize these rich datasets, it is necessary to develop common ontologies and accessible ontology platforms. One such platform, Crop Ontology, was created in 2008 by the Consortium of International Agricultural Research Centers (CGIAR) to host crop-specific trait ontologies that support standardized plant breeding databases. In the present study, we describe the sainfoin crop ontology (CO). An in-depth literature review was performed to develop a comprehensive list of traits measured and reported in sainfoin. Because the same traits can be measured in different ways, ultimately, a set of 98 variables (variable = plant trait + method of measurement + scale of measurement) used to describe variation in sainfoin were identified. Variables were formatted and standardized based on guidelines provided here for inclusion in the sainfoin CO. The 98 variables contained a total of 82 traits from four trait classes of which 24 were agronomic, 31 were morphological, 19 were seed and forage quality related, and 8 were phenological. In addition to the developed variables, we have provided a roadmap for developing and submitting new traits to the sainfoin CO.
Affiliation(s)
- Ebrar Karabulut
  - Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
- Kübra Erkoç
  - Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
- Murat Acı
  - Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye
  - The Land Institute, Salina, KS, United States
- Mahmut Aydın
  - Department of Computer Engineering, Kafkas University, Kars, Türkiye
- Jackson Braley
  - Donald Danforth Plant Science Center, St. Louis, MO, United States
- Luis Diaz-Garcia
  - Department of Viticulture and Enology, University of California Davis, Davis, CA, United States
- Jenna Hershberger
  - Plant and Environmental Sciences Department, Clemson University, Clemson, SC, United States
- Bo Meyering
  - The Land Institute, Salina, KS, United States
- Allison J. Miller
  - Donald Danforth Plant Science Center, St. Louis, MO, United States
  - Department of Biology, Saint Louis University, St. Louis, MO, United States
- Matthew J. Rubin
  - Donald Danforth Plant Science Center, St. Louis, MO, United States
- Omar Tesdell
  - Department of Geography, Birzeit University, Birzeit, West Bank, Palestine
- Muhammet Şakiroğlu
  - Bioengineering Department, Adana Alparslan Türkeş Science and Technology University, Adana, Türkiye

5. Aleksander SA, Balhoff J, Carbon S, Cherry JM, Drabkin HJ, Ebert D, Feuermann M, Gaudet P, Harris NL, Hill DP, Lee R, Mi H, Moxon S, Mungall CJ, Muruganugan A, Mushayahama T, Sternberg PW, Thomas PD, Van Auken K, Ramsey J, Siegele DA, Chisholm RL, Fey P, Aspromonte MC, Nugnes MV, Quaglia F, Tosatto S, Giglio M, Nadendla S, Antonazzo G, Attrill H, Dos Santos G, Marygold S, Strelets V, Tabone CJ, Thurmond J, Zhou P, Ahmed SH, Asanitthong P, Luna Buitrago D, Erdol MN, Gage MC, Ali Kadhum M, Li KYC, Long M, Michalak A, Pesala A, Pritazahra A, Saverimuttu SCC, Su R, Thurlow KE, Lovering RC, Logie C, Oliferenko S, Blake J, Christie K, Corbani L, Dolan ME, Drabkin HJ, Hill DP, Ni L, Sitnikov D, Smith C, Cuzick A, Seager J, Cooper L, Elser J, Jaiswal P, Gupta P, Jaiswal P, Naithani S, Lera-Ramirez M, Rutherford K, Wood V, De Pons JL, Dwinell MR, Hayman GT, Kaldunski ML, Kwitek AE, Laulederkind SJF, Tutaj MA, Vedi M, Wang SJ, D'Eustachio P, Aimo L, Axelsen K, Bridge A, Hyka-Nouspikel N, Morgat A, Aleksander SA, Cherry JM, Engel SR, Karra K, Miyasato SR, Nash RS, Skrzypek MS, Weng S, Wong ED, Bakker E, Berardini TZ, Reiser L, Auchincloss A, Axelsen K, Argoud-Puy G, Blatter MC, Boutet E, Breuza L, Bridge A, Casals-Casas C, Coudert E, Estreicher A, Livia Famiglietti M, Feuermann M, Gos A, Gruaz-Gumowski N, Hulo C, Hyka-Nouspikel N, Jungo F, Le Mercier P, Lieberherr D, Masson P, Morgat A, Pedruzzi I, Pourcel L, Poux S, Rivoire C, Sundaram S, Bateman A, Bowler-Barnett E, Bye-A-Jee H, Denny P, Ignatchenko A, Ishtiaq R, Lock A, Lussi Y, Magrane M, Martin MJ, Orchard S, Raposo P, Speretta E, Tyagi N, Warner K, Zaru R, Diehl AD, Lee R, Chan J, Diamantakis S, Raciti D, Zarowiecki M, Fisher M, James-Zorn C, Ponferrada V, Zorn A, Ramachandran S, Ruzicka L, Westerfield M. The Gene Ontology knowledgebase in 2023. Genetics 2023; 224:iyad031. [PMID: 36866529 PMCID: PMC10158837 DOI: 10.1093/genetics/iyad031] [Citation(s) in RCA: 321] [Impact Index Per Article: 321.0] [Received: 12/13/2022] [Revised: 02/10/2023] [Accepted: 02/11/2023] [Indexed: 03/04/2023]
Abstract
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, which are evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), which are mechanistic models of molecular "pathways" (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.

6. Moreno P, Fexova S, George N, Manning JR, Miao Z, Mohammed S, Muñoz-Pomer A, Fullgrabe A, Bi Y, Bush N, Iqbal H, Kumbham U, Solovyev A, Zhao L, Prakash A, García-Seisdedos D, Kundu DJ, Wang S, Walzer M, Clarke L, Osumi-Sutherland D, Tello-Ruiz MK, Kumari S, Ware D, Eliasova J, Arends MJ, Nawijn MC, Meyer K, Burdett T, Marioni J, Teichmann S, Vizcaíno JA, Brazma A, Papatheodorou I. Expression Atlas update: gene and protein expression in multiple species. Nucleic Acids Res 2022; 50:D129-D140. [PMID: 34850121 PMCID: PMC8728300 DOI: 10.1093/nar/gkab1030] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Received: 09/17/2021] [Revised: 10/11/2021] [Accepted: 11/19/2021] [Indexed: 01/21/2023]
Abstract
The EMBL-EBI Expression Atlas is an added value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy to visualise form, after expert curation to accurately represent the intended experimental design, re-analysed via standardised pipelines that rely on open-source community developed tools. Each study's metadata are annotated using ontologies. The data are re-analyzed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and it is available at https://www.ebi.ac.uk/gxa.
Affiliation(s)
- Pablo Moreno
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Silvie Fexova
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Nancy George
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Jonathan R Manning
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Zhichiao Miao
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Suhaib Mohammed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Alfonso Muñoz-Pomer
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Anja Fullgrabe
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Yalan Bi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Natassja Bush
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Haider Iqbal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Upendra Kumbham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Andrey Solovyev
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Lingyun Zhao
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Ananth Prakash
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- David García-Seisdedos
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Deepti J Kundu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Shengbo Wang
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Mathias Walzer
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Laura Clarke
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- David Osumi-Sutherland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Sunita Kumari
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
  - USDA ARS NEA, Plant Soil & Nutrition Laboratory Research Unit, Ithaca, NY 14853, USA
- Jana Eliasova
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Mark J Arends
  - Edinburgh Pathology, University of Edinburgh, Institute of Genetics & Cancer, Edinburgh, UK
- Martijn C Nawijn
  - Department of Pathology and Medical Biology, GRIAC research institute, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Kerstin Meyer
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Tony Burdett
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- John Marioni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Sarah Teichmann
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Alvis Brazma
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Irene Papatheodorou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK

7. Saxena R, Bishnoi R, Singla D. Gene Ontology: application and importance in functional annotation of the genomic data. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00015-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

8. Alaguero-Cordovilla A, Gran-Gómez FJ, Jadczak P, Mhimdi M, Ibáñez S, Bres C, Just D, Rothan C, Pérez-Pérez JM. A quick protocol for the identification and characterization of early growth mutants in tomato. Plant Sci 2020; 301:110673. [PMID: 33218638 DOI: 10.1016/j.plantsci.2020.110673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/17/2020] [Revised: 09/03/2020] [Accepted: 09/07/2020] [Indexed: 06/11/2023]
Abstract
Root system architecture (RSA) manipulation may improve water and nutrient capture by plants under normal and extreme climate conditions. With the aim of initiating the genetic dissection of RSA in tomato, we established a defined ontology that allowed the curated annotation of the observed phenotypes on 12 traits at four consecutive growth stages. In addition, we established a quick approach for the molecular identification of the mutations associated with the trait-of-interest by using a whole-genome sequencing approach that does not require the building of an additional mapping population. As a proof-of-concept, we screened 4543 seedlings from 300 tomato M3 lines (Solanum lycopersicum L. cv. Micro-Tom) generated by chemical mutagenesis with ethyl methanesulfonate. We studied the growth and early development of both the root system (primary and lateral roots) and the aerial part of the seedlings as well as the wound-induced adventitious roots emerging from the hypocotyl. We identified 659 individuals (belonging to 203 M3 lines) whose early seedling and RSA phenotypes differed from those of their reference background. We confirmed the genetic segregation of the mutant phenotypes affecting primary root length, seedling viability and early RSA in 31 M4 families derived from 15 M3 lines selected in our screen. Finally, we identified a missense mutation in the SlCESA3 gene causing a seedling-lethal phenotype with short roots. Our results validated the experimental approach used for the identification of tomato mutants during early growth, which will allow the molecular identification of the genes involved.
Affiliation(s)
- Paula Jadczak
  - Instituto de Bioingeniería, Universidad Miguel Hernández, 03202, Elche, Alicante, Spain
- Mariem Mhimdi
  - Instituto de Bioingeniería, Universidad Miguel Hernández, 03202, Elche, Alicante, Spain
- Sergio Ibáñez
  - Instituto de Bioingeniería, Universidad Miguel Hernández, 03202, Elche, Alicante, Spain
- Cécile Bres
  - INRAE and University of Bordeaux, UMR 1332 Biologie du Fruit et Pathologie, F-33140, Villenave d'Ornon, France
- Daniel Just
  - INRAE and University of Bordeaux, UMR 1332 Biologie du Fruit et Pathologie, F-33140, Villenave d'Ornon, France
- Christophe Rothan
  - INRAE and University of Bordeaux, UMR 1332 Biologie du Fruit et Pathologie, F-33140, Villenave d'Ornon, France

9. Nédellec C, Ibanescu L, Bossy R, Sourdille P. WTO, an ontology for wheat traits and phenotypes in scientific publications. Genomics Inform 2020; 18:e14. [PMID: 32634868 PMCID: PMC7362939 DOI: 10.5808/gi.2020.18.2.e14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/27/2020] [Revised: 06/09/2020] [Accepted: 06/10/2020] [Indexed: 11/20/2022]
Abstract
Phenotyping is a major issue for wheat agriculture to meet the challenges of adaptation of wheat varieties to climate change and chemical input reduction in crop. The need to improve the reuse of observations and experimental data has led to the creation of reference ontologies to standardize descriptions of phenotypes and to facilitate their comparison. The scientific literature is largely under-exploited, although extremely rich in phenotype descriptions associated with cultivars and genetic information. In this paper we propose the Wheat Trait Ontology (WTO) that is suitable for the extraction and management of scientific information from scientific papers, and its combination with data from genomic and experimental databases. We describe the principles of WTO construction and show examples of WTO use for the extraction and management of phenotype descriptions obtained from scientific documents.
Affiliation(s)
- Claire Nédellec
  - Paris-Saclay University, INRAE, MaIAGE, F-78350 Jouy-en-Josas, France
- Liliana Ibanescu
  - Paris-Saclay University, INRAE, UMR MIA-Paris, AgroParisTech, F-75005, Paris, France
- Robert Bossy
  - Paris-Saclay University, INRAE, MaIAGE, F-78350 Jouy-en-Josas, France
- Pierre Sourdille
  - University Clermont-Auvergne, INRAE, UMR 1095 GDEC, F-63000 Clermont-Ferrand, France

10. Rocca-Serra P, Sansone SA. Experiment design driven FAIRification of omics data matrices, an exemplar. Sci Data 2019; 6:271. [PMID: 31831744 PMCID: PMC6908569 DOI: 10.1038/s41597-019-0286-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/29/2019] [Accepted: 10/31/2019] [Indexed: 01/03/2023]
Abstract
We outline a principled approach to data FAIRification rooted in the notions of experimental design, and whose main intent is to clarify the semantics of data matrices. Using two related metabolomics datasets associated with journal articles, we perform retrospective data and metadata curation and re-annotation, using community, open, interoperability standards. The results are semantically anchored data matrices, deposited in public archives, which are readable by software agents for data-level queries, and which can support the reproducibility and reuse of the data underpinning the publications.
Affiliation(s)
- Philippe Rocca-Serra
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, United Kingdom
- Susanna-Assunta Sansone
  - Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, United Kingdom

11. Zhang Y, Wang H. Building an information infrastructure of spectroscopic profiling data for food-drug quality and safety management. Enterp Inf Syst 2019. [DOI: 10.1080/17517575.2019.1684567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022]
Affiliation(s)
- Yinsheng Zhang
  - School of Management and E-Business, Zhejiang Gongshang University, Hangzhou, China
  - School of Information Sciences, University of Illinois at Urbana Champaign, Champaign, IL, USA
- Haiyan Wang
  - School of Management and E-Business, Zhejiang Gongshang University, Hangzhou, China

12. Tello-Ruiz MK, Naithani S, Stein JC, Gupta P, Campbell M, Olson A, Wei S, Preece J, Geniza MJ, Jiao Y, Lee YK, Wang B, Mulvaney J, Chougule K, Elser J, Al-Bader N, Kumari S, Thomason J, Kumar V, Bolser DM, Naamati G, Tapanari E, Fonseca N, Huerta L, Iqbal H, Keays M, Munoz-Pomer Fuentes A, Tang A, Fabregat A, D'Eustachio P, Weiser J, Stein LD, Petryszak R, Papatheodorou I, Kersey PJ, Lockhart P, Taylor C, Jaiswal P, Ware D. Gramene 2018: unifying comparative genomics and pathway resources for plant research. Nucleic Acids Res 2018; 46:D1181-D1189. [PMID: 29165610 PMCID: PMC5753211 DOI: 10.1093/nar/gkx1111] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 09/15/2017] [Accepted: 10/25/2017] [Indexed: 12/24/2022]
Abstract
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Joshua C Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Michael Campbell
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Matthew J Geniza
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Young Koung Lee
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.,Division of Biological Sciences and Institute for Basic Science, Wonkwang University, Iksan 54538, Korea
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Joseph Mulvaney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Noor Al-Bader
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - James Thomason
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Daniel M Bolser
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Electra Tapanari
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Nuno Fonseca
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Laura Huerta
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Haider Iqbal
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Maria Keays
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | | | - Amy Tang
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Antonio Fabregat
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Peter D'Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
| | - Joel Weiser
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
| | - Robert Petryszak
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Irene Papatheodorou
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Patti Lockhart
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
| | - Crispin Taylor
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.,USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
|
13
|
Cooper L, Meier A, Laporte MA, Elser JL, Mungall C, Sinn BT, Cavaliere D, Carbon S, Dunn NA, Smith B, Qu B, Preece J, Zhang E, Todorovic S, Gkoutos G, Doonan JH, Stevenson DW, Arnaud E, Jaiswal P. The Planteome database: an integrated resource for reference ontologies, plant genomics and phenomics. Nucleic Acids Res 2019; 46:D1168-D1180. [PMID: 29186578 PMCID: PMC5753347 DOI: 10.1093/nar/gkx1152] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 09/22/2017] [Accepted: 11/21/2017] [Indexed: 01/08/2023] Open
Abstract
The Planteome project (http://www.planteome.org) provides a suite of reference and species-specific ontologies for plants and annotations to genes and phenotypes. Ontologies serve as common standards for semantic integration of a large and growing corpus of plant genomics, phenomics and genetics data. The reference ontologies include the Plant Ontology, Plant Trait Ontology and the Plant Experimental Conditions Ontology developed by the Planteome project, along with the Gene Ontology, Chemical Entities of Biological Interest, Phenotype and Attribute Ontology, and others. The project also provides access to species-specific Crop Ontologies developed by various plant breeding and research communities from around the world. We provide integrated data on plant traits, phenotypes, and gene function and expression from 95 plant taxa, annotated with reference ontology terms. The Planteome project is developing a plant gene annotation platform; Planteome Noctua, to facilitate community engagement. All the Planteome ontologies are publicly available and are maintained at the Planteome GitHub site (https://github.com/Planteome) for sharing, tracking revisions and new requests. The annotated data are freely accessible from the ontology browser (http://browser.planteome.org/amigo) and our data repository.
Affiliation(s)
- Laurel Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Austin Meier
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | | | - Justin L Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Chris Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | | | - Seth Carbon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nathan A Dunn
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Barry Smith
- Department of Philosophy, University at Buffalo, Buffalo, NY 14260, USA
| | - Botong Qu
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Eugene Zhang
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Sinisa Todorovic
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Georgios Gkoutos
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
| | - John H Doonan
- National Plant Phenomics Centre, Institute of Biological, Environmental, and Rural Sciences, Aberystwyth University, Aberystwyth SY23 3DA, UK
| | | | - Elizabeth Arnaud
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
|
14
|
Shameer K, Naika MB, Shafi KM, Sowdhamini R. Decoding systems biology of plant stress for sustainable agriculture development and optimized food production. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2019; 145:19-39. [DOI: 10.1016/j.pbiomolbio.2018.12.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/01/2017] [Revised: 10/23/2018] [Accepted: 12/06/2018] [Indexed: 12/13/2022]
|
15
|
Pan Q, Wei J, Guo F, Huang S, Gong Y, Liu H, Liu J, Li L. Trait ontology analysis based on association mapping studies bridges the gap between crop genomics and Phenomics. BMC Genomics 2019; 20:443. [PMID: 31159731 PMCID: PMC6547493 DOI: 10.1186/s12864-019-5812-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/04/2018] [Accepted: 05/20/2019] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND: Trait ontology (TO) analysis is a powerful system for functional annotation and enrichment analysis of genes. However, given the complexity of the molecular mechanisms underlying phenomes, only a few hundred gene-to-TO relationships in plants have been elucidated to date, limiting the pace of research in this "big data" era. RESULTS: Here, we curated all the available trait associated sites (TAS) information from 79 association mapping studies of maize (Zea mays L.) and rice (Oryza sativa L.) lines with diverse genetic backgrounds and built a large-scale TAS-derived TO system for functional annotation of genes in various crops. Our TO system contains information for up to 18,042 genes (6345 in maize at the 25 k level and 11,697 in rice at the 50 k level), including gene-to-TO relationships, which covers over one fifth of the annotated gene sets for maize and rice. A comparison of Gene Ontology (GO) vs. TO analysis demonstrated that the TAS-derived TO system is an efficient alternative tool for gene functional annotation and enrichment analysis. We therefore combined information from the TO, GO, metabolic pathway, and co-expression network databases and constructed the TAS system, which is publicly available at http://tas.hzau.edu.cn. TAS provides a user-friendly interface for functional annotation of genes, enrichment analysis, genome-wide extraction of trait-associated genes, and crosschecking of different functional annotation databases. CONCLUSIONS: TAS bridges the gap between genomic and phenomic information in crops. This easy-to-use tool will be useful for geneticists, biologists, and breeders in the agricultural community, as it facilitates the dissection of molecular mechanisms conferring agronomic traits in an easy, genome-wide manner.
Affiliation(s)
- Qingchun Pan
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Junfeng Wei
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Feng Guo
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Suiyong Huang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Yong Gong
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Hao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Jianxiao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
| | - Lin Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
|
16
|
Lobet G, Paez-Garcia A, Schneider H, Junker A, Atkinson JA, Tracy S. Demystifying roots: A need for clarification and extended concepts in root phenotyping. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2019; 282:11-13. [PMID: 31003606 DOI: 10.1016/j.plantsci.2018.09.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/28/2017] [Revised: 09/05/2018] [Accepted: 09/21/2018] [Indexed: 05/13/2023]
Abstract
Plant roots have major roles in plant anchorage, resource acquisition and offer environmental benefits including carbon sequestration and soil erosion mitigation. As such, the study of root system architecture, anatomy and functional properties is of crucial interest to plant breeding, with the aim of sustainable yield production and environmental stewardship. Due to the importance of the root system studies, there is a need for clarification of terms and concepts in the root phenotyping community. In particular in this contribution, we advocate for the use of a reference naming system (ontologies) for roots and root phenes. Such uniformity would not only allow better understanding of research results, but would also enable a better sharing of data. In addition, we highlight the need to incorporate the concept of plasticity in breeding programs, as it is an essential component of root system development in heterogeneous environments.
Affiliation(s)
- Guillaume Lobet
- Agrosphere, IBG3, Forschungszentrum Jülich, Jülich, Germany; Earth and Life Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium.
| | | | - Hannah Schneider
- Department of Plant Science, The Pennsylvania State University, University Park, USA.
| | - Astrid Junker
- Acclimation Dynamics & Phenotyping Group, Dept. of Molecular Genetics at Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Germany.
| | - Jonathan A Atkinson
- School of Biosciences, University of Nottingham, Sutton Bonington Campus, UK.
| | - Saoirse Tracy
- School of Agriculture and Food Science, University College Dublin, Ireland.
|
17
|
Pommier C, Michotey C, Cornut G, Roumet P, Duchêne E, Flores R, Lebreton A, Alaux M, Durand S, Kimmel E, Letellier T, Merceron G, Laine M, Guerche C, Loaec M, Steinbach D, Laporte MA, Arnaud E, Quesneville H, Adam-Blondon AF. Applying FAIR Principles to Plant Phenotypic Data Management in GnpIS. PLANT PHENOMICS (WASHINGTON, D.C.) 2019; 2019:1671403. [PMID: 33313522 PMCID: PMC7718628 DOI: 10.34133/2019/1671403] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/08/2019] [Accepted: 04/08/2019] [Indexed: 05/19/2023]
Abstract
GnpIS is a data repository for plant phenomics that stores whole field and greenhouse experimental data including environment measures. It allows long-term access to datasets following the FAIR principles: Findable, Accessible, Interoperable, and Reusable, by using a flexible and original approach. It is based on a generic and ontology driven data model and an innovative software architecture that uncouples data integration, storage, and querying. It takes advantage of international standards including the Crop Ontology, MIAPPE, and the Breeding API. GnpIS allows handling data for a wide range of species and experiment types, including multiannual perennial plants experimental network or annual plant trials with either raw data, i.e., direct measures, or computed traits. It also ensures the integration and the interoperability among phenotyping datasets and with genotyping data. This is achieved through a careful curation and annotation of the key resources conducted in close collaboration with the communities providing data. Our repository follows the Open Science data publication principles by ensuring citability of each dataset. Finally, GnpIS compliance with international standards enables its interoperability with other data repositories hence allowing data links between phenotype and other data types. GnpIS can therefore contribute to emerging international federations of information systems.
Affiliation(s)
- C. Pommier
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - C. Michotey
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - G. Cornut
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - P. Roumet
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
| | - E. Duchêne
- UMR SVQV, 28 rue de Herrlisheim, B.P. 20507, 68021 Colmar, France
| | - R. Flores
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - A. Lebreton
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - M. Alaux
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - S. Durand
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - E. Kimmel
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - T. Letellier
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - G. Merceron
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - M. Laine
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - C. Guerche
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - M. Loaec
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - D. Steinbach
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
| | - M. A. Laporte
- Bioversity International, parc Scientifique Agropolis II, 34397 Montpellier cedex 5, France
| | - E. Arnaud
- Bioversity International, parc Scientifique Agropolis II, 34397 Montpellier cedex 5, France
| | - H. Quesneville
- URGI, INRA, Université Paris-Saclay, 78026 Versailles, France
|
18
|
Song H, Lin K, Hu J, Pang E. An Updated Functional Annotation of Protein-Coding Genes in the Cucumber Genome. FRONTIERS IN PLANT SCIENCE 2018; 9:325. [PMID: 29599790 PMCID: PMC5863696 DOI: 10.3389/fpls.2018.00325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2017] [Accepted: 02/27/2018] [Indexed: 06/08/2023]
Abstract
Background: Although the cucumber reference genome and its annotation were published several years ago, the functional annotation of predicted genes, particularly protein-coding genes, still requires further improvement. In general, accurately determining orthologous relationships between genes allows for better and more robust functional assignments of predicted genes. As one of the most reliable strategies, the determination of collinearity information may facilitate reliable orthology inferences among genes from multiple related genomes. Currently, the identification of collinear segments has mainly been based on conservation of gene order and orientation. Over the course of plant genome evolution, various evolutionary events have disrupted or distorted the order of genes along chromosomes, making it difficult to use those genes as genome-wide markers for plant genome comparisons. Results: Using the localized LASTZ/MULTIZ analysis pipeline, we aligned 15 genomes, including cucumber and other related angiosperm plants, and identified a set of genomic segments that are short in length, stable in structure, uniform in distribution and highly conserved across all 15 plants. Compared with protein-coding genes, these conserved segments were more suitable for use as genomic markers for detecting collinear segments among distantly divergent plants. Guided by this set of identified collinear genomic segments, we inferred 94,486 orthologous protein-coding gene pairs (OPPs) between cucumber and 14 other angiosperm species, which were used as proxies for transferring functional terms to cucumber genes from the annotations of the other 14 genomes. In total, 10,885 protein-coding genes were assigned Gene Ontology (GO) terms, which was nearly 1,300 more than the results collected in the UniProt proteomic database. Our results showed that annotation accuracy would be improved compared with other existing approaches.
Conclusions: In this study, we provided an alternative resource for the functional annotation of predicted cucumber protein-coding genes, which we expect will be beneficial for the cucumber's biological study, accessible from http://cmb.bnu.edu.cn/functional_annotation. Meanwhile, using the cucumber reference genome as a case study, we presented an efficient strategy for transferring gene functional information from previously well-characterized protein-coding genes in model species to newly sequenced or "non-model" plant species.
Affiliation(s)
- Hongtao Song
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
| | - Kui Lin
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
| | - Jinglu Hu
- Graduate School of Information, Production and Systems, Waseda University, Kitakyushu-shi, Japan
| | - Erli Pang
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
|
19
|
Endara L, Cui H, Burleigh JG. Extraction of phenotypic traits from taxonomic descriptions for the tree of life using natural language processing. APPLICATIONS IN PLANT SCIENCES 2018; 6:e1035. [PMID: 29732265 PMCID: PMC5895189 DOI: 10.1002/aps3.1035] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/05/2017] [Accepted: 01/31/2018] [Indexed: 05/09/2023]
Abstract
PREMISE OF THE STUDY: Phenotypic data sets are necessary to elucidate the genealogy of life, but assembling phenotypic data for taxa across the tree of life can be technically challenging and prohibitively time consuming. We describe a semi-automated protocol to facilitate and expedite the assembly of phenotypic character matrices of plants from formal taxonomic descriptions. This pipeline uses new natural language processing (NLP) techniques and a glossary of over 9000 botanical terms. METHODS AND RESULTS: Our protocol includes the Explorer of Taxon Concepts (ETC), an online application that assembles taxon-by-character matrices from taxonomic descriptions, and MatrixConverter, a Java application that enables users to evaluate and discretize the characters extracted by ETC. We demonstrate this protocol using descriptions from Araucariaceae. CONCLUSIONS: The NLP pipeline unlocks the phenotypic data found in taxonomic descriptions and makes them usable for evolutionary analyses.
Affiliation(s)
- Lorena Endara
- Department of Biology, University of Florida, Gainesville, Florida 32611, USA
| | - Hong Cui
- School of Information, University of Arizona, Tucson, Arizona 85719, USA
|
20
|
Harper L, Campbell J, Cannon EKS, Jung S, Poelchau M, Walls R, Andorf C, Arnaud E, Berardini TZ, Birkett C, Cannon S, Carson J, Condon B, Cooper L, Dunn N, Elsik CG, Farmer A, Ficklin SP, Grant D, Grau E, Herndon N, Hu ZL, Humann J, Jaiswal P, Jonquet C, Laporte MA, Larmande P, Lazo G, McCarthy F, Menda N, Mungall CJ, Munoz-Torres MC, Naithani S, Nelson R, Nesdill D, Park C, Reecy J, Reiser L, Sanderson LA, Sen TZ, Staton M, Subramaniam S, Tello-Ruiz MK, Unda V, Unni D, Wang L, Ware D, Wegrzyn J, Williams J, Woodhouse M, Yu J, Main D. AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture. Database (Oxford) 2018; 2018:5096675. [PMID: 30239679 PMCID: PMC6146126 DOI: 10.1093/database/bay088] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 03/29/2018] [Revised: 07/19/2018] [Accepted: 07/30/2018] [Indexed: 01/07/2023]
Abstract
The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.
Affiliation(s)
- Lisa Harper
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | | | - Ethalinda K S Cannon
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
- Computer Science, Iowa State University, Ames, IA, USA
| | - Sook Jung
- Horticulture, Washington State University, Pullman, WA, USA
| | - Monica Poelchau
- National Agricultural Library, USDA Agricultural Research Service, Beltsville, MD, USA
| | | | - Carson Andorf
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
- Computer Science, Iowa State University, Ames, IA, USA
| | - Elizabeth Arnaud
- Bioversity International, Informatics Unit, Conservation and Availability Programme, Parc Scientifique Agropolis II, Montpellier, France
| | - Tanya Z Berardini
- The Arabidopsis Information Resource, Phoenix Bioinformatics, Fremont, CA, USA
| | | | - Steve Cannon
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - James Carson
- Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX, USA
| | - Bradford Condon
- Entomology and Plant Pathology, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Laurel Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Nathan Dunn
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christine G Elsik
- Division of Animal Sciences and Division of Plant Sciences, University of Missouri, Columbia, MO, USA
| | - Andrew Farmer
- National Center for Genome Resources, Santa Fe, NM, USA
| | | | - David Grant
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - Emily Grau
- National Center for Genome Resources, Santa Fe, NM, USA
| | - Nic Herndon
- Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Zhi-Liang Hu
- Animal Science, Iowa State University, Ames, USA
| | - Jodi Humann
- Horticulture, Washington State University, Pullman, WA, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Clement Jonquet
- Laboratory of Informatics, Robotics, Microelectronics of Montpellier, University of Montpellier & CNRS, Montpellier, France
| | - Marie-Angélique Laporte
- Bioversity International, Informatics Unit, Conservation and Availability Programme, Parc Scientifique Agropolis II, Montpellier, France
| | | | - Gerard Lazo
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA, USA
| | - Fiona McCarthy
- School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ, USA
| | | | | | | | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Rex Nelson
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - Daureen Nesdill
- Marriott Library, University of Utah, Salt Lake City, UT, USA
| | - Carissa Park
- Animal Science, Iowa State University, Ames, USA
| | - James Reecy
- Animal Science, Iowa State University, Ames, USA
| | - Leonore Reiser
- The Arabidopsis Information Resource, Phoenix Bioinformatics, Fremont, CA, USA
| | | | - Taner Z Sen
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA, USA
| | - Margaret Staton
- Entomology and Plant Pathology, University of Tennessee Knoxville, Knoxville, TN, USA
| | | | | | - Victor Unda
- Horticulture, Washington State University, Pullman, WA, USA
| | - Deepak Unni
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Liya Wang
- Plant Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Doreen Ware
- USDA, Plant, Soil and Nutrition Research, Ithaca, NY, USA
- Plant Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jill Wegrzyn
- Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jason Williams
- Cold Spring Harbor Laboratory, DNA Learning Center, Cold Spring Harbor, NY, USA
| | - Margaret Woodhouse
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - Jing Yu
- Horticulture, Washington State University, Pullman, WA, USA
| | - Doreen Main
- Horticulture, Washington State University, Pullman, WA, USA
|
21
|
Balduzzi M, Binder BM, Bucksch A, Chang C, Hong L, Iyer-Pascuzzi AS, Pradal C, Sparks EE. Reshaping Plant Biology: Qualitative and Quantitative Descriptors for Plant Morphology. FRONTIERS IN PLANT SCIENCE 2017; 8:117. [PMID: 28217137 PMCID: PMC5289971 DOI: 10.3389/fpls.2017.00117] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/23/2016] [Accepted: 01/19/2017] [Indexed: 05/04/2023]
Abstract
An emerging challenge in plant biology is to develop qualitative and quantitative measures to describe the appearance of plants through the integration of mathematics and biology. A major hurdle in developing these metrics is finding common terminology across fields. In this review, we define approaches for analyzing plant geometry, topology, and shape, and provide examples for how these terms have been and can be applied to plants. In leaf morphological quantifications both geometry and shape have been used to gain insight into leaf function and evolution. For the analysis of cell growth and expansion, we highlight the utility of geometric descriptors for understanding sepal and hypocotyl development. For branched structures, we describe how topology has been applied to quantify root system architecture to lend insight into root function. Lastly, we discuss the importance of using morphological descriptors in ecology to assess how communities interact, function, and respond within different environments. This review aims to provide a basic description of the mathematical principles underlying morphological quantifications.
Affiliation(s)
| | - Brad M. Binder
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee-Knoxville, Knoxville, TN, USA
| | - Alexander Bucksch
- Department of Plant Biology, University of Georgia, Athens, GA, USA
- Warnell School of Forestry and Environmental Resources, University of Georgia, Athens, GA, USA
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Cynthia Chang
- Division of Biological Sciences, University of Washington-Bothell, Bothell, WA, USA
| | - Lilan Hong
- Weill Institute for Cell and Molecular Biology and Section of Plant Biology, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, USA
| | | | - Christophe Pradal
- INRIA, Virtual Plants, Montpellier, France
- CIRAD, UMR AGAP, Montpellier, France
|
22
|
Nakamura Y, Kudo T, Terashima S, Saito M, Nambara E, Yano K. CATchUP: A Web Database for Spatiotemporally Regulated Genes. PLANT & CELL PHYSIOLOGY 2017; 58:e3. [PMID: 28013273 DOI: 10.1093/pcp/pcw199] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/28/2016] [Accepted: 11/06/2016] [Indexed: 06/06/2023]
Abstract
For proper control of biological activity, some key genes are highly expressed in a particular spatiotemporal domain. Mining of such spatiotemporally expressed genes using large-scale gene expression data derived from a broad range of experimental sources facilitates our understanding of genome-scale functional gene networks. However, comprehensive information on spatiotemporally expressed genes is lacking in plants. To collect such information, we devised a new index, Δdmax, which is the maximum difference in relative gene expression levels between sample runs which are neighboring when sorted by the levels. Employing this index, we comprehensively evaluated transcripts using large-scale RNA sequencing (RNA-Seq) data stored in the Sequence Read Archive for eight plant species: Arabidopsis thaliana (Arabidopsis), Solanum lycopersicum (tomato), Solanum tuberosum (potato), Oryza sativa (rice), Sorghum bicolor (sorghum), Vitis vinifera (grape), Medicago truncatula (Medicago), and Glycine max (soybean). Based on the frequency distribution of the Δdmax values, approximately 70,000 transcripts showing 0.3 or larger Δdmax values were extracted for the eight species. Information on these genes including the Δdmax values, functional annotations, conservation among species, and experimental conditions where the genes show high expression levels is provided in a new database, CATchUP (http://plantomics.mind.meiji.ac.jp/CATchUP). The CATchUP database assists in identifying genes specifically expressed under particular conditions with powerful search functions and an intuitive graphical user interface.
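The Δdmax index described in this abstract can be illustrated with a minimal sketch. The abstract does not say how "relative" expression levels are computed, so the code below assumes each value is scaled by the transcript's maximum across runs; the function name `delta_dmax` and the sample values are hypothetical and are not taken from the CATchUP implementation.

```python
def delta_dmax(expression_levels):
    """Largest gap between neighbouring values once relative levels are sorted."""
    peak = max(expression_levels, default=0)
    if peak <= 0 or len(expression_levels) < 2:
        return 0.0
    # Assumption: "relative" means each run is divided by the maximum across runs.
    relative = sorted(level / peak for level in expression_levels)
    # Compare each sorted value with its immediate neighbour and keep the widest gap.
    return max(high - low for low, high in zip(relative, relative[1:]))


# Hypothetical transcript: low in three runs, sharply higher in one run.
specific = delta_dmax([1, 2, 10, 2])
# Hypothetical transcript: expression rises evenly across runs.
gradual = delta_dmax([2, 4, 6, 8, 10])
print(specific >= 0.3, gradual >= 0.3)
```

Under these assumptions, a transcript expressed mainly in one run produces a large gap and would pass the 0.3 cutoff mentioned in the abstract, while an evenly graded transcript would not.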
Affiliation(s)
- Yukino Nakamura
- Bioinformatics Laboratory, School of Agriculture, Meiji University, Higashi-mita, Tama-ku, Kawasaki, Kanagawa, Japan
| | - Toru Kudo
- Bioinformatics Laboratory, School of Agriculture, Meiji University, Higashi-mita, Tama-ku, Kawasaki, Kanagawa, Japan
| | - Shin Terashima
- Bioinformatics Laboratory, School of Agriculture, Meiji University, Higashi-mita, Tama-ku, Kawasaki, Kanagawa, Japan
| | - Misa Saito
- Bioinformatics Laboratory, School of Agriculture, Meiji University, Higashi-mita, Tama-ku, Kawasaki, Kanagawa, Japan
| | - Eiji Nambara
- Department of Cell & Systems Biology, University of Toronto, Willcocks Street, Toronto, Ontario, Canada
| | - Kentaro Yano
- Bioinformatics Laboratory, School of Agriculture, Meiji University, Higashi-mita, Tama-ku, Kawasaki, Kanagawa, Japan
|