1. Rele CP, Sandlin KM, Leung W, Reed LK. Manual annotation of Drosophila genes: a Genomics Education Partnership protocol. F1000Res 2023; 11:1579. [PMID: 37854289 PMCID: PMC10579860 DOI: 10.12688/f1000research.126839.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/26/2023] [Indexed: 10/20/2023] Open
Abstract
Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partnership (GEP; https://thegep.org/) developed a structural annotation protocol for protein-coding genes that enables undergraduate student and faculty researchers to create high-quality gene annotations that can be used in subsequent scientific investigations. For example, GEP faculty have used this protocol to engage undergraduate students in the comparative annotation of genes involved in the insulin signaling pathway in 27 Drosophila species, using D. melanogaster as the reference genome. Students construct gene models using multiple lines of computational and empirical evidence, including expression data (e.g., RNA-Seq), sequence similarity (e.g., BLAST and multiple sequence alignment), and computational gene predictions. Quality control measures require that each gene be annotated by at least two students working independently, followed by reconciliation of the submitted gene models by a more experienced student. This article provides an overview of the annotation protocol and describes how discrepancies in student-submitted gene models are resolved to produce a final, high-quality gene set suitable for subsequent analyses. The protocol can be adapted to other scientific questions (e.g., expansion of the Drosophila Muller F element) and species (e.g., parasitoid wasps) to provide additional opportunities for undergraduate students to participate in genomics research. These student annotation efforts can substantially improve the quality of gene annotations in publicly available genomic databases.
Affiliation(s)
- Chinmay P. Rele
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
- Katie M. Sandlin
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
- Wilson Leung
  - Department of Biology, Washington University in St. Louis, St. Louis, Missouri, 63130, USA
- Laura K. Reed
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama, 35487, USA
2. Diogo-Jr R, de Resende Von Pinho EV, Pinto RT, Zhang L, Condori-Apfata JA, Pereira PA, Vilela DR. Maize heat shock proteins-prospection, validation, categorization and in silico analysis of the different ZmHSP families. Stress Biology 2023; 3:37. [PMID: 37981586 PMCID: PMC10482818 DOI: 10.1007/s44154-023-00104-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 07/05/2023] [Indexed: 11/21/2023]
Abstract
Among the plant molecular mechanisms capable of effectively mitigating the effects of adverse weather conditions, the heat shock proteins (HSPs), a group of chaperones with multiple functions, stand out. With the omic sciences advancing rapidly, HSPs look very promising for genetic engineering, especially for developing superior genotypes that are potentially tolerant to abiotic stresses (AbSts). Recently, some studies of certain families of maize HSPs (ZmHSPs) were published. However, a study that brings them all together under rigorous criteria was still lacking. Using distinct but complementary strategies, we prospected as many ZmHSP candidates as possible, gathering more than a thousand accessions. After detailed data mining, we validated 182 of them, belonging to seven families, which were subcategorized into classes with potential for functional parity. In them, we identified dozens of motifs with some degree of similarity to proteins from different kingdoms, which may help explain some of their still poorly understood modes of action. Through in silico and in vitro approaches, we compared their expression levels after controlled exposure to several sources of AbSts, applied to diverse tissues at varied phenological stages. Based on gene ontology concepts, we also analyzed them from different perspectives of term enrichment. We also searched model plants and closely related species for potentially orthologous genes. With these new insights, which culminated in extensive supplementary material rich in tables, we aim to provide a useful reference for maize researchers interested in these stress proteins.
Affiliation(s)
- Rubens Diogo-Jr
  - Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN, (47907), USA
  - Department of Agriculture, Federal University of Lavras (UFLA), Lavras, MG, (37200-900), Brazil
- Renan Terassi Pinto
  - Faculty of Philosophy and Sciences at Ribeirao Preto, University of Sao Paulo (USP), Ribeirao Preto, SP, (14040-901), Brazil
- Lingrui Zhang
  - Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN, (47907), USA
- Jorge Alberto Condori-Apfata
  - Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, IN, (47907), USA
  - Faculty of Engineering and Agricultural Sciences, Universidad Nacional Toribio Rodriguez de Mendoza de Amazonas (UNTRM), Chachapoyas, AM, (01001), Peru
- Paula Andrade Pereira
  - Department of Agriculture, Federal University of Lavras (UFLA), Lavras, MG, (37200-900), Brazil
- Danielle Rezende Vilela
  - Department of Agriculture, Federal University of Lavras (UFLA), Lavras, MG, (37200-900), Brazil
3. Grimplet J. Genomic and Bioinformatic Resources for Perennial Fruit Species. Curr Genomics 2022; 23:217-233. [PMID: 36777875 PMCID: PMC9875543 DOI: 10.2174/1389202923666220428102632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/12/2022] [Accepted: 03/12/2022] [Indexed: 11/22/2022] Open
Abstract
In the post-genomic era, data management and development of bioinformatic tools are critical for the adequate exploitation of genomics data. In this review, we address the current situation for the subset of crops represented by the perennial fruit species. The agronomic singularity of these species compared to plant and crop model species poses significant challenges for the implementation of good practices that are generally not addressed in other species. Studies are usually performed over several years in non-controlled environments, usage of rootstock is common, and breeders rely heavily on vegetative propagation. A reference genome is now available for all the major species as well as for many members of the genera that are economically important for breeding purposes. Development of pangenomes for these species is beginning to gain momentum, which will require a substantial effort in terms of bioinformatic tool development. The available tools for genome annotation and functional analysis are also presented.
Affiliation(s)
- Jérôme Grimplet
  - Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), Unidad de Hortofruticultura, Gobierno de Aragón, Avda. Montañana, Zaragoza, Spain
  - Instituto Agroalimentario de Aragón–IA2 (CITA-Universidad de Zaragoza), Calle Miguel Servet, Zaragoza, Spain
  - Address correspondence to this author at the two addresses above; Tel: +34976713635
4. Finkers R, van Kaauwen M, Ament K, Burger-Meijer K, Egging R, Huits H, Kodde L, Kroon L, Shigyo M, Sato S, Vosman B, van Workum W, Scholten O. Insights from the first genome assembly of Onion (Allium cepa). G3 (Bethesda, Md.) 2021; 11. [PMID: 34544132 DOI: 10.1101/2021.03.05.434149] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/29/2021] [Accepted: 07/06/2021] [Indexed: 05/18/2023]
Abstract
Onion is an important vegetable crop with an estimated genome size of 16 Gb. We describe the de novo assembly and ab initio annotation of the genome of a doubled haploid onion line DHCU066619, which resulted in a final assembly of 14.9 Gb with an N50 of 464 Kb. Of this, 2.4 Gb was ordered into eight pseudomolecules using four genetic linkage maps. The remainder of the genome is available in 89.6 K scaffolds. Only 72.4% of the genome could be identified as repetitive sequences, which consist, to a large extent, of (retro)transposons. In addition, an estimated 20% of the putative (retro)transposons had accumulated a large number of mutations, hampering their identification but facilitating their assembly. These elements are probably already quite old. The ab initio gene prediction indicated 540,925 putative gene models, which is far more than expected, possibly due to the presence of pseudogenes. Of these models, 47,066 showed RNASeq support. No gene-rich regions were found; genes are uniformly distributed over the genome. Analysis of synteny with Allium sativum (garlic) showed collinearity but also major rearrangements between the two species. This assembly is the first high-quality genome sequence available for the study of onion and will be a valuable resource for further research.
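The abstract above reports assembly contiguity as an N50 value. As a minimal sketch of how that statistic is commonly computed (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper's pipeline): sort the sequence lengths from longest to shortest and return the length at which the running total first reaches half of the assembly size.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with hypothetical scaffold lengths (arbitrary units).
toy_scaffolds = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_scaffolds)
```

With the toy lengths, the total is 30, so the running sum reaches the halfway point (15) at the second-longest scaffold.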
Affiliation(s)
- Richard Finkers
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Martijn van Kaauwen
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Kai Ament
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Karin Burger-Meijer
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Henk Huits
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Linda Kodde
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Laurens Kroon
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Masayoshi Shigyo
  - Laboratory of Vegetable Crop Science, College of Agriculture, Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi City, Yamaguchi 753-8515, Japan
- Shusei Sato
  - Graduate School of Life Sciences, Tohoku University, Sendai 980-8577, Japan
- Ben Vosman
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Olga Scholten
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
5. Finkers R, van Kaauwen M, Ament K, Burger-Meijer K, Egging R, Huits H, Kodde L, Kroon L, Shigyo M, Sato S, Vosman B, van Workum W, Scholten O. Insights from the first genome assembly of Onion (Allium cepa). G3 (Bethesda, Md.) 2021; 11:jkab243. [PMID: 34544132 PMCID: PMC8496297 DOI: 10.1093/g3journal/jkab243] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 04/29/2021] [Accepted: 07/06/2021] [Indexed: 11/17/2022]
Abstract
Onion is an important vegetable crop with an estimated genome size of 16 Gb. We describe the de novo assembly and ab initio annotation of the genome of a doubled haploid onion line DHCU066619, which resulted in a final assembly of 14.9 Gb with an N50 of 464 Kb. Of this, 2.4 Gb was ordered into eight pseudomolecules using four genetic linkage maps. The remainder of the genome is available in 89.6 K scaffolds. Only 72.4% of the genome could be identified as repetitive sequences, which consist, to a large extent, of (retro)transposons. In addition, an estimated 20% of the putative (retro)transposons had accumulated a large number of mutations, hampering their identification but facilitating their assembly. These elements are probably already quite old. The ab initio gene prediction indicated 540,925 putative gene models, which is far more than expected, possibly due to the presence of pseudogenes. Of these models, 47,066 showed RNASeq support. No gene-rich regions were found; genes are uniformly distributed over the genome. Analysis of synteny with Allium sativum (garlic) showed collinearity but also major rearrangements between the two species. This assembly is the first high-quality genome sequence available for the study of onion and will be a valuable resource for further research.
Affiliation(s)
- Richard Finkers
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Martijn van Kaauwen
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Kai Ament
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Karin Burger-Meijer
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Henk Huits
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Linda Kodde
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Laurens Kroon
  - Bejo Zaden B.V., 1749 CZ Warmerhuizen, The Netherlands
- Masayoshi Shigyo
  - Laboratory of Vegetable Crop Science, College of Agriculture, Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi City, Yamaguchi 753-8515, Japan
- Shusei Sato
  - Graduate School of Life Sciences, Tohoku University, Sendai 980-8577, Japan
- Ben Vosman
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
- Olga Scholten
  - Plant Breeding, Wageningen University and Research Centre, 6700 AA Wageningen, The Netherlands
6. Nwadiugwu MC. Expression, Interaction, and Role of Pseudogene Adh6-ps1 in Cancer Phenotypes. Bioinform Biol Insights 2021; 15:11779322211040591. [PMID: 34413637 PMCID: PMC8369952 DOI: 10.1177/11779322211040591] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/19/2021] [Accepted: 07/26/2021] [Indexed: 01/15/2023] Open
Abstract
Pseudogenes have been classified as functionless and their annotation is an ongoing problem. Adh6-ps1, a mouse pseudogene belonging to the alcohol dehydrogenase gene complex (Adh), was analyzed to review its conservation, homology, expression, and interactions and to identify any role it plays in disease phenotypes using bioinformatics databases. Results showed that Adh6-ps1 has 2 transcripts (processed and unprocessed), which may have emerged from a transposition and a duplication event, respectively, and that induced inversions (Uox gene, In(3)11Rk) involving gene complexes associated with Adh6-ps1 have been implicated in a diverse range of diseases. Adh6-ps1 is highly conserved in vertebrates, particularly rodents, and is expressed in the liver. The top 5 miRNA targets were Mir455, Mir511, Mir1903, Mir361, and Mir669o. While much is unknown about Mir1903 and Mir669o, the silencing of Mir455 and Mir511 is linked with hepatocellular carcinoma (HCC), and Mir361 is implicated in endometrial cancers. Given the identified miRNA interactions with Adh6-ps1 and its expression in HCC and reproductive systems, it may well have a role in tumorigenesis and disease phenotypes. Nonetheless, further studies are required to establish these findings and to add to the growing efforts to understand pseudogenes and their potential involvement in disease conditions.
Affiliation(s)
- Martin C Nwadiugwu
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA
7. Olson AJ, Ware D. Ranked choice voting for representative transcripts with TRaCE. Bioinformatics 2021; 38:261-264. [PMID: 34297055 PMCID: PMC8696091 DOI: 10.1093/bioinformatics/btab542] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2020] [Revised: 06/09/2021] [Accepted: 07/22/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: Genome sequencing projects annotate protein-coding gene models with multiple transcripts, aiming to represent all of the available transcript evidence. However, downstream analyses often operate on only one representative transcript per gene locus, sometimes known as the canonical transcript. To choose canonical transcripts, Transcript Ranking and Canonical Election (TRaCE) holds an 'election' in which a set of RNA-seq samples rank transcripts by annotation edit distance. These sample-specific votes are tallied along with other criteria such as protein length and InterPro domain coverage. The winner is selected as the canonical transcript, but the election proceeds through multiple rounds of voting to order all the transcripts by relevance. Based on the set of expression data provided, TRaCE can identify the most common isoforms from a broad expression atlas or prioritize alternative transcripts expressed in specific contexts.
AVAILABILITY AND IMPLEMENTATION: Transcript ranking code can be found on GitHub at https://github.com/warelab/TRaCE.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
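To make the 'election' idea in the abstract above concrete, here is a generic instant-runoff sketch in which each RNA-seq sample contributes a ranked ballot of transcript IDs and repeated elections order all transcripts. This is an illustration only, not the TRaCE implementation: the function names, the alphabetical tie-breaking, and the toy ballots are assumptions, and the additional criteria the abstract mentions (annotation edit distance, protein length, InterPro domain coverage) are not modeled.

```python
from collections import Counter


def instant_runoff_winner(ballots, candidates):
    """Pick one winner by instant-runoff voting (generic sketch).

    ballots: list of rankings, each a list of candidate IDs, best first.
    candidates: non-empty iterable of IDs still eligible in this election.
    Ties are broken alphabetically, which is an arbitrary assumption.
    """
    remaining = set(candidates)
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked remaining candidate.
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        ranked = sorted(tally, key=lambda c: (-tally[c], c))
        leader, loser = ranked[0], ranked[-1]
        total = sum(tally.values())
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader
        remaining.discard(loser)  # eliminate the weakest and recount


def rank_transcripts(ballots, candidates):
    """Order all candidates by holding successive elections."""
    remaining = list(candidates)
    order = []
    while remaining:
        winner = instant_runoff_winner(ballots, remaining)
        order.append(winner)
        remaining.remove(winner)
    return order


# Hypothetical ballots: one ranking of transcript IDs per RNA-seq sample.
sample_ballots = [
    ["t1", "t2", "t3"],
    ["t1", "t3", "t2"],
    ["t2", "t3", "t1"],
    ["t3", "t2", "t1"],
    ["t2", "t1", "t3"],
]
ordering = rank_transcripts(sample_ballots, ["t1", "t2", "t3"])
```

In the toy data no transcript has a first-round majority, so the weakest (t3) is eliminated and its ballot transfers to t2, which then wins; the first element of `ordering` plays the role of the canonical transcript.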
Affiliation(s)
- Doreen Ware
  - Plant Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11768, USA
  - USDA ARS Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY 14853, USA
8. Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 01/27/2023] Open
Abstract
Gramene (http://www.gramene.org), a knowledgebase founded on comparative functional analyses of genomic and pathway data for model plants and major crops, supports agricultural researchers worldwide. The resource is committed to open access and reproducible science based on the FAIR data principles. Since the last NAR update, we made nine releases; doubled the genome portal's content; expanded curated genes, pathways and expression sets; and implemented the Domain Informational Vocabulary Extraction (DIVE) algorithm for extracting gene function information from publications. The current release, #63 (October 2020), hosts 93 reference genomes (over 3.9 million genes in 122,947 families with orthologous and paralogous classifications). Plant Reactome portrays pathway networks using a combination of manual biocuration in rice (320 reference pathways) and orthology-based projections to 106 species. The Reactome platform facilitates comparison between reference and projected pathways, gene expression analyses and overlays of gene-gene interactions. Gramene integrates ontology-based protein structure-function annotation; information on genetic, epigenetic, expression, and phenotypic diversity; and gene functional annotations extracted from plant-focused journals using DIVE. We train plant researchers in biocuration of genes and pathways; host curated maize gene structures as tracks in the maize genome browser; and integrate curated rice genes and pathways in the Plant Reactome.
Affiliation(s)
- Sushma Naithani
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Parul Gupta
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Andrew Olson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sharon Wei
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Preece
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Yinping Jiao
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bo Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Priyanka Garg
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Justin Elser
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Sunita Kumari
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Vivek Kumar
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bruno Contreras-Moreira
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Guy Naamati
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Nancy George
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Justin Cook
  - Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto M5G 1L7, Canada
- Daniel Bolser
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
  - Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
- Peter D'Eustachio
  - Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
- Lincoln D Stein
  - Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Amit Gupta
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
- Weijia Xu
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
- Jennifer Regala
  - American Society of Plant Biologists, Rockville, MD 20855-2768, USA
  - Current affiliation: American Urological Association, Linthicum, MD 21090, USA
- Irene Papatheodorou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Paul J Kersey
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
  - Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Crispin Taylor
  - American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Pankaj Jaiswal
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
  - USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
9. Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979/5973447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 05/20/2023] Open
Abstract
Gramene (http://www.gramene.org), a knowledgebase founded on comparative functional analyses of genomic and pathway data for model plants and major crops, supports agricultural researchers worldwide. The resource is committed to open access and reproducible science based on the FAIR data principles. Since the last NAR update, we made nine releases; doubled the genome portal's content; expanded curated genes, pathways and expression sets; and implemented the Domain Informational Vocabulary Extraction (DIVE) algorithm for extracting gene function information from publications. The current release, #63 (October 2020), hosts 93 reference genomes (over 3.9 million genes in 122,947 families with orthologous and paralogous classifications). Plant Reactome portrays pathway networks using a combination of manual biocuration in rice (320 reference pathways) and orthology-based projections to 106 species. The Reactome platform facilitates comparison between reference and projected pathways, gene expression analyses and overlays of gene-gene interactions. Gramene integrates ontology-based protein structure-function annotation; information on genetic, epigenetic, expression, and phenotypic diversity; and gene functional annotations extracted from plant-focused journals using DIVE. We train plant researchers in biocuration of genes and pathways; host curated maize gene structures as tracks in the maize genome browser; and integrate curated rice genes and pathways in the Plant Reactome.
Affiliation(s)
- Sushma Naithani
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Parul Gupta
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Andrew Olson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Sharon Wei
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin Preece
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Yinping Jiao
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bo Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Priyanka Garg
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Justin Elser
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Sunita Kumari
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Vivek Kumar
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bruno Contreras-Moreira
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Guy Naamati
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Nancy George
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Justin Cook
  - Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto M5G 1L7, Canada
- Daniel Bolser
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
  - Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
- Peter D'Eustachio
  - Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
- Lincoln D Stein
  - Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Amit Gupta
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
- Weijia Xu
  - Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
- Jennifer Regala
  - American Society of Plant Biologists, Rockville, MD 20855-2768, USA
  - Current affiliation: American Urological Association, Linthicum, MD 21090, USA
- Irene Papatheodorou
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Paul J Kersey
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
  - Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Crispin Taylor
  - American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Pankaj Jaiswal
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
  - USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA