1
Liu FF, Wan YX, Cao WX, Zhang QQ, Li Y, Li Y, Zhang PZ, Si HQ. Novel function of a putative TaCOBL ortholog associated with cold response. Mol Biol Rep 2023; 50:4375-4384. [PMID: 36944863 DOI: 10.1007/s11033-023-08297-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 01/19/2023] [Indexed: 03/23/2023]
Abstract
The plant COBRA protein family plays an important role in secondary cell wall biosynthesis and the orientation of cell expansion. The COBRA gene family has been well studied in Arabidopsis thaliana, maize, rice, and other species, but no systematic studies have been conducted in wheat. In this study, the full-length sequences of TaCOBLs were obtained by homology cloning from wheat, and a conserved motif analysis confirmed that the TaCOBLs belong to the COBRA protein family. qRT-PCR results showed that the TaCOBL transcripts were induced by abiotic stresses, including cold, drought, salinity, and abscisic acid (ABA). Two haplotypes of TaCOBL-5B (Hap5B-a and Hap5B-b), harboring one indel (----/TATA) in the 5' flanking region (-550 bp), were found on chromosome 5BS. A co-dominant marker, Ta5BF/Ta5BR, was developed based on the polymorphism of the two TaCOBL-5B haplotypes. Significant correlations between the two TaCOBL-5B haplotypes and cold resistance were observed under four environmental conditions. Hap5B-a, a favored haplotype acquired during wheat polyploidization, may positively contribute to enhanced cold resistance in wheat. Based on the promoter activity analysis, the Hap5B-a promoter containing a TATA-box was more active than that of Hap5B-b without the TATA-box under low temperature. Our study provides valuable information indicating that the TaCOBL genes are associated with cold response in wheat.
Affiliation(s)
- Fang-Fang Liu
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Ying-Xiu Wan
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Wen-Xin Cao
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Qi-Qi Zhang
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Yao Li
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Yan Li
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China
- Ping-Zhi Zhang
- Crop Research Institute, Anhui Academy of Agricultural Sciences, Anhui Key Laboratory of Crop Quality Improvement, Hefei, 230031, China.
- Hong-Qi Si
- School of Agronomy, Anhui Agricultural University, Hefei, 230036, China.
2
Niehues H, van der Krieken DA, Ederveen THA, Jansen PAM, van Niftrik L, Mesman R, Netea MG, Smits JPH, Schalkwijk J, van den Bogaard EH, Zeeuwen PLJM. Antimicrobial late cornified envelope (LCE) proteins: the psoriasis risk factor LCE3B/C-del affects microbiota composition. J Invest Dermatol 2021; 142:1947-1955.e6. [PMID: 34942199 DOI: 10.1016/j.jid.2021.11.036] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/06/2021] [Revised: 11/03/2021] [Accepted: 11/22/2021] [Indexed: 12/20/2022]
Abstract
Late cornified envelope (LCE) proteins are predominantly expressed in the skin and other cornified epithelia. Based on sequence similarity, this eighteen-member homologous gene family has been subdivided into six groups. The LCE3 proteins have been the focus of dermatological research, as the combined deletion of LCE3B and LCE3C genes (LCE3B/C-del) is a risk factor for psoriasis. We previously reported that LCE3B/C-del increases expression of the LCE3A gene and that LCE3 proteins exert antibacterial activity. In the current study we analyzed the antimicrobial properties of other family members and the role of LCE3B/C-del in modulation of microbiota composition of the skin and oral cavity. Differences in killing efficiency and specificity between the LCE proteins and their target microbes were found, and the amino acid content, rather than the order, of the well-conserved central domain of the LCE3A protein was found responsible for its antibacterial activity. In vivo, LCE3B/C-del correlated with a higher beta-diversity in the skin and oral microbiota. From these results we conclude that all LCE proteins possess antimicrobial activity. Tissue-specific and genotype-dependent antimicrobial protein profiles impact skin and oral microbiota composition, which could direct towards LCE3B/C-del associated dysbiosis and a possible role for microbiota in the pathophysiology of psoriasis.
Affiliation(s)
- Hanna Niehues
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Danique A van der Krieken
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Thomas H A Ederveen
- Center for Molecular and Biomolecular Informatics, RIMLS, Radboudumc, Nijmegen, The Netherlands
- Patrick A M Jansen
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Laura van Niftrik
- Department of Microbiology, Institute for Water and Wetland Research, Faculty of Science, Radboud University, Nijmegen, The Netherlands
- Rob Mesman
- Department of Microbiology, Institute for Water and Wetland Research, Faculty of Science, Radboud University, Nijmegen, The Netherlands
- Mihai G Netea
- Department of Internal Medicine, RIMLS, Radboudumc, Nijmegen, The Netherlands; Department of Immunology and Metabolism, Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
- Jos P H Smits
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Joost Schalkwijk
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Ellen H van den Bogaard
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands
- Patrick L J M Zeeuwen
- Department of Dermatology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Nijmegen Medical Center (Radboudumc), Nijmegen, The Netherlands.
3
Abstract
Advances in genomics have made whole genome studies increasingly feasible across the life sciences. However, new technologies and algorithmic advances do not guarantee flawless genomic sequences or annotation. Bias, errors, and artifacts can enter at any stage of the process from library preparation to annotation. When planning an experiment that utilizes a genome sequence as the basis for the design, there are a few basic checks that, if performed, may better inform the experimental design and ideally help avoid a failed experiment or inconclusive result.
4
Silvester N, Alako B, Amid C, Cerdeño-Tarrága A, Clarke L, Cleland I, Harrison PW, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martínez-Villacorta J, Menchi M, Reddy K, Pakseresht N, Rajan J, Rossello M, Smirnov D, Toribio AL, Vaughan D, Zalunin V, Cochrane G. The European Nucleotide Archive in 2017. Nucleic Acids Res 2019; 46:D36-D40. [PMID: 29140475 PMCID: PMC5753375 DOI: 10.1093/nar/gkx1125] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 10/06/2017] [Accepted: 10/25/2017] [Indexed: 12/03/2022]
Abstract
For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world’s public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.
Affiliation(s)
- Nicole Silvester
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Blaise Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ana Cerdeño-Tarrága
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Iain Cleland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Thomas Keane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Josué Martínez-Villacorta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Manuela Menchi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Kethi Reddy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Marc Rossello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ana L Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Daniel Vaughan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Vadim Zalunin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
5
Hönigschmid P, Bykova N, Schneider R, Ivankov D, Frishman D. Evolutionary Interplay between Symbiotic Relationships and Patterns of Signal Peptide Gain and Loss. Genome Biol Evol 2018; 10:928-938. [PMID: 29608732 PMCID: PMC5952966 DOI: 10.1093/gbe/evy049] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Accepted: 03/02/2018] [Indexed: 01/18/2023]
Abstract
Can orthologous proteins differ in terms of their ability to be secreted? To answer this question, we investigated the distribution of signal peptides within the orthologous groups of Enterobacterales. Parsimony analysis and sequence comparisons revealed a large number of signal peptide gain and loss events, in which signal peptides emerge or disappear in the course of evolution. Signal peptide losses prevail over gains, an effect which is especially pronounced in the transition from the free-living or commensal to the endosymbiotic lifestyle. The disproportionate decline in the number of signal peptide-containing proteins in endosymbionts cannot be explained by the overall reduction of their genomes. Signal peptides can be gained and lost either by acquisition/elimination of the corresponding N-terminal regions or by gradual accumulation of mutations. The evolutionary dynamics of signal peptides in bacterial proteins represents a powerful mechanism of functional diversification.
Affiliation(s)
- Peter Hönigschmid
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Nadya Bykova
- Institute for Information Transmission Problems (Kharkevich Institute), RAS, Moscow, Russia
- René Schneider
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Dmitry Ivankov
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Dmitrij Frishman
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany; Laboratory of Bioinformatics, RASA Research Center, St. Petersburg State Polytechnical University, Russia
6
Structure of the peptidoglycan polymerase RodA resolved by evolutionary coupling analysis. Nature 2018; 556:118-121. [PMID: 29590088 PMCID: PMC6035859 DOI: 10.1038/nature25985] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Received: 10/15/2017] [Accepted: 02/08/2018] [Indexed: 11/29/2022]
Abstract
The Shape, Elongation, Division, and Sporulation (“SEDS”) proteins are a large family of ubiquitous and essential transmembrane enzymes with critical roles in bacterial cell wall biology. The exact function of SEDS proteins was long enigmatic, but recent work [1–3] has revealed that the prototypical SEDS family member RodA is a peptidoglycan polymerase – a role previously attributed exclusively to members of the penicillin binding protein family [4]. This discovery has made RodA and other SEDS proteins promising targets for the development of next-generation antibiotics. However, little is known regarding the molecular basis for SEDS activity, and no structural data are available for RodA or any homolog thereof. Here, we report the crystal structure of Thermus thermophilus RodA at a resolution of 2.9 Å, determined using evolutionary covariance-based fold prediction to enable molecular replacement. The structure reveals a novel ten-pass transmembrane fold with large extracellular loops, one of which is partially disordered. The protein contains a highly conserved cavity in the transmembrane domain, reminiscent of ligand binding sites in transmembrane receptors. Mutagenesis experiments in Bacillus subtilis and Escherichia coli show that perturbation of this cavity abolishes RodA function both in vitro and in vivo, indicating it is catalytically essential. These results provide a framework for understanding bacterial cell wall synthesis and SEDS protein function.
7
Aarts E, Ederveen THA, Naaijen J, Zwiers MP, Boekhorst J, Timmerman HM, Smeekens SP, Netea MG, Buitelaar JK, Franke B, van Hijum SAFT, Arias Vasquez A. Gut microbiome in ADHD and its relation to neural reward anticipation. PLoS One 2017; 12:e0183509. [PMID: 28863139 PMCID: PMC5581161 DOI: 10.1371/journal.pone.0183509] [Citation(s) in RCA: 187] [Impact Index Per Article: 26.7] [Received: 03/29/2017] [Accepted: 08/04/2017] [Indexed: 12/18/2022]
Abstract
BACKGROUND Microorganisms in the human intestine (i.e. the gut microbiome) have an increasingly recognized impact on human health, including brain functioning. Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder associated with abnormalities in dopamine neurotransmission and deficits in reward processing and its underlying neuro-circuitry including the ventral striatum. The microbiome might contribute to ADHD etiology via the gut-brain axis. In this pilot study, we investigated potential differences in the microbiome between ADHD cases and undiagnosed controls, as well as its relation to neural reward processing.
METHODS We used 16S rRNA marker gene sequencing (16S) to identify bacterial taxa and their predicted gene functions in 19 ADHD and 77 control participants. Using functional magnetic resonance imaging (fMRI), we interrogated the effect of observed microbiome differences in neural reward responses in a subset of 28 participants, independent of diagnosis.
RESULTS For the first time, we describe gut microbial makeup of adolescents and adults diagnosed with ADHD. We found that the relative abundance of several bacterial taxa differed between cases and controls, albeit marginally significant. A nominal increase in the Bifidobacterium genus was observed in ADHD cases. In a hypothesis-driven approach, we found that the observed increase was linked to significantly enhanced 16S-based predicted bacterial gene functionality encoding cyclohexadienyl dehydratase in cases relative to controls. This enzyme is involved in the synthesis of phenylalanine, a precursor of dopamine. Increased relative abundance of this functionality was significantly associated with decreased ventral striatal fMRI responses during reward anticipation, independent of ADHD diagnosis and age.
CONCLUSIONS Our results show increases in gut microbiome predicted function of dopamine precursor synthesis between ADHD cases and controls. This increase in microbiome function relates to decreased neural responses to reward anticipation. Decreased neural reward anticipation constitutes one of the hallmarks of ADHD.
Affiliation(s)
- Esther Aarts
- Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Thomas H. A. Ederveen
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Jilly Naaijen
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Marcel P. Zwiers
- Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Jos Boekhorst
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- NIZO, Ede, The Netherlands
- Sanne P. Smeekens
- Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, The Netherlands
- Mihai G. Netea
- Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, Nijmegen, The Netherlands
- Jan K. Buitelaar
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Karakter Child and Adolescent Psychiatry University Centre, Nijmegen, The Netherlands
- Barbara Franke
- Department of Psychiatry, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Sacha A. F. T. van Hijum
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- NIZO, Ede, The Netherlands
- Alejandro Arias Vasquez
- Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Psychiatry, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, The Netherlands
8
Filiz E, Vatansever R, Ozyigit II, Uras ME, Sen U, Anjum NA, Pereira E. Genome-wide identification and expression profiling of EIL gene family in woody plant representative poplar (Populus trichocarpa). Arch Biochem Biophys 2017. [PMID: 28625764 DOI: 10.1016/j.abb.2017.06.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
This study aimed to improve current understanding of ethylene-insensitive 3-like (EIL) members, which are least explored in woody plants such as poplar (Populus trichocarpa Torr. & Gray). Herein, seven putative EIL members were identified in the P. trichocarpa genome and were roughly annotated either as EIN3-like sequences associated with the ethylene pathway or as EIL3-like sequences related to the sulfur (S) pathway. The motif-distribution pattern of the proteins also corroborated this annotation. They were distributed on six chromosomes (chr1, 3, 4 and 8-10) and were found to encode proteins of 509-662 residues with nuclear localization. The presence of the ethylene insensitive 3 (EIN3; PF04873) domain (covering the first 80-280 residues from the N-terminus) was confirmed by a Hidden Markov Model-based search. The first half of the EIL proteins (∼80-280 residues, including the EIN3 domain) was substantially conserved, whereas the second half (∼300-600 residues) was considerably diverged. Additionally, the first half of the proteins harbored acidic, proline-rich and glutamine-rich sites, supporting the importance of these regions in transcriptional activation and protein function. Moreover, the six segmental duplications and one tandem duplication identified indicated that the mutations were subject to negative (purifying) selection. Furthermore, expression profile analysis indicated the possibility of crosstalk between EIN3- and EIL3-like genes, and co-expression networks implicated their interactions with very diverse panels of biological molecules.
Affiliation(s)
- Ertugrul Filiz
- Duzce University, Department of Crop and Animal Production, Cilimli Vocational School, 81750, Cilimli, Duzce, Turkey.
- Recep Vatansever
- Marmara University, Faculty of Science and Arts, Department of Biology, 34722, Goztepe, Istanbul, Turkey
- Ibrahim Ilker Ozyigit
- Marmara University, Faculty of Science and Arts, Department of Biology, 34722, Goztepe, Istanbul, Turkey
- Mehmet Emin Uras
- Marmara University, Faculty of Science and Arts, Department of Biology, 34722, Goztepe, Istanbul, Turkey
- Ugur Sen
- Marmara University, Faculty of Science and Arts, Department of Biology, 34722, Goztepe, Istanbul, Turkey
- Naser A Anjum
- CESAM-Centre for Environmental & Marine Studies and Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal
- Eduarda Pereira
- CESAM-Centre for Environmental & Marine Studies and Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal
9
Fujisawa T, Narikawa R, Maeda SI, Watanabe S, Kanesaki Y, Kobayashi K, Nomata J, Hanaoka M, Watanabe M, Ehira S, Suzuki E, Awai K, Nakamura Y. CyanoBase: a large-scale update on its 20th anniversary. Nucleic Acids Res 2016; 45:D551-D554. [PMID: 27899668 PMCID: PMC5210588 DOI: 10.1093/nar/gkw1131] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Received: 09/15/2016] [Revised: 10/26/2016] [Accepted: 11/11/2016] [Indexed: 01/18/2023]
Abstract
The first cyanobacterial genome sequence was determined two decades ago, and CyanoBase (http://genome.microbedb.jp/cyanobase), the first database for cyanobacteria, was developed at the same time to allow this genomic information to be used more efficiently. Since then, CyanoBase has constantly been extended and has received several updates. Here, we describe a new large-scale update of the database, which coincides with its 20th anniversary. We have expanded the number of cyanobacterial genomic sequences from 39 to 376 species, comprising 86 complete and 290 draft genomes. We have also optimized the user interface for large genomic data to include the use of semantic web technologies and JBrowse, and have extended community-based reannotation resources through the reannotation of Synechocystis sp. PCC 6803 by the cyanobacterial research community. These updates have markedly improved CyanoBase, providing cyanobacterial genome annotations as references for cyanobacterial research.
Affiliation(s)
- Takatomo Fujisawa
- Center for Information Biology, National Institute of Genetics, Research Organization of Information and Systems, Yata, Mishima 411-8540, Japan
- Rei Narikawa
- Department of Biological Science, Faculty of Science, Shizuoka University, Suruga-ku, Shizuoka 422-8529, Japan
- Shin-Ichi Maeda
- Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya 464-8601 Japan
- Satoru Watanabe
- Department of Bioscience, Tokyo University of Agriculture, Tokyo, Japan
- Yu Kanesaki
- NODAI Genome Research Center, Tokyo University of Agriculture, Tokyo, Japan
- Koichi Kobayashi
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Jiro Nomata
- Laboratory for Chemistry and Life Science, Tokyo Institute of Technology, Nagatsuta 4259, Midori-ku, Yokohama 226-8503, Japan
- Mitsumasa Hanaoka
- Graduate School of Horticulture, Chiba University, 648 Matsudo, Matsudo, Chiba 271-8510 Japan
- Mai Watanabe
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Shigeki Ehira
- Graduate School of Science and Engineering, Tokyo Metropolitan University, 1-1 Minami Osawa, Hachioji, Tokyo 192-0397, Japan
- Eiji Suzuki
- Department of Biological Production, Faculty of Bioresource Sciences, Akita Prefectural University, Shimoshinjyo-Nakano, Akita 010-0195, Japan
- Koichiro Awai
- Department of Biological Science, Faculty of Science, Shizuoka University, Suruga-ku, Shizuoka 422-8529, Japan
- Yasukazu Nakamura
- Center for Information Biology, National Institute of Genetics, Research Organization of Information and Systems, Yata, Mishima 411-8540, Japan
10
Liu F, Si H, Wang C, Sun G, Zhou E, Chen C, Ma C. Molecular evolution of Wcor15 gene enhanced our understanding of the origin of A, B and D genomes in Triticum aestivum. Sci Rep 2016; 6:31706. [PMID: 27526862 PMCID: PMC4985644 DOI: 10.1038/srep31706] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/21/2016] [Accepted: 07/25/2016] [Indexed: 11/29/2022]
Abstract
Allohexaploid bread wheat originally derived from three closely related species carrying the A, B and D genomes. Although numerous studies have been performed to elucidate its origin and phylogeny, no consensus has been reached. In this study, we cloned and sequenced the genes Wcor15-2A, Wcor15-2B and Wcor15-2D in 23 diploid, 10 tetraploid and 106 hexaploid wheat varieties and analyzed their molecular evolution to reveal the origin of the A, B and D genomes in Triticum aestivum. Comparative analyses of sequences in diploid, tetraploid and hexaploid wheats suggest that T. urartu, Ae. speltoides and Ae. tauschii subsp. strangulata are most likely the donors of the Wcor15-2A, Wcor15-2B and Wcor15-2D loci in common wheat, respectively. The Wcor15 genes from subgenomes A and D were highly conserved, with no insertions or deletions of bases during the evolution of diploid, tetraploid and hexaploid wheat. The non-coding region of the Wcor15-2B gene from the B genome might have mutated during the first polyploidization from Ae. speltoides to tetraploid wheat; however, no change occurred in this gene during the second allopolyploidization from tetraploid to hexaploid. Comparison of the Wcor15 genes sheds light on the origin of the A, B and D genomes of common wheat.
Affiliation(s)
- Fangfang Liu
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Wheat Biology and Genetic Improvement on South Yellow & Huai River Valley, Ministry of Agriculture, Hefei 230036, China
- Hongqi Si
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Wheat Biology and Genetic Improvement on South Yellow & Huai River Valley, Ministry of Agriculture, Hefei 230036, China
- Chengcheng Wang
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Wheat Biology and Genetic Improvement on South Yellow & Huai River Valley, Ministry of Agriculture, Hefei 230036, China
- Genlou Sun
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China; Biology Department, Saint Mary's University, Halifax, NS, B3H 3C3 Canada
- Erting Zhou
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China
- Can Chen
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China
- Chuanxi Ma
- School of Agronomy, Anhui Agricultural University, Hefei 230036, China; Key Laboratory of Wheat Biology and Genetic Improvement on South Yellow & Huai River Valley, Ministry of Agriculture, Hefei 230036, China; National United Engineering Laboratory for Crop Stress Resistance Breeding, Hefei 230036, China; Anhui Key Laboratory of Crop Biology, Hefei 230036, China
11
Abstract
Web-based protein structure databases come in a wide variety of types and levels of information content. Those of the most general interest are the various atlases that describe each experimentally determined protein structure and provide useful links, analyses, and schematic diagrams relating to its 3D structure and biological function. Also of great interest are the databases that classify 3D structures by their folds, as these can reveal evolutionary relationships that may be hard to detect from sequence comparison alone. Related to these are the numerous servers that compare folds, which are particularly useful for newly solved structures, especially those of unknown function. Beyond these are a vast number of databases for the more specialized user, dealing with specific families, diseases, structural features, and so on.
Affiliation(s)
- Roman A Laskowski
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| |
|
12
|
Abstract
GenBank® is a comprehensive database of publicly available DNA sequences for 300,000 named organisms, more than 110,000 within the embryophyta, obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Nucleotide Archive (ENA) in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure and domain information, as well as the biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. GenBank usage scenarios ranging from local analyses of the data available via FTP to online analyses supported by the NCBI web-based tools are discussed. To access GenBank and its related retrieval and analysis services, go to the NCBI home page at www.ncbi.nlm.nih.gov.
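As an illustration of programmatic access through Entrez, the following minimal sketch builds an NCBI E-utilities `efetch` URL that would return a GenBank flat-file record. The helper name `genbank_efetch_url` is hypothetical, and the accession `U49845` is used purely as an example; the abstract itself does not describe this workflow, and no network request is made here.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (the programmatic interface to Entrez).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def genbank_efetch_url(accession: str) -> str:
    """Build an efetch URL for one nucleotide record in GenBank flat-file format.

    Hypothetical helper for illustration; it only assembles the URL.
    """
    params = {
        "db": "nuccore",    # Entrez nucleotide database
        "id": accession,    # accession number or GI
        "rettype": "gb",    # GenBank flat-file format
        "retmode": "text",  # plain-text output
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Example accession chosen for illustration only.
print(genbank_efetch_url("U49845"))
```

The resulting URL can be passed to any HTTP client; heavy users would normally also supply the identification parameters that NCBI requests for E-utilities traffic.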
Affiliation(s)
- Eric W Sayers
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 45, 45 Center Drive, Bethesda, MD, 20892, USA.
| | - Ilene Karsch-Mizrachi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 45, 45 Center Drive, Bethesda, MD, 20892, USA
| |
|
13
|
Kobayashi M, Ohyanagi H, Yano K. Databases for Solanaceae and Cucurbitaceae Research. Biotechnology in Agriculture and Forestry 2016. [DOI: 10.1007/978-3-662-48535-4_3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]
|
14
|
Shin J, Lee I. Co-Inheritance Analysis within the Domains of Life Substantially Improves Network Inference by Phylogenetic Profiling. PLoS One 2015; 10:e0139006. [PMID: 26394049 PMCID: PMC4578931 DOI: 10.1371/journal.pone.0139006] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 05/05/2015] [Accepted: 09/07/2015] [Indexed: 01/23/2023] Open
Abstract
Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life (Archaea, Bacteria, and Eukaryota), suggesting that pathways are inherited primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co-inheritance analysis within the domains of life will greatly potentiate the use of the expected onslaught of sequenced genomes in the study of molecular pathways in higher eukaryotes.
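The idea that a co-inheritance signal can be diluted when profiles are pooled across domains can be sketched with a toy example. The code below is only an illustration under stated assumptions: it scores two presence/absence profiles by mutual information, which is a common choice for phylogenetic profiling but is not confirmed by the abstract as the authors' measure, and the eight-species profiles and the function names are invented for the example.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (bits) between two equal-length presence/absence profiles."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_a[x] / n) * (count_b[y] / n)))
    return mi


def within_domain_score(profile_a, profile_b, domains, domain):
    """Score co-inheritance using only the reference species of one domain."""
    idx = [i for i, d in enumerate(domains) if d == domain]
    return mutual_information([profile_a[i] for i in idx],
                              [profile_b[i] for i in idx])


# Toy data (hypothetical): 4 bacterial and 4 eukaryotic reference species.
domains = ["Bacteria"] * 4 + ["Eukaryota"] * 4
gene_a = [1, 0, 1, 0, 1, 1, 0, 0]
gene_b = [1, 0, 1, 0, 1, 0, 1, 0]

pooled = mutual_information(gene_a, gene_b)
bacteria_only = within_domain_score(gene_a, gene_b, domains, "Bacteria")
eukaryota_only = within_domain_score(gene_a, gene_b, domains, "Eukaryota")
print(pooled, bacteria_only, eukaryota_only)
```

In this toy case the two genes are perfectly co-inherited among the bacterial species and unrelated among the eukaryotic ones, so the within-Bacteria score is higher than the score computed over all eight species.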
Affiliation(s)
- Junha Shin
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| |
|
15
|
Lappalainen I, Almeida-King J, Kumanduri V, Senf A, Spalding JD, Ur-Rehman S, Saunders G, Kandasamy J, Caccamo M, Leinonen R, Vaughan B, Laurent T, Rowland F, Marin-Garcia P, Barker J, Jokinen P, Torres AC, de Argila JR, Llobet OM, Medina I, Puy MS, Alberich M, de la Torre S, Navarro A, Paschall J, Flicek P. The European Genome-phenome Archive of human data consented for biomedical research. Nat Genet 2015; 47:692-5. [PMID: 26111507 PMCID: PMC5426533 DOI: 10.1038/ng.3312] [Citation(s) in RCA: 240] [Impact Index Per Article: 26.7] [Indexed: 12/15/2022]
Abstract
The European Genome-phenome Archive (EGA) is a permanent archive that promotes distribution and sharing of genetic and phenotype data consented for specific approved uses, but not fully open public distribution. The EGA follows strict protocols for information management, data storage, security and dissemination. Authorized access to the data is managed in partnership with the data-providing organizations. The EGA includes major reference data collections for human genetics research.
Affiliation(s)
- Ilkka Lappalainen
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Jeff Almeida-King
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Vasudev Kumanduri
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Alexander Senf
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - John Dylan Spalding
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Saif Ur-Rehman
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Gary Saunders
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Jag Kandasamy
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Mario Caccamo
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Brendan Vaughan
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Thomas Laurent
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Francis Rowland
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Pablo Marin-Garcia
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Jonathan Barker
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Petteri Jokinen
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | | | | | | | - Ignacio Medina
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | | | | | | | - Arcadi Navarro
- [1] Centre for Genomic Regulation, Barcelona, Spain. [2] Institute of Evolutionary Biology, Universitat Pompeu Fabra-Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain. [3] Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Justin Paschall
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| | - Paul Flicek
- European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK
| |
|
16
|
Frankish A, Uszczynska B, Ritchie GRS, Gonzalez JM, Pervouchine D, Petryszak R, Mudge JM, Fonseca N, Brazma A, Guigo R, Harrow J. Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction. BMC Genomics 2015; 16 Suppl 8:S2. [PMID: 26110515 PMCID: PMC4502323 DOI: 10.1186/1471-2164-16-s8-s2] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Indexed: 01/02/2023] Open
Abstract
Background: A vast amount of DNA variation is being identified by increasingly large-scale exome and genome sequencing projects. To be useful, variants require accurate functional annotation and a wide range of tools are available to this end. McCarthy et al. recently demonstrated the large differences in prediction of loss-of-function (LoF) variation when RefSeq and Ensembl transcripts are used for annotation, highlighting the importance of the reference transcripts on which variant functional annotation is based. Results: We describe a detailed analysis of the similarities and differences between the gene and transcript annotation in the GENCODE and RefSeq genesets. We demonstrate that the GENCODE Comprehensive set is richer in alternative splicing, novel CDSs, novel exons and has higher genomic coverage than RefSeq, while the GENCODE Basic set is very similar to RefSeq. Using RNAseq data we show that exons and introns unique to one geneset are expressed at a similar level to those common to both. We present evidence that the differences in gene annotation lead to large differences in variant annotation where GENCODE and RefSeq are used as reference transcripts, although this is predominantly confined to non-coding transcripts and UTR sequence, with at most ~30% of LoF variants annotated discordantly. We also describe an investigation of dominant transcript expression, showing that it both supports the utility of the GENCODE Basic set in providing a smaller set of more highly expressed transcripts and provides a useful, biologically-relevant filter for further reducing the complexity of the transcriptome. Conclusions: The reference transcripts selected for variant functional annotation do have a large effect on the outcome. The GENCODE Comprehensive transcripts contain more exons, have greater genomic coverage and capture many more variants than RefSeq in both genome and exome datasets, while the GENCODE Basic set shows a higher degree of concordance with RefSeq and has fewer unique features. We propose that the GENCODE Comprehensive set has great utility for the discovery of new variants with functional potential, while the GENCODE Basic set is more suitable for applications demanding less complex interpretation of functional variants.
|
17
|
Holliday GL, Bairoch A, Bagos PG, Chatonnet A, Craik DJ, Finn RD, Henrissat B, Landsman D, Manning G, Nagano N, O’Donovan C, Pruitt KD, Rawlings ND, Saier M, Sowdhamini R, Spedding M, Srinivasan N, Vriend G, Babbitt PC, Bateman A. Key challenges for the creation and maintenance of specialist protein resources. Proteins 2015; 83:1005-13. [PMID: 25820941 PMCID: PMC4446195 DOI: 10.1002/prot.24803] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/12/2015] [Revised: 03/06/2015] [Accepted: 03/20/2015] [Indexed: 11/12/2022]
Abstract
As the volume of data relating to proteins increases, researchers rely more and more on the analysis of published data, thus increasing the importance of good access to these data that vary from the supplemental material of individual articles, all the way to major reference databases with professional staff and long-term funding. Specialist protein resources fill an important middle ground, providing interactive web interfaces to their databases for a focused topic or family of proteins, using specialized approaches that are not feasible in the major reference databases. Many are labors of love, run by a single lab with little or no dedicated funding and there are many challenges to building and maintaining them. This perspective arose from a meeting of several specialist protein resources and major reference databases held at the Wellcome Trust Genome Campus (Cambridge, UK) on August 11 and 12, 2014. During this meeting some common key challenges involved in creating and maintaining such resources were discussed, along with various approaches to address them. In laying out these challenges, we aim to inform users about how these issues impact our resources and illustrate ways in which our working together could enhance their accuracy, currency, and overall value.
Affiliation(s)
- Gemma L Holliday
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, 94158
| | - Amos Bairoch
- SIB Swiss Institute of Bioinformatics, University of Geneva, Geneva, Switzerland
| | - Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, 35100, Greece
| | - Arnaud Chatonnet
- INRA, UMR866 Dynamique Musculaire et Métabolisme, Montpellier, F-34000, France
- Université Montpellier, Montpellier, F-34000, France
| | - David J Craik
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Bernard Henrissat
- Architecture et Fonction des Macromolécules Biologiques, CNRS, Aix-Marseille Université, Marseille, 13288, France
- Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, 20892
| | - Gerard Manning
- Department of Bioinformatics & Computational Biology, Genentech, 1 DNA Way, South San Francisco, California, 98010
| | - Nozomi Nagano
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, 135-0064, Japan
| | - Claire O’Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, 20892
| | - Neil D Rawlings
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Milton Saier
- Department of Molecular Biology, University of California at San Diego, La Jolla, California, 92093
| | - Ramanathan Sowdhamini
- National Centre for Biological Sciences, TIFR, GKVK Campus, Bellary Road, Bangalore, 560065, India
| | - Michael Spedding
- Chair NC-IUPHAR, Spedding Research Solutions SARL, 6 Rue Ampere, Le Vesinet, 78110, France
| | | | - Gert Vriend
- Centre for Molecular and Biomolecular Informatics (CMBI), Radboud University Medical Center, Geert Grooteplein Zuid 26-28, 6525 GA Nijmegen, The Netherlands
| | - Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, 94158
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| |
|
18
|
Hutchins JRA. What's that gene (or protein)? Online resources for exploring functions of genes, transcripts, and proteins. Mol Biol Cell 2015; 25:1187-201. [PMID: 24723265 PMCID: PMC3982986 DOI: 10.1091/mbc.e13-10-0602] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022] Open
Abstract
The genomic era has enabled research projects that use approaches including genome-scale screens, microarray analysis, next-generation sequencing, and mass spectrometry-based proteomics to discover genes and proteins involved in biological processes. Such methods generate data sets of gene, transcript, or protein hits that researchers wish to explore to understand their properties and functions and thus their possible roles in biological systems of interest. Recent years have seen a profusion of Internet-based resources to aid this process. This review takes the viewpoint of the curious biologist wishing to explore the properties of protein-coding genes and their products, identified using genome-based technologies. Ten key questions are asked about each hit, addressing functions, phenotypes, expression, evolutionary conservation, disease association, protein structure, interactors, posttranslational modifications, and inhibitors. Answers are provided by presenting the latest publicly available resources, together with methods for hit-specific and data set-wide information retrieval, suited to any genome-based analytical technique and experimental species. The utility of these resources is demonstrated for 20 factors regulating cell proliferation. Results obtained using some of these are discussed in more depth using the p53 tumor suppressor as an example. This flexible and universally applicable approach for characterizing experimental hits helps researchers to maximize the potential of their projects for biological discovery.
Affiliation(s)
- James R A Hutchins
- Institute of Human Genetics, Centre National de la Recherche Scientifique (CNRS), 34396 Montpellier, France
| |
|
19
|
Gray KA, Yates B, Seal RL, Wright MW, Bruford EA. Genenames.org: the HGNC resources in 2015. Nucleic Acids Res 2015; 43:D1079-85. [PMID: 25361968 PMCID: PMC4383909 DOI: 10.1093/nar/gku1071] [Citation(s) in RCA: 362] [Impact Index Per Article: 40.2] [Received: 09/15/2014] [Revised: 10/15/2014] [Accepted: 10/16/2014] [Indexed: 12/19/2022] Open
Abstract
The HUGO Gene Nomenclature Committee (HGNC) based at the European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. To date the HGNC have assigned over 39,000 gene names, representing an increase of over 5000 entries in the past two years. As well as increasing the size of our database, we have continued redesigning our website http://www.genenames.org and have modified, updated and improved many aspects of the site including a faster and more powerful search, a vastly improved HCOP tool and a REST service to increase the number of ways users can retrieve our data. This article provides an overview of our current online data and resources, and highlights the changes we have made in recent years.
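To make the REST service mentioned above concrete, here is a minimal sketch of how a client might prepare a lookup by approved symbol. It assumes the `fetch/<field>/<value>` path layout and the Solr-style `response.docs` JSON structure of the public HGNC REST service; the helper names `build_fetch_request` and `extract_docs` are hypothetical, and the sample payload is invented for illustration rather than copied from a live response.

```python
import urllib.request
from urllib.parse import quote

# Assumed base address of the HGNC REST service.
HGNC_REST_BASE = "https://rest.genenames.org"


def build_fetch_request(field: str, value: str) -> urllib.request.Request:
    """Prepare (but do not send) a fetch request asking for JSON output."""
    url = f"{HGNC_REST_BASE}/fetch/{quote(field)}/{quote(value)}"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def extract_docs(payload: dict) -> list:
    """Return the list of gene records from a decoded JSON payload.

    Assumes the records sit under payload["response"]["docs"].
    """
    return payload.get("response", {}).get("docs", [])


request = build_fetch_request("symbol", "BRAF")
print(request.full_url)

# Invented payload with the assumed shape, used instead of a network call.
sample = {"response": {"numFound": 1, "docs": [{"symbol": "BRAF"}]}}
print(extract_docs(sample))
```

Sending the prepared request with `urllib.request.urlopen(request)` and decoding the body with `json.load` would complete the round trip; that step is left out so the sketch runs offline.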
Affiliation(s)
- Kristian A Gray
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Bethan Yates
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ruth L Seal
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Mathew W Wright
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Elspeth A Bruford
- HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
20
|
Kodama Y, Mashima J, Kosuge T, Katayama T, Fujisawa T, Kaminuma E, Ogasawara O, Okubo K, Takagi T, Nakamura Y. The DDBJ Japanese Genotype-phenotype Archive for genetic and phenotypic human data. Nucleic Acids Res 2014; 43:D18-22. [PMID: 25477381 PMCID: PMC4383935 DOI: 10.1093/nar/gku1120] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 01/08/2023] Open
Abstract
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. DDBJ Center provides the JGA database system, which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper gives an overview of the JGA project, updates to the DDBJ databases, and services for data retrieval, analysis and integration.
Affiliation(s)
- Yuichi Kodama
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Jun Mashima
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Takehide Kosuge
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Toshiaki Katayama
- National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
| | - Takatomo Fujisawa
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Eli Kaminuma
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Osamu Ogasawara
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Kousaku Okubo
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| | - Toshihisa Takagi
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan; Database Center for Life Science, Chiba 277-0871, Japan
| | - Yasukazu Nakamura
- DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
| |
|
21
|
Silvester N, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Gibson R, Goodgame N, Ten Hoopen P, Kay S, Leinonen R, Li W, Liu X, Lopez R, Pakseresht N, Pallreddy S, Plaister S, Radhakrishnan R, Rossello M, Senf A, Smirnov D, Toribio AL, Vaughan D, Zalunin V, Cochrane G. Content discovery and retrieval services at the European Nucleotide Archive. Nucleic Acids Res 2014; 43:D23-9. [PMID: 25404130 PMCID: PMC4383942 DOI: 10.1093/nar/gku1129] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/26/2022] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.
Affiliation(s)
- Nicole Silvester
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Blaise Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Cerdeño-Tárraga
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Iain Cleland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Richard Gibson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Neil Goodgame
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Weizhong Li
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Swapna Pallreddy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sheila Plaister
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rajesh Radhakrishnan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marc Rossello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alexander Senf
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Luisa Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniel Vaughan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vadim Zalunin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
22
|
Montague E, Janko I, Stanberry L, Lee E, Choiniere J, Anderson N, Stewart E, Broomall W, Higdon R, Kolker N, Kolker E. Beyond protein expression, MOPED goes multi-omics. Nucleic Acids Res 2014; 43:D1145-51. [PMID: 25404128 PMCID: PMC4383969 DOI: 10.1093/nar/gku1175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/27/2022] Open
Abstract
MOPED (Multi-Omics Profiling Expression Database; http://moped.proteinspire.org) has transitioned from solely a protein expression database to a multi-omics resource for human and model organisms. Through a web-based interface, MOPED presents consistently processed data for gene, protein and pathway expression. To improve data quality, consistency and use, MOPED includes metadata detailing experimental design and analysis methods. The multi-omics data are integrated through direct links between genes and proteins and further connected to pathways and experiments. MOPED now contains over 5 million records, information for approximately 75 000 genes and 50 000 proteins from four organisms (human, mouse, worm, yeast). These records correspond to 670 unique combinations of experiment, condition, localization and tissue. MOPED includes the following new features: pathway expression, Pathway Details pages, experimental metadata checklists, experiment summary statistics and more advanced searching tools. Advanced searching enables querying for genes, proteins, experiments, pathways and keywords of interest. The system is enhanced with visualizations for comparing across different data types. In the future MOPED will expand the number of organisms, increase integration with pathways and provide connections to disease.
Affiliation(s)
- Elizabeth Montague
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Imre Janko
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Larissa Stanberry
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Elaine Lee
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- John Choiniere
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Nathaniel Anderson
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Elizabeth Stewart
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- William Broomall
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Roger Higdon
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Natali Kolker
- High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101
- Eugene Kolker
- Bioinformatics and High-Throughput Analysis Laboratory, Center for Developmental Therapeutics, Seattle Children's Research Institute, Seattle, WA, USA 98101; High-Throughput Analysis Core, Seattle Children's Research Institute, Seattle, WA, USA 98101; CDO Analytics, Seattle Children's, Seattle, WA, USA 98101; Data-Enabled Life Sciences Alliance (DELSA Global), Seattle, WA, USA 98101; Departments of Biomedical Informatics and Medical Education and Pediatrics, University of Washington, Seattle, WA, USA 98109; Department of Chemistry and Chemical Biology, College of Science, Northeastern University, Boston, MA 02115
|
23
|
Europe PMC: a full-text literature database for the life sciences and platform for innovation. Nucleic Acids Res 2014; 43:D1042-8. [PMID: 25378340 PMCID: PMC4383902 DOI: 10.1093/nar/gku1061] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Indexed: 01/25/2023]
Abstract
This article describes recent developments of Europe PMC (http://europepmc.org), the leading database for life science literature. Formerly known as UKPMC, the service was rebranded in November 2012 as Europe PMC to reflect the scope of the funding agencies that support it. Several new developments have enriched Europe PMC considerably since then. Europe PMC now offers RESTful web services to access both articles and grants, powerful search tools such as citation-count sort order and data citation features, a service to add publications to your ORCID, a variety of export formats, and an External Links service that enables any related resource to be linked from Europe PMC content.
Affiliation(s)
- The Europe PMC Consortium
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- The British Library, 96 Euston Road, London NW1 2DB, UK
- Mimas, Roscoe Building, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- National Centre for Text Mining, School of Computer Science, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
- To whom correspondence should be addressed: Johanna McEntyre. Tel: +44 1223 492 599; Fax: +44 1223 494 468
|
24
|
Petrov AI, Kay SJE, Gibson R, Kulesha E, Staines D, Bruford EA, Wright MW, Burge S, Finn RD, Kersey PJ, Cochrane G, Bateman A, Griffiths-Jones S, Harrow J, Chan PP, Lowe TM, Zwieb CW, Wower J, Williams KP, Hudson CM, Gutell R, Clark MB, Dinger M, Quek XC, Bujnicki JM, Chua NH, Liu J, Wang H, Skogerbø G, Zhao Y, Chen R, Zhu W, Cole JR, Chai B, Huang HD, Huang HY, Cherry JM, Hatzigeorgiou A, Pruitt KD. RNAcentral: an international database of ncRNA sequences. Nucleic Acids Res 2014; 43:D123-9. [PMID: 25352543 PMCID: PMC4384043 DOI: 10.1093/nar/gku991] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Indexed: 01/23/2023]
Abstract
The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.
|
25
|
Hopf TA, Schärfe CPI, Rodrigues JPGLM, Green AG, Kohlbacher O, Sander C, Bonvin AMJJ, Marks DS. Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife 2014; 3. [PMID: 25255213 PMCID: PMC4360534 DOI: 10.7554/elife.03430] [Citation(s) in RCA: 332] [Impact Index Per Article: 33.2] [Received: 05/21/2014] [Accepted: 09/23/2014] [Indexed: 12/24/2022]
Abstract
Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.
Affiliation(s)
- Thomas A Hopf
- Department of Systems Biology, Harvard University, Boston, United States
- João P G L M Rodrigues
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Anna G Green
- Department of Systems Biology, Harvard University, Boston, United States
- Oliver Kohlbacher
- Applied Bioinformatics, Quantitative Biology Center, University of Tübingen, Tübingen, Germany
- Chris Sander
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, United States
- Alexandre M J J Bonvin
- Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, Netherlands
- Debora S Marks
- Department of Systems Biology, Harvard University, Boston, United States
|
26
|
Kawano S, Watanabe T, Mizuguchi S, Araki N, Katayama T, Yamaguchi A. TogoTable: cross-database annotation system using the Resource Description Framework (RDF) data model. Nucleic Acids Res 2014; 42:W442-8. [PMID: 24829452 PMCID: PMC4086138 DOI: 10.1093/nar/gku403] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
TogoTable (http://togotable.dbcls.jp/) is a web tool that adds user-specified annotations to a table that a user uploads. Annotations are drawn from several biological databases that use the Resource Description Framework (RDF) data model. TogoTable uses database identifiers (IDs) in the table as a query key for searching. RDF data, which form a network called Linked Open Data (LOD), can be searched from SPARQL endpoints using a SPARQL query language. Because TogoTable uses RDF, it can integrate annotations from not only the reference database to which the IDs originally belong, but also externally linked databases via the LOD network. For example, annotations in the Protein Data Bank can be retrieved using GeneID through links provided by the UniProt RDF. Because RDF has been standardized by the World Wide Web Consortium, any database with annotations based on the RDF data model can be easily incorporated into this tool. We believe that TogoTable is a valuable Web tool, particularly for experimental biologists who need to process huge amounts of data such as high-throughput experimental output.
Affiliation(s)
- Shin Kawano
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Tsutomu Watanabe
- CrossEdge Systems Inc., 2-14-42 Higashi Yamada, Tsuzuki-ku, Yokohama, Kanagawa 224-0023, Japan
- Sohei Mizuguchi
- Department of Tumor Genetics and Biology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto, Kumamoto 860-8556, Japan
- Norie Araki
- Department of Tumor Genetics and Biology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto, Kumamoto 860-8556, Japan
- Toshiaki Katayama
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Atsuko Yamaguchi
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan
|