1
Shin MG, Pico AR. Using published pathway figures in enrichment analysis and machine learning. BMC Genomics 2023; 24:713. [PMID: 38007419 PMCID: PMC10676589 DOI: 10.1186/s12864-023-09816-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 11/18/2023] [Indexed: 11/27/2023] [Open Access]
Abstract
Pathway Figure OCR (PFOCR) is a novel kind of pathway database approaching the breadth and depth of Gene Ontology while providing rich, mechanistic diagrams and direct literature support. Here, we highlight the utility of PFOCR in disease research in comparison with popular pathway databases through an assessment of disease coverage and analytical applications. In addition to common pathway analysis use cases, we present two advanced case studies demonstrating unique advantages of PFOCR in terms of cancer subtype and grade prediction analyses.
Affiliation(s)
- Min-Gyoung Shin
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
2
Shin MG, Pico A. Using Published Pathway Figures in Enrichment Analysis and Machine Learning. bioRxiv: the preprint server for biology 2023:2023.07.06.548037. [PMID: 37461614 PMCID: PMC10350053 DOI: 10.1101/2023.07.06.548037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Pathway Figure OCR (PFOCR) is a novel kind of pathway database approaching the breadth and depth of Gene Ontology while providing rich, mechanistic diagrams and direct literature support. PFOCR content is extracted from published pathway figures, which are currently emerging at a rate of 1000 new pathways each month. Here, we compare the pathway information contained in PFOCR against popular pathway databases with respect to overall and disease-specific coverage. In addition to common pathway analysis use cases, we present two advanced case studies demonstrating unique advantages of PFOCR in terms of cancer subtype and grade prediction analyses.
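As an illustration of the enrichment use case mentioned in this abstract, the following is a minimal sketch of over-representation analysis with a hypergeometric test. It is not the authors' pipeline: the function names (`hypergeom_sf`, `enrich`) and the toy gene sets are hypothetical, and real PFOCR gene sets would have to be loaded separately.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n belong to the pathway."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def enrich(query, pathways, background):
    """Rank pathways by over-representation of the query genes (smallest p first)."""
    background = set(background)
    query = set(query) & background
    results = []
    for name, genes in pathways.items():
        genes = set(genes) & background
        overlap = len(query & genes)
        p = hypergeom_sf(overlap, len(background), len(genes), len(query))
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Hypothetical toy data: 20 background genes and two made-up pathways.
    background = [f"g{i}" for i in range(20)]
    pathways = {"A": background[0:5], "B": background[10:15]}
    for name, overlap, p in enrich(["g0", "g1", "g2", "g15"], pathways, background):
        print(name, overlap, round(p, 4))
```

In practice the p-values would also need multiple-testing correction across all pathways tested, which this sketch omits.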
3
Smith HB, Drew A, Malloy JF, Walker SI. Seeding Biochemistry on Other Worlds: Enceladus as a Case Study. Astrobiology 2021; 21:177-190. [PMID: 33064954 PMCID: PMC7876360 DOI: 10.1089/ast.2019.2197] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/07/2019] [Accepted: 09/07/2020] [Indexed: 06/11/2023]
Abstract
The Solar System is becoming increasingly accessible to exploration by robotic missions to search for life. However, astrobiologists currently lack well-defined frameworks to quantitatively assess the chemical space accessible to life in these alien environments. Such frameworks will be critical for developing concrete predictions needed for future mission planning, both to determine the potential viability of life on other worlds and to anticipate the molecular biosignatures that life could produce. Here, we describe how uniting existing methods provides a framework to study the accessibility of biochemical space across diverse planetary environments. Our approach combines observational data from planetary missions with genomic data catalogued from across Earth and analyzed using computational methods from network theory. To demonstrate this, we use 307 biochemical networks generated from genomic data collected across Earth and "seed" these networks with molecules confirmed to be present on Saturn's moon Enceladus. By expanding through known biochemical reaction space starting from these seed compounds, we are able to determine which products of Earth's biochemistry are, in principle, reachable from compounds available in the environment on Enceladus, and how this varies across different examples of life from Earth (organisms, ecosystems, planetary-scale biochemistry). While we find that none of the 307 prokaryotes analyzed meet the threshold for viability, the reaction space covered by this process can provide a map of possible targets for detection of Earth-like life on Enceladus, as well as targets for synthetic biology approaches to seed life on Enceladus. 
In cases where biochemistry is not viable because key compounds are missing, we identify the environmental precursors required to make it viable, thus providing a set of compounds to prioritize for detection in future planetary exploration missions aimed at assessing the ability of Enceladus to sustain Earth-like life or directed panspermia.
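The seeding procedure described in this abstract can be sketched as an iterative network expansion: starting from the seed compounds, any reaction whose substrates are all reachable adds its products to the reachable set, until nothing changes. The sketch below is a simplified illustration, not the authors' code; the function names and the toy reaction list are hypothetical.

```python
def expand(seeds, reactions):
    """Return all compounds reachable from `seeds`.

    `reactions` is an iterable of (substrates, products) pairs; a reaction
    fires once every one of its substrates is already reachable.
    """
    reachable = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= reachable and not set(products) <= reachable:
                reachable |= set(products)
                changed = True
    return reachable


def missing_targets(seeds, reactions, targets):
    """Target compounds that cannot be reached from the seed set."""
    return set(targets) - expand(seeds, reactions)


if __name__ == "__main__":
    # Hypothetical toy network, not real biochemistry.
    reactions = [({"A", "B"}, {"C"}), ({"C"}, {"D"}), ({"E"}, {"F"})]
    print(sorted(expand({"A", "B"}, reactions)))
    print(sorted(missing_targets({"A", "B"}, reactions, {"D", "F"})))
```

A viability check in this style would compare the reachable set against a list of target compounds, which is the role `missing_targets` plays in this toy version.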
Affiliation(s)
- Harrison B. Smith
  - School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, USA
- Alexa Drew
  - School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, USA
- John F. Malloy
  - School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, USA
- Sara Imari Walker
  - School of Earth and Space Exploration, Arizona State University, Tempe, Arizona, USA
  - ASU-SFI Center for Biosocial Complex Systems, Arizona State University, Tempe, Arizona, USA
  - Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, Arizona, USA
  - Santa Fe Institute, Santa Fe, New Mexico, USA
4
Kniss DA, Summerfield TL. Progesterone Receptor Signaling Selectively Modulates Cytokine-Induced Global Gene Expression in Human Cervical Stromal Cells. Front Genet 2020; 11:883. [PMID: 33061933 PMCID: PMC7517718 DOI: 10.3389/fgene.2020.00883] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2019] [Accepted: 07/17/2020] [Indexed: 01/09/2023] [Open Access]
Abstract
Preterm birth (PTB) is the leading cause of morbidity and mortality in infants <1 year of age. Intrauterine inflammation is a hallmark of preterm and term parturition; however, this alone cannot fully explain the pathobiology of PTB. For example, the cervix undergoes a prolonged series of biochemical and biomechanical events, including extracellular matrix (ECM) remodeling and mechanochemical changes, culminating in ripening. Vaginal progesterone (P4) prophylaxis demonstrates great promise in preventing PTB in women with a short cervix (<25 mm). We used a primary culture model of human cervical stromal fibroblasts to investigate gene expression signatures in cells treated with interleukin-1β (IL-1β) in the presence or absence of P4 following 17β-estradiol (17β-E2) priming for 7–10 days. Microarrays were used to measure global gene expression in cells treated with cytokine or P4 alone or in combination, followed by validation of select transcripts by semiquantitative polymerase chain reactions (qRT-PCR). Primary/precursor (MIR) and mature microRNAs (miR) were quantified by microarray and NanoString® platforms, respectively, and validated by qRT-PCR. Differential gene expression was computed after data normalization, followed by pathway analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther, Gene Ontology (GO), and Ingenuity Pathway Analysis (IPA) upstream regulator algorithm tools. Treatment of fibroblasts with IL-1β alone resulted in the differential expression of 1432 transcripts (protein coding and non-coding), while P4 alone led to the differential expression of only 43 transcripts compared to untreated controls. Cytokines, chemokines, and their cognate receptors and prostaglandin endoperoxide synthase-2 (PTGS-2) were among the most highly upregulated transcripts following either IL-1β or IL-1β + P4.
Other prominent differentially expressed transcripts were those encoding ECM proteins, ECM-degrading enzymes, and enzymes involved in glycosaminoglycan (GAG) biosynthesis. We also detected differential expression of bradykinin receptor-1 and -2 transcripts, suggesting a role for the kallikrein–kinin system (prominent in tissue injury/remodeling) in cervical responses to cytokine and/or P4 challenge. Collectively, this global gene expression study provides a rich database to interrogate stromal fibroblasts in the setting of a proinflammatory and endocrine milieu that is relevant to cervical remodeling/ripening during preparation for parturition.
Affiliation(s)
- Douglas A Kniss
  - Division of Maternal-Fetal Medicine and Laboratory of Perinatal Research, Department of Obstetrics and Gynecology, The Ohio State University, College of Medicine and Wexner Medical Center, Columbus, OH, United States
  - Department of Biomedical Engineering, College of Engineering, The Ohio State University, Columbus, OH, United States
- Taryn L Summerfield
  - Division of Maternal-Fetal Medicine and Laboratory of Perinatal Research, Department of Obstetrics and Gynecology, The Ohio State University, College of Medicine and Wexner Medical Center, Columbus, OH, United States
5
Szczerba H, Dudziak K, Krawczyk M, Targoński Z. A Genomic Perspective on the Potential of Wild-Type Rumen Bacterium Enterobacter sp. LU1 as an Industrial Platform for Bio-Based Succinate Production. Int J Mol Sci 2020; 21:ijms21144835. [PMID: 32650546 PMCID: PMC7402333 DOI: 10.3390/ijms21144835] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2020] [Revised: 07/03/2020] [Accepted: 07/05/2020] [Indexed: 12/31/2022] [Open Access]
Abstract
Enterobacter sp. LU1, a wild-type bacterium originating from goat rumen, proved to be a potential succinic acid producer in previous studies. Here, the first complete genome of this strain was obtained and analyzed from a biotechnological perspective. A hybrid sequencing approach combining short (Illumina MiSeq) and long (ONT MinION) reads allowed us to obtain a single continuous chromosome 4,636,526 bp in size, with an average 55.6% GC content that lacked plasmids. A total of 4425 genes, including 4283 protein-coding genes, 25 ribosomal RNA (rRNA)-, 84 transfer RNA (tRNA)-, and 5 non-coding RNA (ncRNA)-encoding genes and 49 pseudogenes, were predicted. It has been shown that genes involved in transport and metabolism of carbohydrates and amino acids and the transcription process constitute the major group of genes, according to the Clusters of Orthologous Groups of proteins (COGs) database. The genetic ability of the LU1 strain to metabolize a wide range of industrially relevant carbon sources has been confirmed. The genome exploration indicated that Enterobacter sp. LU1 possesses all genes that encode the enzymes involved in the glycerol metabolism pathway. It has also been shown that succinate can be produced as an end product of fermentation via the reductive branch of the tricarboxylic acid cycle (TCA) and the glyoxylate pathway. The transport system involved in succinate excretion into the growth medium and the genes involved in the response to osmotic and oxidative stress have also been recognized. Furthermore, three intact prophage regions ~70.3 kb, ~20.9 kb, and ~49.8 kb in length, 45 genomic islands (GIs), and two clustered regularly interspaced short palindromic repeats (CRISPR) were recognized in the genome. Sequencing and genome analysis of Enterobacter sp. LU1 confirms many earlier results based on physiological experiments and provides insight into their genetic background. 
All of these findings illustrate that the LU1 strain has great potential to be an efficient platform for bio-based succinate production.
Affiliation(s)
- Hubert Szczerba
  - Department of Biotechnology, Microbiology and Human Nutrition, University of Life Sciences in Lublin, 20-704 Lublin, Poland
  - Correspondence: Tel.: +48-81-462-3402
- Karolina Dudziak
  - Chair and Department of Biochemistry and Molecular Biology, Medical University of Lublin, 20-093 Lublin, Poland
- Zdzisław Targoński
  - Department of Biotechnology, Microbiology and Human Nutrition, University of Life Sciences in Lublin, 20-704 Lublin, Poland
6
Tokuda M, Suzuki H, Yanagiya K, Yuki M, Inoue K, Ohkuma M, Kimbara K, Shintani M. Determination of Plasmid pSN1216-29 Host Range and the Similarity in Oligonucleotide Composition Between Plasmid and Host Chromosomes. Front Microbiol 2020; 11:1187. [PMID: 32582111 PMCID: PMC7296055 DOI: 10.3389/fmicb.2020.01187] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/14/2020] [Accepted: 05/11/2020] [Indexed: 12/17/2022] [Open Access]
Abstract
Plasmids are extrachromosomal DNA that can be horizontally transferred between different bacterial cells by conjugation. Horizontal gene transfer of plasmids can promote rapid evolution and adaptation of bacteria by imparting various traits involved in antibiotic resistance, virulence, and metabolism to their hosts. The host range of plasmids is an important feature for understanding how they spread in environmental microbial communities. Earlier bioinformatics studies have demonstrated that plasmids are likely to have similar oligonucleotide (k-mer) compositions to their host chromosomes and that evolutionary host ranges of plasmids could be predicted from this similarity. However, there are no complementary studies to assess the consistency between the predicted evolutionary host range and the experimentally determined replication/transfer host range of a plasmid. In the present study, the replication/transfer host range of a model plasmid, pSN1216-29, exogenously isolated from cow manure as a newly discovered self-transmissible plasmid, was experimentally determined within microbial communities extracted from soil and cow manure. Independently, the evolutionary host range of pSN1216-29 was predicted in silico from its oligonucleotide composition. The results showed that the oligonucleotide composition of plasmid pSN1216-29 was more similar to those of hosts (transconjugant genera) than to those of non-hosts (other genera). These findings can contribute to the understanding of how plasmids behave in microbial communities, and aid in designing appropriate plasmid vectors for different bacteria.
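The idea of comparing oligonucleotide (k-mer) composition between a plasmid and candidate host chromosomes can be illustrated with a short sketch. The abstract does not state which distance measure the authors used, so plain Euclidean distance between k-mer frequency vectors stands in here as an assumption; the sequences and function names are hypothetical toy examples.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_freqs(seq, k=3):
    """Relative frequencies of all 4**k DNA k-mers, in a fixed lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)  # ignores k-mers with non-ACGT symbols
    return [counts[m] / total if total else 0.0 for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Hypothetical toy sequences; real inputs would be full replicons.
    plasmid = kmer_freqs("ATATATATAT", k=2)
    candidates = {"AT-rich genus": "ATATATTATA", "GC-rich genus": "GCGCGCGCGC"}
    for name, chromosome in candidates.items():
        print(name, round(euclidean(plasmid, kmer_freqs(chromosome, k=2)), 3))
```

A smaller distance is read as a more similar composition, so candidate hosts can be ranked by it; a production analysis would typically also account for sequence length and both strands.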
Affiliation(s)
- Maho Tokuda
  - Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Haruo Suzuki
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan
  - Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
- Kosuke Yanagiya
  - Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Masahiro Yuki
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
- Kengo Inoue
  - Faculty of Agriculture, University of Miyazaki, Miyazaki, Japan
- Moriya Ohkuma
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
- Kazuhide Kimbara
  - Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
- Masaki Shintani
  - Applied Chemistry and Biochemical Engineering Course, Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
  - Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
  - Research Institute of Green Science and Technology, Shizuoka University, Shizuoka, Japan
7
Genome analysis of a wild rumen bacterium Enterobacter aerogenes LU2 - a novel bio-based succinic acid producer. Sci Rep 2020; 10:1986. [PMID: 32029880 PMCID: PMC7005296 DOI: 10.1038/s41598-020-58929-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/10/2019] [Accepted: 01/22/2020] [Indexed: 01/09/2023] [Open Access]
Abstract
Enterobacter aerogenes LU2 was isolated from cow rumen and recognized as a potential succinic acid producer in our previous study. Here, we present the first complete genome sequence of this new, wild strain and report its basic genetic features from a biotechnological perspective. The MinION single-molecule nanopore sequencer supported by the Illumina MiSeq platform yielded a circular 5,062,651 bp chromosome with a GC content of 55% that lacked plasmids. A total of 4,986 genes, including 4,741 protein-coding genes, 22 rRNA-, 86 tRNA-, and 10 ncRNA-encoding genes and 127 pseudogenes, were predicted. The genome features of the studied strain and other Enterobacteriaceae strains were compared. Functional studies on the genome content, metabolic pathways, growth, and carbon transport and utilization were performed. The genomic analysis indicates that succinic acid can be produced by the LU2 strain through the reductive branch of the tricarboxylic acid cycle (TCA) and the glyoxylate pathway. Antibiotic resistance genes were determined, and the potential for bacteriocin production was verified. Furthermore, one intact prophage region ~31.9 kb in length, 47 genomic islands (GIs) and many insertion sequences (ISs) as well as tandem repeats (TRs) were identified. No clustered regularly interspaced short palindromic repeats (CRISPRs) were found. Finally, comparative genome analysis with well-known succinic acid producers was conducted. The genome sequence illustrates that the LU2 strain has several desirable traits, which confirm its potential to be a highly efficient platform for the production of bulk chemicals.
8
Nakamura Y, Hirose S, Taniguchi Y, Moriya Y, Yamada T. Targeted enzyme gene re-positioning: A computational approach for discovering alternative bacterial enzymes for the synthesis of plant-specific secondary metabolites. Metab Eng Commun 2019; 9:e00102. [PMID: 31720217 PMCID: PMC6838473 DOI: 10.1016/j.mec.2019.e00102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/25/2019] [Revised: 08/19/2019] [Accepted: 09/08/2019] [Indexed: 12/27/2022] [Open Access]
Abstract
Plant-biosynthesised secondary metabolites are unique sources of pharmaceuticals, food additives, and flavourings, among other industrial uses. However, industrial production of these metabolites is difficult because of their structural complexity, their hazards, and their harm to the natural environment, so new methods to synthesise them are required. In this study, we developed a novel approach to identifying alternative bacterial enzymes to produce plant-biosynthesised secondary metabolites. Based on the similarity of enzymatic reactions, we searched for candidate bacterial genes encoding enzymes that could potentially replace the enzymes in plant-specific secondary metabolism reactions that are contained in the KEGG database (enzyme re-positioning). As a result, we discovered candidate bacterial alternative enzyme genes for 447 plant-specific secondary metabolic reactions. To validate our approach, we focused on the ability of an enzyme from Streptomyces coelicolor strain A3(2) to convert valencene to the grapefruit metabolite nootkatone, and confirmed its enzymatic activity by gas chromatography-mass spectrometry. This enzyme re-positioning approach may offer an entirely new way of screening enzymes that cannot be achieved by most other conventional methods. It is applicable to various other metabolites and may enable microbial production of compounds that are currently difficult to produce industrially.
Affiliation(s)
- Yuya Nakamura
  - School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8550, Japan
- Shuichi Hirose
  - NAGASE R&D Center, Nagase & Co., Ltd, Kobe High Tech Park 2-2-3 Murotani, Nishi-ku, Kobe, Hyogo, 651-2241, Japan
- Yuko Taniguchi
  - NAGASE R&D Center, Nagase & Co., Ltd, Kobe High Tech Park 2-2-3 Murotani, Nishi-ku, Kobe, Hyogo, 651-2241, Japan
- Yuki Moriya
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, 277-0871, Japan
- Takuji Yamada
  - School of Life Science and Technology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo, 152-8550, Japan
  - PRESTO, Japan Science and Technology Agency, 4-1-8 Honcho Kawaguchi, Saitama, 332-0012, Japan
  - Metabologenomics Inc, 246-2 Kakuganji, Tsuruoka, Yamagata, 997-0052, Japan
9
Katayama T, Kawashima S, Okamoto S, Moriya Y, Chiba H, Naito Y, Fujisawa T, Mori H, Takagi T. TogoGenome/TogoStanza: modularized Semantic Web genome database. Database: The Journal of Biological Databases and Curation 2019; 2019:5277251. [PMID: 30624651 PMCID: PMC6323299 DOI: 10.1093/database/bay132] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/08/2018] [Accepted: 11/26/2018] [Indexed: 11/12/2022]
Abstract
TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches.
All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as TogoStanza, which is a generic framework for rendering an information block as IFRAME/Web Components, which can, unlike several other monolithic databases, also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source codes on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.
Affiliation(s)
- Toshiaki Katayama
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Shuichi Kawashima
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Shinobu Okamoto
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Yuki Moriya
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Hirokazu Chiba
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Yuki Naito
  - Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Wakashiba, Kashiwa-shi, Chiba, Japan
- Hiroshi Mori
  - National Institute of Genetics, Mishima, Shizuoka, Japan
- Toshihisa Takagi
  - National Institute of Genetics, Mishima, Shizuoka, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Yayoi, Bunkyo-ku, Tokyo, Japan
10
Tarasova O, Poroikov V. HIV Resistance Prediction to Reverse Transcriptase Inhibitors: Focus on Open Data. Molecules 2018; 23:E956. [PMID: 29671808 PMCID: PMC6017644 DOI: 10.3390/molecules23040956] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/21/2018] [Revised: 04/16/2018] [Accepted: 04/17/2018] [Indexed: 12/16/2022] [Open Access]
Abstract
Research and development of new antiretroviral agents are in great demand due to issues with the safety and efficacy of existing antiretroviral drugs. HIV reverse transcriptase (RT) is an important target for HIV treatment. RT inhibitors targeting early stages of the virus-host interaction are of great interest to researchers. There is a large body of clinical and biochemical data on the relationships between the occurrence of single point mutations, and combinations of them, in the pol gene of HIV and the resistance of particular HIV variants to nucleoside and non-nucleoside reverse transcriptase inhibitors. The experimental data stored in databases of HIV sequences can be used to develop methods that predict HIV resistance from amino acid or nucleotide sequences. The data on the resistance of HIV sequences can further be used for (1) development of new antiretroviral agents with high potential for HIV inhibition and elimination and (2) optimization of antiretroviral therapy. In our communication, we focus on the data on RT sequences and HIV resistance that are available on the Internet. The experimental methods used to produce the data on HIV-1 resistance, and the known data on their concordance, are also discussed.
Affiliation(s)
- Olga Tarasova
  - Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya st., Moscow 119121, Russia
- Vladimir Poroikov
  - Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya st., Moscow 119121, Russia
11
Herdt O, Neumann A, Timmermann B, Heyd F. The cancer-associated U2AF35 470A>G (Q157R) mutation creates an in-frame alternative 5' splice site that impacts splicing regulation in Q157R patients. RNA (New York, N.Y.) 2017; 23:1796-1806. [PMID: 28893951 PMCID: PMC5689001 DOI: 10.1261/rna.061432.117] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/22/2017] [Accepted: 09/05/2017] [Indexed: 06/07/2023]
Abstract
Recent work has identified cancer-associated U2AF35 missense mutations in two zinc-finger (ZnF) domains, but little is known about Q157R/P substitutions within the second ZnF. Surprisingly, we find that the c.470A>G mutation not only leads to the Q157R substitution, but also creates an alternative 5' splice site (ss) resulting in the deletion of four amino acids (Q157Rdel). Q157P, Q157R, and Q157Rdel control alternative splicing of distinct groups of exons in cell culture and in human patients, suggesting that missplicing of different targets may contribute to cellular aberrations. Our data emphasize the importance of exploring the effects of missense mutations beyond altered protein sequence.
Affiliation(s)
- Olga Herdt
  - Freie Universität Berlin, Institute of Chemistry and Biochemistry, Laboratory of RNA Biochemistry, 14195 Berlin, Germany
- Alexander Neumann
  - Freie Universität Berlin, Institute of Chemistry and Biochemistry, Laboratory of RNA Biochemistry, 14195 Berlin, Germany
- Bernd Timmermann
  - Sequencing Core Facility, Max-Planck-Institute for Molecular Genetics, 14195 Berlin, Germany
- Florian Heyd
  - Freie Universität Berlin, Institute of Chemistry and Biochemistry, Laboratory of RNA Biochemistry, 14195 Berlin, Germany
12
Gough A, Vernetti L, Bergenthal L, Shun TY, Taylor DL. The Microphysiology Systems Database for Analyzing and Modeling Compound Interactions with Human and Animal Organ Models. Applied In Vitro Toxicology 2016; 2:103-117. [PMID: 28781990 DOI: 10.1089/aivt.2016.0011] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/21/2022]
Abstract
Microfluidic human organ models, microphysiology systems (MPS), are currently being developed as predictive models of drug safety and efficacy in humans. Designing and validating MPS as predictive of human safety liabilities requires safety data for a reference set of compounds, combined with in vitro data from the human organ models. To address this need, we have developed an internet database, the MPS database (MPS-Db), as a powerful platform for experimental design, data management, and analysis, and for combining experimental data with reference data to enable computational modeling. The present study demonstrates the capability of the MPS-Db in early safety testing using a human liver MPS to relate the effects of tolcapone and entacapone in the in vitro model to human in vivo effects. These two compounds were chosen as a representative pair of marketed drugs because they are structurally similar, have the same target, and were found safe or had an acceptable risk in preclinical and clinical trials, yet tolcapone induced unacceptable levels of hepatotoxicity while entacapone was found to be safe. Results demonstrate the utility of the MPS-Db as an essential resource for relating in vitro organ model data to the multiple biochemical, preclinical, and clinical data sources on in vivo drug effects.
Affiliation(s)
- Albert Gough
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, Pennsylvania
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Lawrence Vernetti
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, Pennsylvania
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Luke Bergenthal
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, Pennsylvania
- Tong Ying Shun
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, Pennsylvania
- D Lansing Taylor
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, Pennsylvania
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania
  - University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania
13
Fujiwara T, Yamamoto Y. Colil: a database and search service for citation contexts in the life sciences domain. J Biomed Semantics 2015; 6:38. [PMID: 26500753 PMCID: PMC4617487 DOI: 10.1186/s13326-015-0037-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 01/19/2015] [Accepted: 09/23/2015] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: To promote research activities in a particular research area, it is important to efficiently identify current research trends, advances, and issues in that area. Although review papers in the research area can generally suffice for this purpose, researchers are not always able to obtain reviews that cover the aspects they are interested in at the time they need them. Therefore, the utilization of the citation contexts of papers in a research area has been considered as another approach. However, there are few search services to retrieve citation contexts in the life sciences domain; furthermore, efficiently obtaining citation contexts is becoming difficult due to the large volume and rapid growth of life sciences papers. RESULTS: Here, we introduce the Colil (Comments on Literature in Literature) database to store citation contexts in the life sciences domain. By using the Resource Description Framework (RDF) and a newly compiled vocabulary, we built the Colil database and made it available through a SPARQL endpoint. In addition, we developed a web-based search service called Colil that searches for a cited paper in the Colil database and then returns a list of citation contexts for it along with papers relevant to it based on co-citations. The citation contexts in the Colil database were extracted from full-text papers of the PubMed Central Open Access Subset (PMC-OAS), which includes 545,147 papers indexed in PubMed. These papers are distributed across 3,171 journals and cite 5,136,741 unique papers that correspond to approximately 25% of total PubMed entries. CONCLUSIONS: By utilizing Colil, researchers can easily refer to a set of citation contexts and relevant papers based on co-citations for a target paper. Colil helps researchers comprehend life sciences papers in a research area, and thereby conduct their biological research, more efficiently.
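The co-citation relevance mentioned in this abstract can be illustrated with a small sketch: two papers are co-cited when the same citing paper references both, and papers co-cited most often with a target are treated as most relevant to it. This is only an illustration of the general idea, not Colil's implementation; the function name and the toy citation data are hypothetical.

```python
from collections import Counter


def cocited(target, citing_to_cited):
    """Papers co-cited with `target`, most frequently co-cited first.

    `citing_to_cited` maps each citing paper to the papers it cites.
    """
    counts = Counter()
    for cited in citing_to_cited.values():
        cited = set(cited)
        if target in cited:
            counts.update(cited - {target})
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical toy citation data (paper IDs are made up).
    citations = {"p1": ["A", "B", "C"], "p2": ["A", "B"], "p3": ["B", "C"]}
    print(cocited("A", citations))
```

Counting each citing paper once per pair keeps a paper that cites the target several times from inflating the score.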
Affiliation(s)
- Yasunori Yamamoto
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa-shi, Chiba 277-0871 Japan
|
14
|
Velloso H, Vialle RA, Ortega JM. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services. BMC Res Notes 2015; 8:206. [PMID: 26032494 PMCID: PMC4467627 DOI: 10.1186/s13104-015-1190-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/27/2014] [Accepted: 05/20/2015] [Indexed: 11/13/2022]
Abstract
BACKGROUND Bioinformaticians face a range of difficulties in getting locally-installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. RESULTS Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS mediates access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on ordinary computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that lists the registered applications and clients. CONCLUSIONS Applications registered with BOWS can be accessed from virtually any programming language through web services, or using standard Java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run computationally demanding applications directly from their machines.
Affiliation(s)
- Henrique Velloso
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
- Ricardo A Vialle
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
- J Miguel Ortega
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.
|
15
|
Abstract
To facilitate the integration and querying of genomics data, a number of generic data warehousing frameworks have been developed. They differ in their design and capabilities, as well as their intended audience. We provide a comprehensive and quantitative review of those genomic data warehousing frameworks in the context of large-scale systems biology. We reviewed in detail four genomic data warehouses (BioMart, BioXRT, InterMine and PathwayTools) freely available to the academic community. We quantified 20 aspects of the warehouses, covering the accuracy of their responses, their computational requirements and development efforts. Performance of the warehouses was evaluated under various hardware configurations to help laboratories optimize hardware expenses. Each aspect of the benchmark may be dynamically weighted by scientists using our online tool BenchDW (http://warehousebenchmark.fungalgenomics.ca/benchmark/) to build custom warehouse profiles and tailor our results to their specific needs.
|
16
|
Ferro M, Antonio EA, Souza W, Bacci M. ITScan: a web-based analysis tool for Internal Transcribed Spacer (ITS) sequences. BMC Res Notes 2014; 7:857. [PMID: 25430816 PMCID: PMC4258023 DOI: 10.1186/1756-0500-7-857] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/18/2014] [Accepted: 11/19/2014] [Indexed: 11/17/2022]
Abstract
Background Studies on fungal diversity and ecology aim to identify fungi and to investigate their interactions with each other and with the environment. DNA sequence-based tools are essential for these studies because they can speed up the identification process and access greater fungal diversity than traditional methods. The nucleotide sequence encoding for the internal transcribed spacer (ITS) of the nuclear ribosomal RNA has recently been proposed as a standard marker for molecular identification of fungi and evaluation of fungal diversity. However, the analysis of large sets of ITS sequences involves many programs and steps, which makes this task intensive and laborious. Findings We developed the web-based pipeline ITScan, which automates the analysis of fungal ITS sequences generated either by Sanger or Next Generation Sequencing (NGS) platforms. Validation was performed using datasets containing ca. 2,000 to 40,000 sequences each. Conclusions ITScan is an online and user-friendly automated pipeline for fungal diversity analysis and identification based on ITS sequences. It speeds up a process which would otherwise be repetitive and time-consuming for users. The ITScan tool and documentation are available at http://evol.rc.unesp.br:8083/itscan.
Affiliation(s)
- Milene Ferro
- Centro de Estudos de Insetos Sociais, Instituto de Biociências, UNESP - Univ Estadual Paulista, Rio Claro SP 13506-900, Brazil.
|
17
|
Gutman DA, Dunn WD, Cobb J, Stoner RM, Kalpathy-Cramer J, Erickson B. Web based tools for visualizing imaging data and development of XNATView, a zero footprint image viewer. Front Neuroinform 2014; 8:53. [PMID: 24904399 PMCID: PMC4034701 DOI: 10.3389/fninf.2014.00053] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/27/2014] [Accepted: 04/29/2014] [Indexed: 11/13/2022]
Abstract
Advances in web technologies now allow direct visualization of imaging data sets without necessitating the download of large file sets or the installation of software. This allows centralization of file storage and facilitates image review and analysis. XNATView is a light framework recently developed in our lab to visualize DICOM images stored in The Extensible Neuroimaging Archive Toolkit (XNAT). It consists of a PyXNAT-based framework to wrap around the REST application programming interface (API) and query the data in XNAT. XNATView was developed to simplify quality assurance, help organize imaging data, and facilitate data sharing for intra- and inter-laboratory collaborations. Its zero-footprint design allows the user to connect to XNAT from a web browser, navigate through projects, experiments, and subjects, and view DICOM images with accompanying metadata all within a single viewing instance.
Affiliation(s)
- David A Gutman
- Department of Biomedical Informatics, Emory University Atlanta, GA, USA
- William D Dunn
- Department of Biomedical Informatics, Emory University Atlanta, GA, USA
- Jake Cobb
- Georgia Institute of Technology, College of Computing Atlanta, GA, USA
- Richard M Stoner
- Department of Neurosciences, University of California San Diego School of Medicine La Jolla, CA, USA
- Jayashree Kalpathy-Cramer
- Harvard-MIT Division of Health Sciences and Technology, Martinos Center for Biomedical Imaging Charlestown, MA, USA
|
18
|
The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies. J Biomed Semantics 2013; 4:6. [PMID: 23398680 PMCID: PMC3598643 DOI: 10.1186/2041-1480-4-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 09/13/2012] [Accepted: 02/05/2013] [Indexed: 01/20/2023]
Abstract
BACKGROUND BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.
|
19
|
Wilkinson MD, Vandervalk B, McCarthy L. The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation. J Biomed Semantics 2011; 2:8. [PMID: 22024447 PMCID: PMC3212890 DOI: 10.1186/2041-1480-2-8] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Received: 01/28/2011] [Accepted: 10/24/2011] [Indexed: 11/22/2022]
Abstract
Background The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. Description SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. Conclusions SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. 
Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in static triple-stores, thus facilitating the intersection of Web services and Semantic Web technologies.
Affiliation(s)
- Mark D Wilkinson
- Department of Medical Genetics, Heart + Lung Institute at St. Paul's Hospital, University of British Columbia, Vancouver, BC, Canada.
|
20
|
Katayama T, Wilkinson MD, Vos R, Kawashima T, Kawashima S, Nakao M, Yamamoto Y, Chun HW, Yamaguchi A, Kawano S, Aerts J, Aoki-Kinoshita KF, Arakawa K, Aranda B, Bonnal RJ, Fernández JM, Fujisawa T, Gordon PM, Goto N, Haider S, Harris T, Hatakeyama T, Ho I, Itoh M, Kasprzyk A, Kido N, Kim YJ, Kinjo AR, Konishi F, Kovarskaya Y, von Kuster G, Labarga A, Limviphuvadh V, McCarthy L, Nakamura Y, Nam Y, Nishida K, Nishimura K, Nishizawa T, Ogishima S, Oinn T, Okamoto S, Okuda S, Ono K, Oshita K, Park KJ, Putnam N, Senger M, Severin J, Shigemoto Y, Sugawara H, Taylor J, Trelles O, Yamasaki C, Yamashita R, Satoh N, Takagi T. The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications. J Biomed Semantics 2011; 2:4. [PMID: 21806842 PMCID: PMC3170566 DOI: 10.1186/2041-1480-2-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/01/2011] [Accepted: 08/02/2011] [Indexed: 01/19/2023]
Abstract
Background The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
Affiliation(s)
- Toshiaki Katayama
- Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.
|
21
|
Kawano S, Ono H, Takagi T, Bono H. Tutorial videos of bioinformatics resources: online distribution trial in Japan named TogoTV. Brief Bioinform 2011; 13:258-68. [PMID: 21803786 PMCID: PMC3294242 DOI: 10.1093/bib/bbr039] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/19/2023]
Abstract
In recent years, biological web resources such as databases and tools have become more complex because of the enormous amounts of data generated in the field of life sciences. Traditional methods of distributing tutorials include publishing textbooks and posting web documents, but these static contents cannot adequately describe recent dynamic web services. Due to improvements in computer technology, it is now possible to create dynamic content such as video with minimal effort and low cost on most modern computers. The ease of creating and distributing video tutorials instead of static content improves accessibility for researchers, annotators and curators. This article focuses on online video repositories for educational and tutorial videos provided by resource developers and users. It also describes a project in Japan named TogoTV (http://togotv.dbcls.jp/en/) and discusses the production and distribution of high-quality tutorial videos, which would be useful to viewers, with examples. This article intends to stimulate and encourage researchers who develop and use databases and tools to distribute how-to videos as a tool to enhance product usability.
Affiliation(s)
- Shin Kawano
- Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
|
22
|
Nolin MA, Dumontier M, Belleau F, Corbeil J. Building an HIV data mashup using Bio2RDF. Brief Bioinform 2011; 13:98-106. [PMID: 22223742 DOI: 10.1093/bib/bbr003] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
We present an update to the Bio2RDF Linked Data Network, which now comprises ∼30 billion statements across 30 data sets. Significant changes to the framework include the accommodation of global mirrors, offline data processing and new search and integration services. The utility of this new network of knowledge is illustrated through a Bio2RDF-based mashup with microarray gene expression results and interaction data obtained from the HIV-1, Human Protein Interaction Database (HHPID) with respect to the infection of human macrophages with the human immunodeficiency virus type 1 (HIV-1).
Affiliation(s)
- Marc-Alexandre Nolin
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, Canada K1S 5B6
|
23
|
Goto N, Prins P, Nakao M, Bonnal R, Aerts J, Katayama T. BioRuby: bioinformatics software for the Ruby programming language. Bioinformatics 2010; 26:2617-9. [PMID: 20739307 PMCID: PMC2951089 DOI: 10.1093/bioinformatics/btq475] [Citation(s) in RCA: 111] [Impact Index Per Article: 7.9] [Indexed: 12/05/2022]
Abstract
Summary: The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. Availability: BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. Contact: katayama@bioruby.org
Affiliation(s)
- Naohisa Goto
- Department of Genome Informatics, Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Japan
|
24
|
Katayama T, Arakawa K, Nakao M, Ono K, Aoki-Kinoshita KF, Yamamoto Y, Yamaguchi A, Kawashima S, Chun HW, Aerts J, Aranda B, Barboza LH, Bonnal RJ, Bruskiewich R, Bryne JC, Fernández JM, Funahashi A, Gordon PM, Goto N, Groscurth A, Gutteridge A, Holland R, Kano Y, Kawas EA, Kerhornou A, Kibukawa E, Kinjo AR, Kuhn M, Lapp H, Lehvaslaiho H, Nakamura H, Nakamura Y, Nishizawa T, Nobata C, Noguchi T, Oinn TM, Okamoto S, Owen S, Pafilis E, Pocock M, Prins P, Ranzinger R, Reisinger F, Salwinski L, Schreiber M, Senger M, Shigemoto Y, Standley DM, Sugawara H, Tashiro T, Trelles O, Vos RA, Wilkinson MD, York W, Zmasek CM, Asai K, Takagi T. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*. J Biomed Semantics 2010; 1:8. [PMID: 20727200 PMCID: PMC2939597 DOI: 10.1186/2041-1480-1-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 11/06/2009] [Accepted: 08/21/2010] [Indexed: 11/30/2022]
Abstract
Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.
Affiliation(s)
- Toshiaki Katayama
- Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.
|