1
Callaghan J, Xu CH, Xin J, Cano MA, Riutta A, Zhou E, Juneja R, Yao Y, Narayan M, Hanspers K, Agrawal A, Pico AR, Wu C, Su AI. BioThings Explorer: a query engine for a federated knowledge graph of biomedical APIs. Bioinformatics 2023; 39:7273783. [PMID: 37707514 PMCID: PMC11015316 DOI: 10.1093/bioinformatics/btad570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 08/18/2023] [Accepted: 09/12/2023] [Indexed: 09/15/2023] Open
Abstract
SUMMARY Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. AVAILABILITY AND IMPLEMENTATION More information can be found at https://explorer.biothings.io and code is available at https://github.com/biothings/biothings_explorer.
Affiliation(s)
- Jackson Callaghan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Colleen H Xu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Jiwen Xin
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Marco Alvarado Cano
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Anders Riutta
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Eric Zhou
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Rohan Juneja
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Yao Yao
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Madhumita Narayan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Kristina Hanspers
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Ayushi Agrawal
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Alexander R Pico
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, CA 94158, United States
- Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
2
Glen AK, Ma C, Mendoza L, Womack F, Wood EC, Sinha M, Acevedo L, Kvarfordt LG, Peene RC, Liu S, Hoffman AS, Roach JC, Deutsch EW, Ramsey SA, Koslicki D. ARAX: a graph-based modular reasoning tool for translational biomedicine. Bioinformatics 2023; 39:7031241. [PMID: 36752514 PMCID: PMC10027432 DOI: 10.1093/bioinformatics/btad082] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 12/17/2022] [Accepted: 02/07/2023] [Indexed: 04/12/2023] Open
Abstract
MOTIVATION With the rapidly growing volume of knowledge and data in biomedical databases, improved methods for knowledge-graph-based computational reasoning are needed in order to answer translational questions. Previous efforts to solve such challenging computational reasoning problems have contributed tools and approaches, but progress has been hindered by the lack of an expressive analysis workflow language for translational reasoning and by the lack of a reasoning engine, supporting that language, that federates semantically integrated knowledge-bases. RESULTS We introduce ARAX, a new reasoning system for translational biomedicine that provides a web browser user interface and an application programming interface (API). ARAX enables users to encode translational biomedical questions and to integrate knowledge across sources to answer the user's query and facilitate exploration of results. For ARAX, we developed new approaches to query planning, knowledge-gathering, reasoning and result ranking and dynamically integrate knowledge providers for answering biomedical questions. To illustrate ARAX's application and utility in specific disease contexts, we present several use-case examples. AVAILABILITY AND IMPLEMENTATION The source code and technical documentation for building the ARAX server-side software and its built-in knowledge database are freely available online (https://github.com/RTXteam/RTX). We provide a hosted ARAX service with a web browser interface at arax.rtx.ai and a web API endpoint at arax.rtx.ai/api/arax/v1.3/ui/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luis Mendoza
- Institute for Systems Biology, Seattle, WA 98109, USA
- Finn Womack
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- E C Wood
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Meghamala Sinha
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Liliana Acevedo
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Lindsey G Kvarfordt
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Ross C Peene
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Shaopeng Liu
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- Andrew S Hoffman
- Interdisciplinary Hub for Digitalization and Society, Radboud University, Nijmegen 6500GL, The Netherlands
- Jared C Roach
- Institute for Systems Biology, Seattle, WA 98109, USA
3
Wood EC, Glen AK, Kvarfordt LG, Womack F, Acevedo L, Yoon TS, Ma C, Flores V, Sinha M, Chodpathumwan Y, Termehchy A, Roach JC, Mendoza L, Hoffman AS, Deutsch EW, Koslicki D, Ramsey SA. RTX-KG2: a system for building a semantically standardized knowledge graph for translational biomedicine. BMC Bioinformatics 2022; 23:400. [PMID: 36175836 PMCID: PMC9520835 DOI: 10.1186/s12859-022-04932-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/25/2022] [Accepted: 09/14/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Biomedical translational science is increasingly using computational reasoning on repositories of structured knowledge (such as UMLS, SemMedDB, ChEMBL, Reactome, DrugBank, and SMPDB) in order to facilitate discovery of new therapeutic targets and modalities. The NCATS Biomedical Data Translator project is working to federate autonomous reasoning agents and knowledge providers within a distributed system for answering translational questions. Within that project and the broader field, there is a need for a framework that can efficiently and reproducibly build an integrated, standards-compliant, and comprehensive biomedical knowledge graph that can be downloaded in standard serialized form or queried via a public application programming interface (API). RESULTS To create a knowledge provider system within the Translator project, we have developed RTX-KG2, an open-source software system for building (and hosting a web API for querying) a biomedical knowledge graph that uses an Extract-Transform-Load approach to integrate 70 knowledge sources (including the aforementioned core six sources) into a knowledge graph with provenance information including (where available) citations. The semantic layer and schema for RTX-KG2 follow the standard Biolink model to maximize interoperability. RTX-KG2 is currently being used by multiple Translator reasoning agents, both in its downloadable form and via its SmartAPI-registered interface. Serializations of RTX-KG2 are available for download in both the pre-canonicalized form and in canonicalized form (in which synonyms are merged). The current canonicalized version (KG2.7.3) of RTX-KG2 contains 6.4M nodes and 39.3M edges with a hierarchy of 77 relationship types from Biolink. CONCLUSION RTX-KG2 is the first knowledge graph that integrates UMLS, SemMedDB, ChEMBL, DrugBank, Reactome, SMPDB, and 64 additional knowledge sources within a knowledge graph that conforms to the Biolink standard for its semantic layer and schema.
RTX-KG2 is publicly available for querying via its API at arax.rtx.ai/api/rtxkg2/v1.2/openapi.json. The code to build RTX-KG2 is publicly available at github:RTXteam/RTX-KG2.
Affiliation(s)
- E C Wood
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Amy K Glen
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Lindsey G Kvarfordt
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Finn Womack
- Computer Science and Engineering, Penn State University, State College, PA, USA
- Liliana Acevedo
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Timothy S Yoon
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Chunyu Ma
- Huck Institutes of the Life Sciences, Penn State University, State College, PA, USA
- Veronica Flores
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Meghamala Sinha
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Arash Termehchy
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA
- Andrew S Hoffman
- Interdisciplinary Hub for Digitalization and Society, Radboud University, Nijmegen, The Netherlands
- David Koslicki
- Computer Science and Engineering, Penn State University, State College, PA, USA; Huck Institutes of the Life Sciences, Penn State University, State College, PA, USA; Department of Biology, Penn State University, State College, PA, USA
- Stephen A Ramsey
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA; Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
4
Evangelista JE, Clarke DJB, Xie Z, Lachmann A, Jeon M, Chen K, Jagodnik KM, Jenkins SL, Kuleshov MV, Wojciechowicz ML, Schürer SC, Medvedovic M, Ma'ayan A. SigCom LINCS: data and metadata search engine for a million gene expression signatures. Nucleic Acids Res 2022; 50:W697-W709. [PMID: 35524556 PMCID: PMC9252724 DOI: 10.1093/nar/gkac328] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 02/22/2022] [Revised: 04/04/2022] [Accepted: 04/20/2022] [Indexed: 12/13/2022] Open
Abstract
Millions of transcriptome samples were generated by the Library of Integrated Network-based Cellular Signatures (LINCS) program. When these data are processed into searchable signatures along with signatures extracted from Genotype-Tissue Expression (GTEx) and Gene Expression Omnibus (GEO), connections between drugs, genes, pathways and diseases can be illuminated. SigCom LINCS is a webserver that serves over a million gene expression signatures processed, analyzed, and visualized from LINCS, GTEx, and GEO. SigCom LINCS is built with Signature Commons, a cloud-agnostic skeleton Data Commons with a focus on serving searchable signatures. SigCom LINCS provides a rapid signature similarity search for mimickers and reversers given sets of up and down genes, a gene set, a single gene, or any search term. Additionally, users of SigCom LINCS can perform a metadata search to find and analyze subsets of signatures and find information about genes and drugs. SigCom LINCS is findable, accessible, interoperable, and reusable (FAIR) with metadata linked to standard ontologies and vocabularies. In addition, all the data and signatures within SigCom LINCS are available via a well-documented API. In summary, SigCom LINCS, available at https://maayanlab.cloud/sigcom-lincs, is a rich webserver resource for accelerating drug and target discovery in systems pharmacology.
Affiliation(s)
- John Erol Evangelista
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Daniel J B Clarke
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Zhuorui Xie
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Alexander Lachmann
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Minji Jeon
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Kerwin Chen
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Kathleen M Jagodnik
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Sherry L Jenkins
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Maxim V Kuleshov
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Megan L Wojciechowicz
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
- Stephan C Schürer
- Department of Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- Mario Medvedovic
- Department of Biomedical Informatics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Avi Ma'ayan
- Department of Pharmacological Sciences, Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
5
Martens M, Ammar A, Riutta A, Waagmeester A, Slenter D, Hanspers K, Miller RA, Digles D, Lopes E, Ehrhart F, Dupuis LJ, Winckers LA, Coort S, Willighagen EL, Evelo CT, Pico AR, Kutmon M. WikiPathways: connecting communities. Nucleic Acids Res 2021; 49:D613-D621. [PMID: 33211851 PMCID: PMC7779061 DOI: 10.1093/nar/gkaa1024] [Citation(s) in RCA: 430] [Impact Index Per Article: 143.3] [Received: 09/15/2020] [Revised: 10/13/2020] [Accepted: 10/19/2020] [Indexed: 12/17/2022] Open
Abstract
WikiPathways (https://www.wikipathways.org) is a biological pathway database known for its collaborative nature and open science approaches. With the core idea of the scientific community developing and curating biological knowledge in pathway models, WikiPathways lowers all barriers for accessing and using its content. Increasingly more content creators, initiatives, projects and tools have started using WikiPathways. Central in this growth and increased use of WikiPathways are the various communities that focus on particular subsets of molecular pathways such as for rare diseases and lipid metabolism. Knowledge from published pathway figures helps prioritize pathway development, using optical character and named entity recognition. We show the growth of WikiPathways over the last three years, highlight the new communities and collaborations of pathway authors and curators, and describe various technologies to connect to external resources and initiatives. The road toward a sustainable, community-driven pathway database goes through integration with other resources such as Wikidata and allowing more use, curation and redistribution of WikiPathways content.
Affiliation(s)
- Marvin Martens
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Ammar Ammar
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Anders Riutta
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Denise N Slenter
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Kristina Hanspers
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Ryan A. Miller
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Daniela Digles
- Department of Pharmaceutical Chemistry/Pharmacoinformatics Research Group, University of Vienna, 1090 Vienna, Austria
- Elisson N Lopes
- Instituto de Ciencias Biologicas, Departamento de Bioquimica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Friederike Ehrhart
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Lauren J Dupuis
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Laurent A Winckers
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Susan L Coort
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Egon L Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Chris T Evelo
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, 6229 EN Maastricht, the Netherlands
- Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Martina Kutmon
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, 6229 EN Maastricht, the Netherlands
6
Kuleshov MV, Diaz JEL, Flamholz ZN, Keenan AB, Lachmann A, Wojciechowicz ML, Cagan RL, Ma'ayan A. modEnrichr: a suite of gene set enrichment analysis tools for model organisms. Nucleic Acids Res 2020; 47:W183-W190. [PMID: 31069376 PMCID: PMC6602483 DOI: 10.1093/nar/gkz347] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 01/31/2019] [Revised: 04/23/2019] [Accepted: 04/25/2019] [Indexed: 12/11/2022] Open
Abstract
High-throughput experiments produce increasingly large datasets that are difficult to analyze and integrate. While most data integration approaches focus on aligning metadata, data integration can be achieved by abstracting experimental results into gene sets. Such gene sets can be made available for reuse through gene set enrichment analysis tools such as Enrichr. Enrichr currently only supports gene sets compiled from human and mouse, limiting accessibility for investigators that study other model organisms. modEnrichr is an expansion of Enrichr for four model organisms: fish, fly, worm and yeast. The gene set libraries within FishEnrichr, FlyEnrichr, WormEnrichr and YeastEnrichr are created from the Gene Ontology, mRNA expression profiles, GeneRIF, pathway databases, protein domain databases and other organism-specific resources. Additionally, libraries were created by predicting gene function from RNA-seq co-expression data processed uniformly from the gene expression omnibus for each organism. The modEnrichr suite of tools provides the ability to convert gene lists across species using an ortholog conversion tool that automatically detects the species. For complex analyses, modEnrichr provides API access that enables submitting batch queries. In summary, modEnrichr leverages existing model organism databases and other resources to facilitate comprehensive hypothesis generation through data integration.
Affiliation(s)
- Maxim V Kuleshov
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
- Jennifer E L Diaz
- Department of Cell, Developmental and Regenerative Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1020, New York, NY 10029, USA
- Zachary N Flamholz
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
- Alexandra B Keenan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
- Alexander Lachmann
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
- Megan L Wojciechowicz
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
- Ross L Cagan
- Department of Cell, Developmental and Regenerative Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1020, New York, NY 10029, USA
- Avi Ma'ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029, USA
7
Bizon C, Cox S, Balhoff J, Kebede Y, Wang P, Morton K, Fecho K, Tropsha A. ROBOKOP KG and KGB: Integrated Knowledge Graphs from Federated Sources. J Chem Inf Model 2019; 59:4968-4973. [DOI: 10.1021/acs.jcim.9b00683] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/28/2022]
Affiliation(s)
- Chris Bizon
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27517, United States
- Steven Cox
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27517, United States
- James Balhoff
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27517, United States
- Yaphet Kebede
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27517, United States
- Patrick Wang
- CoVar Applied Technologies, Durham, North Carolina 27701, United States
- Kenneth Morton
- CoVar Applied Technologies, Durham, North Carolina 27701, United States
- Karamarie Fecho
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27517, United States
- Alexander Tropsha
- School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
8
Wilkinson MD, Dumontier M, Sansone SA, Bonino da Silva Santos LO, Prieto M, Batista D, McQuilton P, Kuhn T, Rocca-Serra P, Crosas M, Schultes E. Evaluating FAIR maturity through a scalable, automated, community-governed framework. Sci Data 2019; 6:174. [PMID: 31541130 PMCID: PMC6754447 DOI: 10.1038/s41597-019-0184-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 05/28/2019] [Accepted: 08/22/2019] [Indexed: 11/16/2022] Open
Abstract
Transparent evaluations of FAIRness are increasingly required by a wide range of stakeholders, from scientists to publishers, funding agencies and policy makers. We propose a scalable, automatable framework to evaluate digital resources that encompasses measurable indicators, open source tools, and participation guidelines, which come together to accommodate domain relevant community-defined FAIR assessments. The components of the framework are: (1) Maturity Indicators – community-authored specifications that delimit a specific automatically-measurable FAIR behavior; (2) Compliance Tests – small Web apps that test digital resources against individual Maturity Indicators; and (3) the Evaluator, a Web application that registers, assembles, and applies community-relevant sets of Compliance Tests against a digital resource, and provides a detailed report about what a machine “sees” when it visits that resource. We discuss the technical and social considerations of FAIR assessments, and how this translates to our community-driven infrastructure. We then illustrate how the output of the Evaluator tool can serve as a roadmap to assist data stewards to incrementally and realistically improve the FAIRness of their resources.
Affiliation(s)
- Mark D Wilkinson
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
- Michel Dumontier
- Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, UK
- Luiz Olavo Bonino da Silva Santos
- GO FAIR International Support and Coordination Office, Leiden, The Netherlands; Leiden University Medical Center, Leiden, The Netherlands
- Mario Prieto
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
- Dominique Batista
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, UK
- Peter McQuilton
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, UK
- Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, UK
- Mercè Crosas
- Institute for Quantitative Social Science, Harvard University, Cambridge, USA
- Erik Schultes
- GO FAIR International Support and Coordination Office, Leiden, The Netherlands; Leiden University Medical Center, Leiden, The Netherlands; Leiden Center for Data Science, Leiden, The Netherlands
9
Xin J, Afrasiabi C, Lelong S, Adesara J, Tsueng G, Su AI, Wu C. Cross-linking BioThings APIs through JSON-LD to facilitate knowledge exploration. BMC Bioinformatics 2018; 19:30. [PMID: 29390967 PMCID: PMC5796402 DOI: 10.1186/s12859-018-2041-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/21/2017] [Accepted: 01/24/2018] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Application Programming Interfaces (APIs) are now widely used to distribute biological data. And many popular biological APIs developed by many different research teams have adopted JavaScript Object Notation (JSON) as their primary data format. While usage of a common data format offers significant advantages, that alone is not sufficient for rich integrative queries across APIs. RESULTS Here, we have implemented JSON for Linking Data (JSON-LD) technology on the BioThings APIs that we have developed, MyGene.info, MyVariant.info and MyChem.info. JSON-LD provides a standard way to add semantic context to the existing JSON data structure, for the purpose of enhancing the interoperability between APIs. We demonstrated several use cases that were facilitated by semantic annotations using JSON-LD, including simpler and more precise query capabilities as well as API cross-linking. CONCLUSIONS We believe that this pattern offers a generalizable solution for interoperability of APIs in the life sciences.
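The idea of adding semantic context to an existing JSON record can be illustrated with a minimal Python sketch. The record, the context URIs, and the helper names `add_context` and `fields_for_uri` are all hypothetical and chosen only for illustration; they are not the contexts or code actually used by the BioThings APIs, and the lookup below is a simplification, not a JSON-LD processor.

```python
import json

# Hypothetical gene record, loosely shaped like a JSON API response.
record = {"entrezgene": "1017", "symbol": "CDK2", "uniprot": "P24941"}

# Hypothetical context: maps plain JSON keys to URIs that say what kind of
# identifier each key holds. These URIs are illustrative assumptions.
context = {
    "entrezgene": "http://identifiers.org/ncbigene/",
    "uniprot": "http://identifiers.org/uniprot/",
}


def add_context(doc, ctx):
    """Return a copy of doc with a JSON-LD style "@context" key added."""
    return {"@context": ctx, **doc}


def fields_for_uri(doc, uri):
    """List the keys of doc whose context entry equals the given URI."""
    ctx = doc.get("@context", {})
    return [key for key, value in ctx.items() if value == uri and key in doc]


annotated = add_context(record, context)
print(json.dumps(annotated, indent=2))
```

With the context attached, a client can ask which field carries a given identifier type without knowing each API's key names, which is the kind of lookup that cross-linking between APIs relies on.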
Affiliation(s)
- Jiwen Xin
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Cyrus Afrasiabi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Sebastien Lelong
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Julee Adesara
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Ginger Tsueng
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Andrew I Su
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Chunlei Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA