1. Waagmeester A, Willighagen EL, Su AI, Kutmon M, Gayo JEL, Fernández-Álvarez D, Groom Q, Schaap PJ, Verhagen LM, Koehorst JJ. Author Correction: A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol 2023; 21:261. [PMID: 37974169; PMCID: PMC10655412; DOI: 10.1186/s12915-023-01764-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023] [Open Access]
Affiliation(s)
- Andrew I Su
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Martina Kutmon
  - NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Peter J Schaap
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands
- Jasper J Koehorst
  - Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands

2. Shafee T, Mietchen D, Lubiana T, Jemielniak D, Waagmeester A. Ten quick tips for editing Wikidata. PLoS Comput Biol 2023; 19:e1011235. [PMID: 37471307; PMCID: PMC10358883; DOI: 10.1371/journal.pcbi.1011235] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/22/2023] [Open Access]
Affiliation(s)
- Thomas Shafee
  - Swinburne University of Technology, Melbourne, Australia
- Daniel Mietchen
  - Ronin Institute, Montclair, New Jersey, United States of America
  - Institute for Globally Distributed Open Research and Education (IGDORE), Gothenburg, Sweden
  - Leibniz Institute for Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany
  - FIZ Karlsruhe–Leibniz Institute for Information Infrastructure, Berlin, Germany
- Tiago Lubiana
  - Ronin Institute, Montclair, New Jersey, United States of America
  - University of São Paulo, São Paulo, Brazil

3. Turki H, Jemielniak D, Hadj Taieb MA, Labra Gayo JE, Ben Aouicha M, Banat M, Shafee T, Prud’hommeaux E, Lubiana T, Das D, Mietchen D. Using logical constraints to validate statistical information about disease outbreaks in collaborative knowledge graphs: the case of COVID-19 epidemiology in Wikidata. PeerJ Comput Sci 2022; 8:e1085. [PMID: 36262159; PMCID: PMC9575845; DOI: 10.7717/peerj-cs.1085] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2022] [Accepted: 08/15/2022] [Indexed: 06/16/2023]
Abstract
Urgent global research demands real-time dissemination of precise data. Wikidata, a collaborative and openly licensed knowledge graph available in RDF format, provides an ideal forum for exchanging structured data that can be verified and consolidated using validation schemas and bot edits. In this research article, we catalog an automatable task set necessary to assess and validate the portion of Wikidata relating to the COVID-19 epidemiology. These tasks assess statistical data and are implemented in SPARQL, a query language for semantic databases. We demonstrate the efficiency of our methods for evaluating structured non-relational information on COVID-19 in Wikidata, and its applicability in collaborative ontologies and knowledge graphs more broadly. We show the advantages and limitations of our proposed approach by comparing it to the features of other methods for the validation of linked web data as revealed by previous research.
Affiliation(s)
- Houcemeddine Turki
  - Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
- Dariusz Jemielniak
  - Department of Management in Networked and Digital Societies, Kozminski University, Warsaw, Masovia, Poland
- Mohamed A. Hadj Taieb
  - Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
- Jose E. Labra Gayo
  - Web Semantics Oviedo (WESO) Research Group, University of Oviedo, Oviedo, Asturias, Spain
- Mohamed Ben Aouicha
  - Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
- Mus’ab Banat
  - Faculty of Medicine, Hashemite University, Zarqa, Jordan
- Thomas Shafee
  - La Trobe University, Melbourne, Victoria, Australia
  - Swinburne University of Technology, Melbourne, Victoria, Australia
- Eric Prud’hommeaux
  - World Wide Web Consortium, Cambridge, Massachusetts, United States of America
- Tiago Lubiana
  - Computational Systems Biology Laboratory, University of São Paulo, São Paulo, Brazil
- Diptanshu Das
  - Institute of Child Health (ICH), Kolkata, West Bengal, India
  - Medica Superspecialty Hospital, Kolkata, West Bengal, India
- Daniel Mietchen
  - Ronin Institute, Montclair, New Jersey, United States of America
  - Department of Evolutionary and Integrative Ecology, Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
  - School of Data Science, University of Virginia, Charlottesville, Virginia, United States
  - Institute for Globally Distributed Open Research and Education (IGDORE), Jena, Germany

4. Miller RA, Kutmon M, Bohler A, Waagmeester A, Evelo CT, Willighagen EL. Understanding signaling and metabolic paths using semantified and harmonized information about biological interactions. PLoS One 2022; 17:e0263057. [PMID: 35436299; PMCID: PMC9015122; DOI: 10.1371/journal.pone.0263057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2021] [Accepted: 01/11/2022] [Indexed: 11/22/2022] [Open Access]
Abstract
To grasp the complexity of biological processes, the biological knowledge is often translated into schematic diagrams of, for example, signalling and metabolic pathways. These pathway diagrams describe relevant connections between biological entities and incorporate domain knowledge in a visual format making it easier for humans to interpret. Still, these diagrams can be represented in machine readable formats, as done in the KEGG, Reactome, and WikiPathways databases. However, while humans are good at interpreting the message of the creators of diagrams, algorithms struggle when the diversity in drawing approaches increases. WikiPathways supports multiple drawing styles which need harmonizing to offer semantically enriched access. Particularly challenging, here, are the interactions between the biological entities that underlie the biological causality. These interactions provide information about the biological process (metabolic conversion, inhibition, etc.), the direction, and the participating entities. Availability of the interactions in a semantic and harmonized format is essential for searching the full network of biological interactions. We here study how the graphically-modelled biological knowledge in diagrams can be semantified and harmonized, and exemplify how the resulting data is used to programmatically answer biological questions. We find that we can translate graphically modelled knowledge to a sufficient degree into a semantic model and discuss some of the current limitations. We then use this to show that reproducible notebooks can be used to explore up- and downstream targets of MECP2 and to analyse the sphingolipid metabolism. Our results demonstrate that most of the graphical biological knowledge from WikiPathways is modelled into the semantic layer with the semantic information intact and connectivity information preserved. 
Being able to evaluate how biological elements affect each other is useful and allows, for example, the identification of up or downstream targets that will have a similar effect when modified.
Affiliation(s)
- Ryan A. Miller
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands
- Martina Kutmon
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Anwesha Bohler
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands
- Andra Waagmeester
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Micellio, Antwerp, Belgium
- Chris T. Evelo
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Egon L. Willighagen
  - Department of Bioinformatics (BiGCaT), NUTRIM, Maastricht University, Maastricht, The Netherlands

5. Chaudhri VK, Baru C, Chittar N, Dong XL, Genesereth M, Hendler J, Kalyanpur A, Lenat DB, Sequeda J, Vrandečić D, Wang K. Knowledge graphs: Introduction, history, and perspectives. AI Mag 2022. [DOI: 10.1002/aaai.12033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]

6. Willighagen E, Kutmon M, Martens M, Slenter D. BridgeDb and Wikidata: a powerful combination generating interoperable open research (BridgeDb). Research Ideas and Outcomes 2022. [DOI: 10.3897/rio.8.e83031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] [Open Access]
Abstract
Like humans have a unique social security number and different phone numbers from various providers, so do proteins and metabolites have a unique structure but different identifiers from various databases. BridgeDb is an interoperability platform that allows combining these databases, by matching database-specific identifiers. These matches are called identifier mappings, and they are indispensable when combining experimental (omics) data with knowledge in reference databases. BridgeDb takes care of this interoperability between gene, protein, metabolite, and other databases, thus enabling seamless integration of many knowledge bases and wet-lab results. Since databases get updated continuously, so should the Open Science BridgeDb project.

7. Fernandez-Álvarez D, Labra-Gayo JE, Gayo-Avello D. Automatic extraction of shapes using sheXer. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107975] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]

8. Zhan C, Zhang Y, Liu X, Wu R, Zhang K, Shi W, Shen L, Shen K, Fan X, Ye F, Shen B. MIKB: A manually curated and comprehensive knowledge base for myocardial infarction. Comput Struct Biotechnol J 2021; 19:6098-6107. [PMID: 34900127; PMCID: PMC8626632; DOI: 10.1016/j.csbj.2021.11.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/03/2021] [Revised: 11/11/2021] [Accepted: 11/11/2021] [Indexed: 02/08/2023] [Open Access]
Abstract
Myocardial infarction knowledge base (MIKB; http://www.sysbio.org.cn/mikb/; latest update: December 31, 2020) is an open-access and manually curated database dedicated to integrating knowledge about MI to improve the efficiency of translational MI research. MIKB is an updated and expanded version of our previous MI Risk Knowledge Base (MIRKB), which integrated MI-related risk factors and risk models for providing help in risk assessment or diagnostic prediction of MI. The updated MIRKB includes 9701 records with 2054 single factors, 209 combined factors, 243 risk models, 37 MI subtypes and 3406 interactions between single factors and MIs collected from 4817 research articles. The expanded functional module, i.e. MIGD, is a database including not only MI associated genetic variants, but also the other multi-omics factors and the annotations for their functional alterations. The goal of MIGD is to provide a multi-omics level understanding of the molecular pathogenesis of MI. MIGD includes 1782 omics factors, 28 MI subtypes and 2347 omics factor-MI interactions as well as 1253 genes and 6 chromosomal alterations collected from 2647 research articles. The functions of MI associated genes and their interaction with drugs were analyzed. MIKB will be continuously updated and optimized to provide precision and comprehensive knowledge for the study of heterogeneous and personalized MI.
Affiliation(s)
- Chaoying Zhan
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Yingbo Zhang
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
  - Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
- Xingyun Liu
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Rongrong Wu
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Ke Zhang
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Wenjing Shi
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Li Shen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Ke Shen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Xuemeng Fan
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Fei Ye
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China
- Bairong Shen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Sichuan 610212, China