1
Scrivner O, Nguyen T, Ginda M, Simon K, Börner K. Interactive network visualization of opioid crisis research: a tool for reinforcing data linkage skills for public health policy researchers. Front Artif Intell 2024; 7:1208874. [PMID: 38646414 PMCID: PMC11026550 DOI: 10.3389/frai.2024.1208874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 01/29/2024] [Indexed: 04/23/2024]
Abstract
Background: Public health policy researchers face a persistent challenge in identifying and integrating relevant data, particularly in the context of the U.S. opioid crisis, where a comprehensive approach is crucial. Purpose: To meet this new workforce demand, health policy and health economics programs are increasingly introducing data analysis and data visualization skills. Such skills facilitate data integration and discovery by linking multiple resources. Common linking strategies include individual or aggregate level linking (e.g., patient identifiers) in primary clinical data and conceptual linking (e.g., healthcare workforce, state funding, burnout rates) in secondary data. Often, the combination of primary and secondary datasets is sought, requiring additional skills, for example, understanding metadata and constructing interlinkages. Methods: To help improve those skills, we developed a 2-step process using a scoping method to discover data and network visualization to interlink metadata. Results: We show how these new skills enable the discovery of relationships among data sources pertinent to public policy research related to the opioid overdose crisis and facilitate inquiry across heterogeneous data resources. In addition, our interactive network visualization introduces (1) a conceptual approach, drawing from recent systematic review studies and linked by the publications, and (2) an aggregate approach, constructed using publicly available datasets and linked through crosswalks. Conclusions: These novel metadata visualization techniques can be used as a teaching tool or a discovery method and can also be extended to other public policy domains.
Affiliation(s)
- Olga Scrivner
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
  - Rose-Hulman Institute of Technology, Terre Haute, IN, United States
- Thuy Nguyen
  - School of Public Health, University of Michigan, Ann Arbor, MI, United States
- Michael Ginda
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
- Kosali Simon
  - O'Neill School of Public and Environmental Affairs, Indiana University, Bloomington, IN, United States
  - National Bureau of Economic Research, Cambridge, MA, United States
- Katy Börner
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, United States
2
Altenhoff A, Bairoch A, Bansal P, Baratin D, Bastian F, Bolleman* J, Bridge A, Burdet F, Crameri K, Dauvillier J, Dessimoz C, Gehant S, Glover N, Gnodtke K, Hayes C, Ibberson M, Kriventseva E, Kuznetsov D, Frédérique L, Mehl F, Mendes de Farias* T, Michel PA, Moretti S, Morgat A, Österle S, Pagni M, Redaschi N, Robinson-Rechavi M, Samarasinghe K, Sima AC, Szklarczyk D, Topalov O, Touré V, Unni D, von Mering C, Wollbrett J, Zahn-Zabal* M, Zdobnov E. The SIB Swiss Institute of Bioinformatics Semantic Web of data. Nucleic Acids Res 2024; 52:D44-D51. [PMID: 37878411 PMCID: PMC10767860 DOI: 10.1093/nar/gkad902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/02/2023] [Accepted: 10/05/2023] [Indexed: 10/27/2023]
Abstract
The SIB Swiss Institute of Bioinformatics (https://www.sib.swiss/) is a federation of bioinformatics research and service groups. The international life science community in academia and industry has been accessing the freely available databases provided by SIB since its inception in 1998. In this paper we present the 11 databases which currently offer semantically enriched data in accordance with the FAIR principles (Findable, Accessible, Interoperable, Reusable), as well as the Swiss Personalized Health Network initiative (SPHN), which also employs this enrichment. The semantic enrichment facilitates the manipulation of large data sets from public databases and private data sets. Examples are provided to illustrate that the data from the SIB databases can be queried using precise criteria not only individually, but also across multiple databases, including a variety of non-SIB databases. Data manipulation, be it exploration, extraction, annotation, combination, or publication, is possible using the SPARQL query language. Providing documentation, tutorials and sample queries makes it easier to navigate this web of semantic data. Through this paper, the reader will discover how the existing SIB knowledge graphs can be leveraged to tackle the complex biological or clinical questions that are being addressed today.
3
Wang Y, Jiang Q, Geng Y, Hu Y, Tang Y, Li J, Zhang J, Mayer W, Liu S, Zhang HY, Yan X, Feng Z. SGMFQP: An ontology-based Swine Gut Microbiota Federated Query Platform. Methods 2023; 212:12-20. [PMID: 36858137 DOI: 10.1016/j.ymeth.2023.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 02/04/2023] [Accepted: 02/21/2023] [Indexed: 03/03/2023]
Abstract
Gut microbiota plays a crucial role in modulating pig development and health, and gut microbiota characteristics are associated with differences in feed efficiency. To answer open questions in feed efficiency analysis, biologists seek to retrieve information across multiple heterogeneous data sources. However, this is error-prone and time-consuming work since the queries can involve a sequence of multiple sub-queries over several databases. We present an implementation of an ontology-based Swine Gut Microbiota Federated Query Platform (SGMFQP) that provides a convenient, automated, and efficient query service about swine feeding and gut microbiota. The system is constructed based on a domain-specific Swine Gut Microbiota Ontology (SGMO), which facilitates the construction of queries independent of the actual organization of the data in the individual sources. This process is supported by a template-based query interface. A Datalog+-based federated query engine transforms the queries into sub-queries tailored for each individual data source, and an automated workflow orchestration mechanism executes the queries in each source database and consolidates the results. The efficiency of the system is demonstrated on several swine feeding scenarios.
Affiliation(s)
- Ying Wang
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan 430070, China; College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Qin Jiang
  - College of Animal Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yilin Geng
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Yuren Hu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Yue Tang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Jixiang Li
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Junmei Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Wolfgang Mayer
  - Industrial AI Research Centre, University of South Australia, Mawson Lakes, SA 5095, Australia
- Shanmei Liu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Hong-Yu Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan 430070, China; Key Laboratory of Smart Farming for Agricultural Animals, Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan 430070, China
- Xianghua Yan
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan 430070, China; College of Animal Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Zaiwen Feng
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan 430070, China; College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan 430070, China; Key Laboratory of Smart Farming for Agricultural Animals, Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan 430070, China; Macro Agricultural Research Institute, Huazhong Agricultural University, Wuhan 430070, China
4
Mendes de Farias T, Wollbrett J, Robinson-Rechavi M, Bastian F. Lessons learned to boost a bioinformatics knowledge base reusability, the Bgee experience. Gigascience 2023; 12:giad058. [PMID: 37589308 PMCID: PMC10433096 DOI: 10.1093/gigascience/giad058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/30/2023] [Accepted: 07/07/2023] [Indexed: 08/18/2023]
Abstract
BACKGROUND: Enhancing interoperability of bioinformatics knowledge bases is a high-priority requirement to maximize data reusability and thus increase their utility, such as the return on investment for biomedical research. A knowledge base may provide useful information for life scientists and other knowledge bases, but it only acquires exchange value once the knowledge base is (re)used, and without interoperability, the utility lies dormant. RESULTS: In this article, we discuss several approaches to boost interoperability depending on the interoperable parts. The findings are driven by several real-world scenario examples that were mostly implemented by Bgee, a well-established gene expression knowledge base. To better justify that the findings are transferable, for each Bgee interoperability experience, we also highlight similar implementations by major bioinformatics knowledge bases. Moreover, we discuss ten general main lessons learned. These lessons can be applied in the context of any bioinformatics knowledge base to foster data reusability. CONCLUSIONS: This work provides pragmatic methods and transferable skills to promote reusability of bioinformatics knowledge bases by focusing on interoperability.
Affiliation(s)
- Tarcisio Mendes de Farias
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Julien Wollbrett
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Marc Robinson-Rechavi
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
- Frederic Bastian
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland
5
Sima AC, Mendes de Farias T, Anisimova M, Dessimoz C, Robinson-Rechavi M, Zbinden E, Stockinger K. Bio-SODA UX: enabling natural language question answering over knowledge graphs with user disambiguation. Distributed and Parallel Databases 2022; 40:409-440. [PMID: 36097541 PMCID: PMC9458692 DOI: 10.1007/s10619-022-07414-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
The problem of natural language processing over structured data has become a growing research field, both within the relational database and the Semantic Web community, with significant efforts involved in question answering over knowledge graphs (KGQA). However, many of these approaches are either specifically targeted at open-domain question answering using DBpedia, or require large training datasets to translate a natural language question to SPARQL in order to query the knowledge graph. Hence, these approaches often cannot be applied directly to complex scientific datasets where no prior training data is available. In this paper, we focus on the challenges of natural language processing over knowledge graphs of scientific datasets. In particular, we introduce Bio-SODA, a natural language processing engine that does not require training data in the form of question-answer pairs for generating SPARQL queries. Bio-SODA uses a generic graph-based approach for translating user questions to a ranked list of SPARQL candidate queries. Furthermore, Bio-SODA uses a novel ranking algorithm that includes node centrality as a measure of relevance for selecting the best SPARQL candidate query. Our experiments with real-world datasets across several scientific domains, including the official bioinformatics Question Answering over Linked Data (QALD) challenge, as well as the CORDIS dataset of European projects, show that Bio-SODA outperforms publicly available KGQA systems by an F1-score of at least 20% and by an even higher factor on more complex bioinformatics datasets. Finally, we introduce Bio-SODA UX, a graphical user interface designed to assist users in the exploration of large knowledge graphs and in dynamically disambiguating natural language questions that target the data available in these graphs.
Affiliation(s)
- Tarcisio Mendes de Farias
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Maria Anisimova
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - ZHAW Zurich University of Applied Sciences, Zurich, Switzerland
- Christophe Dessimoz
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Department of Genetics, Evolution, and Environment, University College London, London, UK
  - Department of Computer Science, University College London, London, UK
- Marc Robinson-Rechavi
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Erich Zbinden
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - ZHAW Zurich University of Applied Sciences, Zurich, Switzerland
- Kurt Stockinger
  - ZHAW Zurich University of Applied Sciences, Zurich, Switzerland
6
McGlinn K, Rutherford MA, Gisslander K, Hederman L, Little MA, O'Sullivan D. FAIRVASC: A semantic web approach to rare disease registry integration. Comput Biol Med 2022; 145:105313. [DOI: 10.1016/j.compbiomed.2022.105313] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/21/2021] [Revised: 02/03/2022] [Accepted: 02/08/2022] [Indexed: 11/26/2022]
7
Lardos A, Aghaebrahimian A, Koroleva A, Sidorova J, Wolfram E, Anisimova M, Gil M. Computational Literature-based Discovery for Natural Products Research: Current State and Future Prospects. Frontiers in Bioinformatics 2022; 2:827207. [PMID: 36304281 PMCID: PMC9580913 DOI: 10.3389/fbinf.2022.827207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 02/28/2022] [Indexed: 11/21/2022]
Abstract
Literature-based discovery (LBD) mines existing literature in order to generate new hypotheses by finding links between previously disconnected pieces of knowledge. Although automated LBD systems are becoming widespread and indispensable in a wide variety of knowledge domains, little has been done to introduce LBD to the field of natural products research. Despite growing knowledge in the natural product domain, most of the accumulated information is found in detached data pools. LBD can facilitate better contextualization and exploitation of this wealth of data, for example by formulating new hypotheses for natural product research, especially in the context of drug discovery and development. Moreover, automated LBD systems promise to accelerate the currently tedious and expensive process of lead identification, optimization, and development. Focusing on natural product research, we briefly reflect on the development of automated LBD and summarize its methods and principal data sources. In a thorough review of published use cases of LBD in the biomedical domain, we highlight the immense potential of this data mining approach for natural product research, especially in the context of drug discovery or repurposing, mode of action, as well as drug or substance interactions. Most of the 91 natural product-related discoveries in our sample of reported use cases of LBD were addressed to a computer science audience. Therefore, it is the wider goal of this review to introduce automated LBD to researchers who work with natural products and to facilitate the dialogue between this community and the developers of automated LBD systems.
Affiliation(s)
- Andreas Lardos
  - Natural Product Chemistry and Phytopharmacy Research Group, Institute of Chemistry and Biotechnology, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
  - *Correspondence: Andreas Lardos,
- Ahmad Aghaebrahimian
  - Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Anna Koroleva
  - Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Julia Sidorova
  - Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid, Madrid, Spain
- Evelyn Wolfram
  - Natural Product Chemistry and Phytopharmacy Research Group, Institute of Chemistry and Biotechnology, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
- Maria Anisimova
  - Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Manuel Gil
  - Institute of Applied Simulation, School of Life Sciences and Facility Management, Zürich University of Applied Sciences (ZHAW), Waedenswil, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
8
Calvanese D, Lanti D, Mendes De Farias T, Mosca A, Xiao G. Accessing scientific data through knowledge graphs with Ontop. Patterns (New York, N.Y.) 2021; 2:100346. [PMID: 34693372 PMCID: PMC8515008 DOI: 10.1016/j.patter.2021.100346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
In this tutorial, we learn how to set up and exploit the virtual knowledge graph (VKG) approach to access data stored in relational legacy systems and to enrich such data with domain knowledge coming from different heterogeneous (biomedical) resources. The VKG approach is based on an ontology that describes a domain of interest in terms of a vocabulary familiar to the user and exposes a high-level conceptual view of the data. Users can access the data by exploiting the conceptual view, and in this way they do not need to be aware of low-level storage details. They can easily integrate ontologies coming from different sources and can obtain richer answers thanks to the interaction between data and domain knowledge.
Affiliation(s)
- Diego Calvanese
  - Faculty of Computer Science, Free University of Bozen-Bolzano, 39100 Bolzano, Italy; Department of Computing Science, Umeå University, 901 87 Umeå, Sweden; Ontopic S.R.L., 39100 Bolzano, Italy
- Davide Lanti
  - Faculty of Computer Science, Free University of Bozen-Bolzano, 39100 Bolzano, Italy
- Tarcisio Mendes De Farias
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
- Alessandro Mosca
  - Faculty of Computer Science, Free University of Bozen-Bolzano, 39100 Bolzano, Italy
- Guohui Xiao
  - Faculty of Computer Science, Free University of Bozen-Bolzano, 39100 Bolzano, Italy; Ontopic S.R.L., 39100 Bolzano, Italy
9
Linard B, Ebersberger I, McGlynn SE, Glover N, Mochizuki T, Patricio M, Lecompte O, Nevers Y, Thomas PD, Gabaldón T, Sonnhammer E, Dessimoz C, Uchiyama I. Ten Years of Collaborative Progress in the Quest for Orthologs. Mol Biol Evol 2021; 38:3033-3045. [PMID: 33822172 PMCID: PMC8321534 DOI: 10.1093/molbev/msab098] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/27/2020] [Revised: 02/07/2021] [Accepted: 04/01/2021] [Indexed: 12/19/2022]
Abstract
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, determining whether two genes directly descended from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.
Affiliation(s)
- Benjamin Linard
  - LIRMM, University of Montpellier, CNRS, Montpellier, France; SPYGEN, Le Bourget-du-Lac, France
- Ingo Ebersberger
  - Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany; Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, Germany; LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt, Germany
- Shawn E McGlynn
  - Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan; Blue Marble Space Institute of Science, Seattle, WA, USA
- Natasha Glover
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Tomohiro Mochizuki
  - Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Mateus Patricio
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Odile Lecompte
  - Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Yannis Nevers
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Paul D Thomas
  - Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Toni Gabaldón
  - Barcelona Supercomputing Centre (BSC-CNS), Jordi Girona, Barcelona, Spain; Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Erik Sonnhammer
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Department of Computer Science, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ikuo Uchiyama
  - Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
10
Kellman BP, Lewis NE. Big-Data Glycomics: Tools to Connect Glycan Biosynthesis to Extracellular Communication. Trends Biochem Sci 2021; 46:284-300. [PMID: 33349503 PMCID: PMC7954846 DOI: 10.1016/j.tibs.2020.10.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 01/04/2020] [Revised: 10/05/2020] [Accepted: 10/22/2020] [Indexed: 12/12/2022]
Abstract
Characteristically, cells must sense and respond to environmental cues. Despite the importance of cell-cell communication, our understanding remains limited and often lacks glycans. Glycans decorate proteins and cell membranes at the cell-environment interface, and modulate intercellular communication, from development to pathogenesis. Providing further challenges, glycan biosynthesis and cellular behavior are co-regulating systems. Here, we discuss how glycosylation contributes to extracellular responses and signaling. We further organize approaches for disentangling the roles of glycans in multicellular interactions using newly available datasets and tools, including glycan biosynthesis models, omics datasets, and systems-level analyses. Thus, emerging tools in big data analytics and systems biology are facilitating novel insights on glycans and their relationship with multicellular behavior.
Affiliation(s)
- Benjamin P Kellman
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA; Department of Bioengineering, University of California San Diego School of Medicine, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California San Diego School of Medicine, La Jolla, CA, USA
- Nathan E Lewis
  - Department of Pediatrics, University of California San Diego School of Medicine, La Jolla, CA, USA; Department of Bioengineering, University of California San Diego School of Medicine, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California San Diego School of Medicine, La Jolla, CA, USA; Novo Nordisk Foundation Center for Biosustainability at the University of California San Diego School of Medicine, La Jolla, CA, USA
11
Bastian FB, Roux J, Niknejad A, Comte A, Fonseca Costa SS, de Farias TM, Moretti S, Parmentier G, de Laval VR, Rosikiewicz M, Wollbrett J, Echchiki A, Escoriza A, Gharib WH, Gonzales-Porta M, Jarosz Y, Laurenczy B, Moret P, Person E, Roelli P, Sanjeev K, Seppey M, Robinson-Rechavi M. The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals. Nucleic Acids Res 2021; 49:D831-D847. [PMID: 33037820 PMCID: PMC7778977 DOI: 10.1093/nar/gkaa793] [Citation(s) in RCA: 93] [Impact Index Per Article: 31.0] [Received: 07/14/2020] [Revised: 08/24/2020] [Accepted: 09/15/2020] [Indexed: 01/24/2023]
Abstract
Bgee is a database to retrieve and compare gene expression patterns in multiple animal species, produced by integrating multiple data types (RNA-Seq, Affymetrix, in situ hybridization, and EST data). It is based exclusively on curated healthy wild-type expression data (e.g., no gene knock-out, no treatment, no disease), to provide a comparable reference of normal gene expression. Curation includes very large datasets such as GTEx (re-annotation of samples as ‘healthy’ or not) as well as many small ones. Data are integrated and made comparable between species thanks to consistent data annotation and processing, and to calls of presence/absence of expression, along with expression scores. As a result, Bgee is capable of detecting the conditions of expression of any single gene, accommodating any data type and species. Bgee provides several tools for analyses, allowing, e.g., automated comparisons of gene expression patterns within and between species, retrieval of the preferred conditions of expression of any gene, or enrichment analyses of conditions with expression of sets of genes. Bgee release 14.1 includes 29 animal species, and is available at https://bgee.org/ and through its Bioconductor R package BgeeDB.
Affiliation(s)
- Frederic B Bastian
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Julien Roux
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Anne Niknejad
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Aurélie Comte
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Sara S Fonseca Costa
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Tarcisio Mendes de Farias
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Sébastien Moretti
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Gilles Parmentier
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Valentine Rech de Laval
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Marta Rosikiewicz
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Julien Wollbrett
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Amina Echchiki
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Angélique Escoriza
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Walid H Gharib
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Mar Gonzales-Porta
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Yohan Jarosz
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Balazs Laurenczy
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Philippe Moret
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Emilie Person
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Patrick Roelli
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Komal Sanjeev
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Mathieu Seppey
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
|
12
|
Altenhoff AM, Train CM, Gilbert KJ, Mediratta I, Mendes de Farias T, Moi D, Nevers Y, Radoykova HS, Rossier V, Warwick Vesztrocy A, Glover NM, Dessimoz C. OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more. Nucleic Acids Res 2021; 49:D373-D379. [PMID: 33174605 PMCID: PMC7779010 DOI: 10.1093/nar/gkaa1007] [Citation(s) in RCA: 93] [Impact Index Per Article: 31.0] [Received: 09/15/2020] [Revised: 10/10/2020] [Accepted: 10/14/2020] [Indexed: 01/11/2023] Open
Abstract
OMA is an established resource to elucidate evolutionary relationships among genes from currently 2326 genomes covering all domains of life. OMA provides pairwise and groupwise orthologs, functional annotations, local and global gene order conservation (synteny) information, among many other functions. This update paper describes the reorganisation of the database into gene-, group- and genome-centric pages. Other new and improved features are detailed, such as reporting of the evolutionarily best conserved isoforms of alternatively spliced genes, the inferred local order of ancestral genes, phylogenetic profiling, better cross-references, fast genome mapping, semantic data sharing via RDF, as well as a special coronavirus OMA with 119 viruses from the Nidovirales order, including SARS-CoV-2, the agent of the COVID-19 pandemic. We conclude with improvements to the documentation of the resource through primers, tutorials and short videos. OMA is accessible at https://omabrowser.org.
Affiliation(s)
- Adrian M Altenhoff
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - ETH Zurich, Computer Science, Universitätstr. 6, 8092 Zurich, Switzerland
- Clément-Marie Train
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- Kimberly J Gilbert
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Ishita Mediratta
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Department of Computer Science and Information Systems, BITS Pilani K.K. Birla Goa Campus, India
- David Moi
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Yannis Nevers
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Hale-Seda Radoykova
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, Gower St, London WC1E 6BT, United Kingdom
  - Department of Computer Science, University College London, Gower St, London WC1E 6BT, United Kingdom
- Victor Rossier
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Alex Warwick Vesztrocy
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Natasha M Glover
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Christophe Dessimoz
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, Gower St, London WC1E 6BT, United Kingdom
  - Department of Computer Science, University College London, Gower St, London WC1E 6BT, United Kingdom
|
13
|
Liang S, Stockinger K, de Farias TM, Anisimova M, Gil M. Querying knowledge graphs in natural language. Journal of Big Data 2021; 8:3. [PMID: 33489717 PMCID: PMC7799375 DOI: 10.1186/s40537-020-00383-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/11/2020] [Accepted: 11/22/2020] [Indexed: 06/12/2023]
Abstract
Knowledge graphs are a powerful concept for querying large amounts of data. These knowledge graphs are typically enormous and are often not easily accessible to end-users because they require specialized knowledge in query languages such as SPARQL. Moreover, end-users need a deep understanding of the structure of the underlying data models, which are often based on the Resource Description Framework (RDF). This drawback has led to the development of Question-Answering (QA) systems that enable end-users to express their information needs in natural language. While existing systems simplify user access, there is still room for improvement in the accuracy of these systems. In this paper we propose a new QA system for translating natural language questions into SPARQL queries. The key idea is to break up the translation process into 5 smaller, more manageable sub-tasks and use ensemble machine learning methods as well as Tree-LSTM-based neural network models to automatically learn and translate a natural language question into a SPARQL query. The performance of our proposed QA system is empirically evaluated using two renowned benchmarks: the 7th Question Answering over Linked Data Challenge (QALD-7) and the Large-Scale Complex Question Answering Dataset (LC-QuAD). Experimental results show that our QA system outperforms the state-of-the-art systems by 15% on the QALD-7 dataset and by 48% on the LC-QuAD dataset. In addition, we make our source code available.
Affiliation(s)
- Shiqi Liang
  - ETH Swiss Federal Institute of Technology, Rämistrasse 101, 8092 Zurich, Switzerland
- Kurt Stockinger
  - Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland
- Tarcisio Mendes de Farias
  - SIB Swiss Institute of Bioinformatics, Quartier Sorge-Bâtiment Amphipôle, 1015 Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Quartier Sorge-Bâtiment Biophore, 1015 Lausanne, Switzerland
- Maria Anisimova
  - Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland
  - SIB Swiss Institute of Bioinformatics, Quartier Sorge-Bâtiment Amphipôle, 1015 Lausanne, Switzerland
- Manuel Gil
  - Zurich University of Applied Sciences, Obere Kirchgasse 2, 8400 Winterthur, Switzerland
  - SIB Swiss Institute of Bioinformatics, Quartier Sorge-Bâtiment Amphipôle, 1015 Lausanne, Switzerland
|
14
|
Cox S, Ahalt SC, Balhoff J, Bizon C, Fecho K, Kebede Y, Morton K, Tropsha A, Wang P, Xu H. Visualization Environment for Federated Knowledge Graphs: Development of an Interactive Biomedical Query Language and Web Application Interface. JMIR Med Inform 2020; 8:e17964. [PMID: 33226347 PMCID: PMC7721550 DOI: 10.2196/17964] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/24/2020] [Revised: 06/30/2020] [Accepted: 07/17/2020] [Indexed: 01/29/2023] Open
Abstract
BACKGROUND Efforts are underway to semantically integrate large biomedical knowledge graphs using common upper-level ontologies to federate graph-oriented application programming interfaces (APIs) to the data. However, federation poses several challenges, including query routing to appropriate knowledge sources, generation and evaluation of answer subsets, semantic merger of those answer subsets, and visualization and exploration of results. OBJECTIVE We aimed to develop an interactive environment for query, visualization, and deep exploration of federated knowledge graphs. METHODS We developed a biomedical query language and web application interface, termed Translator Query Language (TranQL), to query semantically federated knowledge graphs and explore query results. TranQL uses the Biolink data model as an upper-level biomedical ontology and an API standard that has been adopted by the Biomedical Data Translator Consortium to specify a protocol for expressing a query as a graph of Biolink data elements compiled from statements in the TranQL query language. Queries are mapped to federated knowledge sources, and answers are merged into a knowledge graph, with mappings between the knowledge graph and specific elements of the query. The TranQL interactive web application includes a user interface to support user exploration of the federated knowledge graph. RESULTS We developed 2 real-world use cases to validate TranQL and address biomedical questions of relevance to translational science. The use cases posed questions that traversed 2 federated Translator API endpoints: Integrated Clinical and Environmental Exposures Service (ICEES) and Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP). ICEES provides open access to observational clinical and environmental data, and ROBOKOP provides access to linked biomedical entities, such as "gene," "chemical substance," and "disease," that are derived largely from curated public data sources. We successfully posed queries to TranQL that traversed these endpoints and retrieved answers that we visualized and evaluated. CONCLUSIONS TranQL can be used to ask questions of relevance to translational science, rapidly obtain answers that require assertions from a federation of knowledge sources, and provide valuable insights for translational research and clinical practice.
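The merge step mentioned in the abstract (answers from several knowledge sources combined into one knowledge graph) can be illustrated with a minimal Python sketch. This is not TranQL code: the `Answer` structure, the `merge_answers` function, and all node and edge identifiers are hypothetical, and real Translator endpoints return richer Biolink-typed messages. The sketch only shows one plausible way to deduplicate nodes by identifier while recording which source asserted each edge.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """Hypothetical answer subset returned by one knowledge source."""
    source: str
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)  # (subject id, predicate, object id)


def merge_answers(answers):
    """Merge answer subsets into one graph keyed by node id and edge triple."""
    nodes = {}
    edges = {}
    for answer in answers:
        for node_id, node_type in answer.nodes.items():
            # In this sketch the first source to mention a node fixes its type.
            nodes.setdefault(node_id, node_type)
        for triple in answer.edges:
            # Record every source that asserted the same edge.
            edges.setdefault(triple, set()).add(answer.source)
    return nodes, edges


# Illustrative data only; the identifiers are made up for the example.
icees = Answer(
    source="ICEES",
    nodes={"chem:1": "chemical_substance", "dis:1": "disease"},
    edges=[("chem:1", "associated_with", "dis:1")],
)
robokop = Answer(
    source="ROBOKOP",
    nodes={"chem:1": "chemical_substance", "gene:1": "gene", "dis:1": "disease"},
    edges=[
        ("gene:1", "contributes_to", "dis:1"),
        ("chem:1", "associated_with", "dis:1"),
    ],
)

nodes, edges = merge_answers([icees, robokop])
print(sorted(nodes))
print({triple: sorted(sources) for triple, sources in edges.items()})
```

Keeping the set of asserting sources per edge is a design choice of the sketch: it lets a viewer trace a merged edge back to the endpoints that supplied it.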
Affiliation(s)
- Steven Cox
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Stanley C Ahalt
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- James Balhoff
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Chris Bizon
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Karamarie Fecho
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Yaphet Kebede
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Alexander Tropsha
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
  - UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Patrick Wang
  - CoVar Applied Technologies, Durham, NC, United States
- Hao Xu
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
|
15
|
Canakoglu A, Pinoli P, Gulino A, Nanni L, Masseroli M, Ceri S. Federated sharing and processing of genomic datasets for tertiary data analysis. Brief Bioinform 2020; 22:5868062. [PMID: 34020536 DOI: 10.1093/bib/bbaa091] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/06/2020] [Revised: 04/05/2020] [Accepted: 04/27/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION With the spread of biological and clinical uses of next-generation sequencing (NGS) data, many laboratories and health organizations face the need to share NGS data resources and to easily access and process comprehensively shared genomic data; in most cases, primary and secondary data management of NGS data is done at sequencing stations, and sharing applies to processed data. Based on the previous single-instance GMQL system architecture, here we review the model, language and architectural extensions that make the GMQL centralized system innovatively open to federated computing. RESULTS We present a well-designed extension of a centralized system architecture to support federated data sharing and query processing. Data is federated thanks to simple data sharing instructions. Queries are assigned to execution nodes; they are translated into an intermediate representation, whose computation drives data and processing distributions. The approach allows writing federated applications according to classical styles: centralized, distributed or externalized. AVAILABILITY The federated genomic data management system is freely available for non-commercial use as an open source project at http://www.bioinformatics.deib.polimi.it/FederatedGMQLsystem/. CONTACT {arif.canakoglu, pietro.pinoli}@polimi.it.
Affiliation(s)
- Andrea Gulino
  - Computer Science and Engineering at Politecnico di Milano
- Luca Nanni
  - Computer Science and Engineering at Politecnico di Milano
|
16
|
Sima AC, Dessimoz C, Stockinger K, Zahn-Zabal M, Mendes de Farias T. A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL. F1000Res 2019; 8:1822. [PMID: 32612807 PMCID: PMC7324951 DOI: 10.12688/f1000research.21027.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Accepted: 07/09/2020] [Indexed: 11/20/2022] Open
Abstract
The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple data sources. However, analyzing evolutionary data in RDF is not trivial, due to the steep learning curve required to understand both the data models adopted by different RDF data sources and the equivalent SPARQL constructs required to benefit from this data, in particular recursive property paths. In this article, we provide a hands-on introduction to querying evolutionary data across several data sources that publish orthology information in RDF, namely the Orthologous MAtrix (OMA), the European Bioinformatics Institute (EBI) RDF platform, the Database of Orthologous Groups (OrthoDB) and the Microbial Genome Database (MBGD). We present four protocols in increasing order of complexity. In these protocols, we demonstrate through SPARQL queries how to retrieve pairwise orthologs, homologous groups, and hierarchical orthologous groups. Finally, we show how orthology information in different data sources can be compared, through the use of federated SPARQL queries.
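The two SPARQL 1.1 constructs named in the abstract, recursive property paths and federated queries, can be sketched in a few lines of Python that only build a query string. This is an illustration under assumptions, not one of the article's protocols: the endpoint URL, the `orth:` prefix IRI, and the class and property names are assumed here and should be checked against each data source's documentation. The `SERVICE` keyword and the `+` path operator are standard SPARQL 1.1.

```python
from textwrap import dedent

# Assumed values for illustration; verify against each source's documentation.
ORTH_PREFIX = "http://purl.org/net/orth#"
REMOTE_ENDPOINT = "https://sparql.omabrowser.org/sparql"


def build_federated_query(remote_endpoint, limit=10):
    """Return a SPARQL 1.1 query string; nothing is sent over the network."""
    return dedent(f"""\
        PREFIX orth: <{ORTH_PREFIX}>
        SELECT ?cluster ?member WHERE {{
          SERVICE <{remote_endpoint}> {{
            ?cluster a orth:OrthologsCluster .
            # '+' follows hasHomologousMember through nested subgroups.
            ?cluster orth:hasHomologousMember+ ?member .
            ?member a orth:Protein .
          }}
        }}
        LIMIT {limit}
        """)


query = build_federated_query(REMOTE_ENDPOINT, limit=5)
print(query)
```

The `SERVICE` block delegates its graph pattern to the remote endpoint, while the `+` operator matches one or more `hasHomologousMember` steps, so members of nested groups are returned without writing one pattern per nesting level.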
Affiliation(s)
- Ana Claudia Sima
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Computer Science, University College London, London, UK
  - Department of Genetics, Evolution, and Environment, University College London, London, UK
- Kurt Stockinger
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
- Monique Zahn-Zabal
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Tarcisio Mendes de Farias
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Vaud, Switzerland
|