1
Faria D, Eugénio P, Contreiras Silva M, Balbi L, Bedran G, Kallor AA, Nunes S, Palkowski A, Waleron M, Alfaro JA, Pesquita C. The Immunopeptidomics Ontology (ImPO). Database (Oxford) 2024; 2024:baae014. [PMID: 38857186 PMCID: PMC11164101 DOI: 10.1093/database/baae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 11/30/2023] [Accepted: 02/22/2024] [Indexed: 06/12/2024]
Abstract
The adaptive immune response plays a vital role in eliminating infected and aberrant cells from the body. This process hinges on the presentation of short peptides by major histocompatibility complex Class I molecules on the cell surface. Immunopeptidomics, the study of peptides displayed on cells, delves into the wide variety of these peptides. Understanding the mechanisms behind antigen processing and presentation is crucial for effectively evaluating cancer immunotherapies. As an emerging domain, immunopeptidomics currently lacks standardization: there is neither an established terminology nor formally defined semantics, a critical concern considering the complexity, heterogeneity, and growing volume of data involved in immunopeptidomics studies. Additionally, there is a disconnection between how the proteomics community delivers the information about antigen presentation and its uptake by the clinical genomics community. Considering the significant relevance of immunopeptidomics in cancer, this shortcoming must be addressed to bridge the gap between research and clinical practice. In this work, we detail the development of the ImmunoPeptidomics Ontology, ImPO, the first effort at standardizing the terminology and semantics in the domain. ImPO aims to encapsulate and systematize data generated by immunopeptidomics experimental processes and bioinformatics analysis. ImPO establishes cross-references to 24 relevant ontologies, including the National Cancer Institute Thesaurus, Mondo Disease Ontology, Logical Observation Identifier Names and Codes and Experimental Factor Ontology. Although ImPO was developed using expert knowledge to characterize a large and representative data collection, it may be readily used to encode other datasets within the domain. Ultimately, ImPO facilitates data integration and analysis, enabling querying, inference and knowledge generation and, importantly, bridging the gap between the clinical proteomics and genomics communities. As the field of immunogenomics uses protein-level immunopeptidomics data, we expect ImPO to play a key role in supporting a rich and standardized description of the large-scale data that emerging high-throughput technologies are expected to bring in the near future. Ontology URL: https://zenodo.org/record/10237571 Project GitHub: https://github.com/liseda-lab/ImPO/blob/main/ImPO.owl
Affiliation(s)
- Daniel Faria
  - INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Rua Alves Redol, 9, Lisboa 1000-029, Portugal
- Patrícia Eugénio
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Marta Contreiras Silva
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Laura Balbi
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Georges Bedran
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Ashwin Adrian Kallor
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Susana Nunes
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Aleksander Palkowski
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Michal Waleron
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Javier A Alfaro
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
  - Department of Biochemistry and Microbiology, University of Victoria, 3800 Finnerty Rd, Victoria, British Columbia, BC V8P 5C2, Canada
  - Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Old College, South Bridge, Edinburgh, EH8 9YL, UK
  - The Canadian Association for Responsible AI in Medicine, Victoria, Canada
- Catia Pesquita
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
2
Pendleton SC, Slater K, Karwath A, Gilbert RM, Davis N, Pesudovs K, Liu X, Denniston AK, Gkoutos GV, Braithwaite T. Development and application of the ocular immune-mediated inflammatory diseases ontology enhanced with synonyms from online patient support forum conversation. Comput Biol Med 2021; 135:104542. [PMID: 34139439 PMCID: PMC8404035 DOI: 10.1016/j.compbiomed.2021.104542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Revised: 05/27/2021] [Accepted: 05/30/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND Unstructured text created by patients represents a rich, but relatively inaccessible resource for advancing patient-centred care. This study aimed to develop an ontology for ocular immune-mediated inflammatory diseases (OcIMIDo), as a tool to facilitate data extraction and analysis, illustrating its application to online patient support forum data. METHODS We developed OcIMIDo using clinical guidelines, domain expertise, and cross-references to classes from other biomedical ontologies. We developed an approach to add patient-preferred synonyms text-mined from oliviasvision.org online forum, using statistical ranking. We validated the approach with split-sampling and comparison to manual extraction. Using OcIMIDo, we then explored the frequency of OcIMIDo classes and synonyms, and their potential association with natural language sentiment expressed in each online forum post. FINDINGS OcIMIDo (version 1.2) includes 661 classes, describing anatomy, clinical phenotype, disease activity status, complications, investigations, interventions and functional impacts. It contains 1661 relationships and axioms, 2851 annotations, including 1131 database cross-references, and 187 patient-preferred synonyms. To illustrate OcIMIDo's potential applications, we explored 9031 forum posts, revealing frequent mention of different clinical phenotypes, treatments, and complications. Language sentiment analysis of each post was generally positive (median 0.12, IQR 0.01-0.24). In multivariable logistic regression, the odds of a post expressing negative sentiment were significantly associated with first posts as compared to replies (OR 3.3, 95% CI 2.8 to 3.9, p < 0.001). CONCLUSION We report the development and validation of a new ontology for inflammatory eye diseases, which includes patient-preferred synonyms, and can be used to explore unstructured patient or physician-reported text data, with many potential applications.
Affiliation(s)
- Samantha C Pendleton
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK
- Karin Slater
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK
- Andreas Karwath
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK
- Rose M Gilbert
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, UK
- Nicola Davis
  - Olivia's Vision, Southampton Buildings, London, UK
- Konrad Pesudovs
  - School of Optometry and Vision Science, University of New South Wales, Australia
- Xiaoxuan Liu
  - University Hospitals Birmingham NHS Foundation Trust, UK; Institute of Inflammation and Ageing, University of Birmingham, UK
- Alastair K Denniston
  - University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK; Institute of Inflammation and Ageing, University of Birmingham, UK
- Georgios V Gkoutos
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK
- Tasanee Braithwaite
  - University Hospitals Birmingham NHS Foundation Trust, UK; Institute of Applied Health Research, University of Birmingham, UK; The Medical Eye Unit, St Thomas' Hospital NHS Foundation Trust, London, UK
3
Kanza S, Graham Frey J. Semantic Technologies in Drug Discovery. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11520-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
4
Abstract
Computational ontologies are machine-processable structures which represent particular domains of interest. They integrate knowledge which can be used by humans or machines for decision making and problem solving. The main aim of this systematic review is to investigate the role of formal ontologies in information systems development, i.e., how these graph-based structures can be beneficial during the analysis and design of information systems. Specific online databases were used to identify studies focused on the interconnections between ontologies and systems engineering. One hundred eighty-seven studies were found during the first phase of the investigation. Twenty-seven studies were examined after the elimination of duplicate and irrelevant documents. Mind mapping was substantially helpful in organising the basic ideas and in identifying five thematic groups that show the main roles of formal ontologies in information systems development. Formal ontologies are mainly used in the interoperability of information systems, human resource management, domain knowledge representation, the involvement of semantics in unified modelling language (UML)-based modelling, and the management of programming code and documentation. We explain the main ideas in the reviewed studies and suggest possible extensions to this research.
5
Lin FP, Groza T, Kocbek S, Antezana E, Epstein RJ. Cancer Care Treatment Outcome Ontology: A Novel Computable Ontology for Profiling Treatment Outcomes in Patients With Solid Tumors. JCO Clin Cancer Inform 2019; 2:1-14. [PMID: 30652600 DOI: 10.1200/cci.18.00026] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
PURPOSE There is as yet no computer-processable resource to describe treatment end points in cancer, hindering our ability to systematically capture and share outcomes data to inform better patient care. To address these unmet needs, we have built an ontology, the Cancer Care Treatment Outcome Ontology (CCTOO), to organize high-level concepts of treatment end points with structured knowledge representation to facilitate standardized sharing of real-world data. METHODS End points from oncology trials in ClinicalTrials.gov were extracted, queried using the keyword cancer, and followed by an expert appraisal. Synonyms and relevant terms were imported from the National Cancer Institute Thesaurus and Common Terminology Criteria for Adverse Events. Logical relationships among concepts were manually represented by production rules. The applicability of 1,847 rules was tested in an index case. RESULTS After removing duplicated terms from 54,705 trial entries, an ontology holding 1,133 terms was built. CCTOO organized concepts into four domains (cancer treatment, health services, physical, and psychosocial health-related concepts), 13 subgroups (including efficacy, safety, and quality of life), and two (taxonomic and evaluative) concept hierarchies. This ontology has a comprehensive term coverage in the cancer trial literature: at least one term was mentioned in 98% of MEDLINE abstracts of phase I to III trials, whereas concepts about efficacy were mentioned in 7,208 (79%) phase I, 15,051 (92%) phase II, and 3,884 (86%) phase III trials. The event sequence of the index case was readily convertible to a comprehensive profile incorporating response, treatment toxicity, and survival by applying the set of production rules curated in the CCTOO. CONCLUSION CCTOO categorizes high-level treatment end points used in oncology and provides a mechanism for profiling individual patient data by outcomes to facilitate translational analysis.
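The abstract above mentions production rules that turn an event sequence into an outcome profile. A minimal sketch of that general idea, assuming simple "if all premises hold, add the conclusion" rules applied by forward chaining; the rule names and facts below are hypothetical and are not taken from CCTOO:

```python
# Hypothetical production rules: (set of premises, conclusion).
# These labels are illustrative only and do not come from CCTOO.
RULES = [
    ({"tumor_shrinkage"}, "response"),
    ({"grade_3_adverse_event"}, "treatment_toxicity"),
    ({"response", "treatment_toxicity"}, "responder_with_toxicity"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new conclusion can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


# Observed events for one (made-up) index case.
profile = forward_chain({"tumor_shrinkage", "grade_3_adverse_event"}, RULES)
```

Here the third rule only fires after the first two have added their conclusions, which is why the loop runs to a fixed point rather than making a single pass.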
Affiliation(s)
- Frank P Lin, Tudor Groza, Simon Kocbek, Erick Antezana, Richard J Epstein
  - Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
6
Kourou KD, Pezoulas VC, Georga EI, Exarchos TP, Tsanakas P, Tsiknakis M, Varvarigou T, De Vita S, Tzioufas A, Fotiadis DI. Cohort Harmonization and Integrative Analysis From a Biomedical Engineering Perspective. IEEE Rev Biomed Eng 2019; 12:303-318. [DOI: 10.1109/rbme.2018.2855055] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/05/2022]
7
Esteban-Gil A, Fernández-Breis JT, Boeker M. Analysis and visualization of disease courses in a semantically-enabled cancer registry. J Biomed Semantics 2017; 8:46. [PMID: 28962670 PMCID: PMC5622544 DOI: 10.1186/s13326-017-0154-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/07/2016] [Accepted: 09/19/2017] [Indexed: 12/20/2022]
Abstract
Background Regional and epidemiological cancer registries are important for cancer research and the quality management of cancer treatment. Many technological solutions are now available to collect and analyse data for cancer registries. However, the lack of a well-defined common semantic model is a problem when user-defined analyses and data linking to external resources are required. The objectives of this study are: (1) design of a semantic model for local cancer registries; (2) development of a semantically-enabled cancer registry based on this model; and (3) semantic exploitation of the cancer registry for analysing and visualising disease courses. Results Our proposal is based on our previous results and experience working with semantic technologies. Data stored in a cancer registry database were transformed into RDF employing a process driven by OWL ontologies. The semantic representation of the data was then processed to extract semantic patient profiles, which were exploited by means of SPARQL queries to identify groups of similar patients and to analyse the disease timelines of patients. Based on the requirements analysis, we have produced a draft of an ontology that models the semantics of a local cancer registry in a pragmatic, extensible way. We have implemented a Semantic Web platform that allows transforming and storing data from cancer registries in RDF. This platform also permits users to formulate incremental user-defined queries through a graphical user interface. The query results can be displayed in several customisable ways. The complex disease timelines of individual patients can be clearly represented. Different events, e.g. different therapies and disease courses, are presented according to their temporal and causal relations. Conclusion The presented platform is an example of the parallel development of ontologies and applications that take advantage of semantic web technologies in the medical field. The semantic structure of the representation renders it easy to analyse key figures of the patients and their evolution at different granularity levels.
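The abstract above describes querying RDF patient profiles to find groups of similar patients. As a rough illustration only, the following sketch uses plain Python tuples as a stand-in for an RDF store and a wildcard pattern match as a stand-in for a SPARQL basic graph pattern; the identifiers (`patient:1`, `hasDiagnosis`, and so on) are hypothetical and not taken from the registry's ontology:

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; all names are illustrative.
TRIPLES = [
    ("patient:1", "hasDiagnosis", "dx:breast_carcinoma"),
    ("patient:1", "hasTherapy", "tx:chemotherapy"),
    ("patient:2", "hasDiagnosis", "dx:breast_carcinoma"),
    ("patient:2", "hasTherapy", "tx:radiotherapy"),
    ("patient:3", "hasDiagnosis", "dx:melanoma"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def group_by_object(triples, predicate):
    """Group subjects that share the same object for a given predicate."""
    groups = defaultdict(set)
    for subject, _, obj in match(triples, p=predicate):
        groups[obj].add(subject)
    return dict(groups)


# Patients grouped by shared diagnosis, a crude notion of "similar patients".
similar = group_by_object(TRIPLES, "hasDiagnosis")
```

A real deployment would run the equivalent query against an RDF store through SPARQL; the tuple list is used here only so the example has no external dependencies.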
Affiliation(s)
- Angel Esteban-Gil
  - Fundación para la Formación e Investigación Sanitarias de la Región de Murcia, Biomedical Informatics & Bioinformatics Platform, IMIB-Arrixaca, C/ Luis Fontes Pagán, n° 9, Murcia, 30003, Spain
- Jesualdo Tomás Fernández-Breis
  - Dpto. Informática y Sistemas, Facultad de Informática, Universidad de Murcia, IMIB-Arrixaca, Facultad de Informática, Campus de Espinardo, Murcia, 30100, Spain
- Martin Boeker
  - Institute for Medical Biometry and Statistics, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Stefan-Meier-Str. 26, Freiburg, 79104, Germany
8
Kondylakis H, Claerhout B, Keyur M, Koumakis L, van Leeuwen J, Marias K, Perez-Rey D, De Schepper K, Tsiknakis M, Bucur A. The INTEGRATE project: Delivering solutions for efficient multi-centric clinical research and trials. J Biomed Inform 2016; 62:32-47. [PMID: 27224847 DOI: 10.1016/j.jbi.2016.05.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/14/2015] [Revised: 05/05/2016] [Accepted: 05/17/2016] [Indexed: 10/21/2022]
Abstract
The objective of the recently concluded INTEGRATE project (http://www.fp7-integrate.eu/) was the development of innovative biomedical applications focused on streamlining the execution of clinical research, on enabling multidisciplinary collaboration, on management and large-scale sharing of multi-level heterogeneous datasets, and on the development of new methodologies and of predictive multi-scale models in cancer. In this paper, we present the way the INTEGRATE consortium has approached important challenges such as the integration of multi-scale biomedical data in the context of post-genomic clinical trials, the development of predictive models and the implementation of tools to facilitate the efficient execution of post-genomic multi-centric clinical trials in breast cancer. Furthermore, we provide a number of key "lessons learned" during the process and give directions for future research and development.
Affiliation(s)
- Haridimos Kondylakis
  - Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece
- Brecht Claerhout
  - Custodix NV, Kortrijksesteenweg 214b3, Sint-Martens-Latem, Belgium
- Mehta Keyur
  - German Breast Group, GBG Forschungs GmbH (Managing Director: Prof. Dr. med. Gunter von Minckwitz; commercial register: Amtsgericht Offenbach, HRB 40477; registered office: Neu-Isenburg), Germany
- Lefteris Koumakis
  - Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece
- Kostas Marias
  - Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece
- David Perez-Rey
  - Biomedical Informatics Group, DLSIIS & DIA, Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, 28660 Boadilla del Monte, Madrid, Spain
- Manolis Tsiknakis
  - Computational BioMedicine Laboratory, FORTH-ICS, N. Plastira 100, Heraklion, Greece; Department of Informatics Engineering, Technological Educational Institute of Crete, Estavromenos 71004, Heraklion, Crete, Greece
- Anca Bucur
  - PHILIPS Research Europe, High Tech Campus 34, Eindhoven, Netherlands
9
Utilizing a structural meta-ontology for family-based quality assurance of the BioPortal ontologies. J Biomed Inform 2016; 61:63-76. [PMID: 26988001 DOI: 10.1016/j.jbi.2016.03.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/17/2015] [Revised: 02/05/2016] [Accepted: 03/04/2016] [Indexed: 11/22/2022]
Abstract
An Abstraction Network is a compact summary of an ontology's structure and content. In previous research, we showed that Abstraction Networks support quality assurance (QA) of biomedical ontologies. The development of an Abstraction Network and its associated QA methodologies, however, is a labor-intensive process that previously was applicable only to one ontology at a time. To improve the efficiency of the Abstraction-Network-based QA methodology, we introduced a QA framework that uses uniform Abstraction Network derivation techniques and QA methodologies that are applicable to whole families of structurally similar ontologies. For the family-based framework to be successful, it is necessary to develop a method for classifying ontologies into structurally similar families. We now describe a structural meta-ontology that classifies ontologies according to certain structural features that are commonly used in the modeling of ontologies (e.g., object properties) and that are important for Abstraction Network derivation. Each class of the structural meta-ontology represents a family of ontologies with identical structural features, indicating which types of Abstraction Networks and QA methodologies are potentially applicable to all of the ontologies in the family. We derive a collection of 81 families, corresponding to classes of the structural meta-ontology, that enable a flexible, streamlined family-based QA methodology, offering multiple choices for classifying an ontology. The structure of 373 ontologies from the NCBO BioPortal is analyzed and each ontology is classified into multiple families modeled by the structural meta-ontology.
10
Anguita A, García-Remesal M, de la Iglesia D, Graf N, Maojo V. Toward a view-oriented approach for aligning RDF-based biomedical repositories. Methods Inf Med 2014; 54:50-5. [PMID: 24777240 DOI: 10.3414/me13-02-0020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/13/2013] [Accepted: 03/17/2014] [Indexed: 11/09/2022]
Abstract
INTRODUCTION This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". BACKGROUND The need for complementary access to multiple RDF databases has fostered new lines of research, but also entailed new challenges due to data representation disparities. While several approaches for RDF-based database integration have been proposed, those focused on schema alignment have become the most widely adopted. All state-of-the-art solutions for aligning RDF-based sources resort to a simple technique inherited from legacy relational database integration methods. This technique, known as element-to-element (e2e) mappings, is based on establishing 1:1 mappings between single primitive elements (e.g. concepts, attributes, relationships) belonging to the source and target schemas. However, due to the intrinsic nature of RDF, a representation language based on defining <subject, predicate, object> tuples, one may find RDF elements whose semantics vary dramatically when combined into a view involving other RDF elements, i.e. they depend on their context. Such elements cannot be adequately represented in the target schema by resorting to the traditional e2e approach. Existing approaches fail to properly address this issue without explicitly modifying the target ontology, and thus lack the expressiveness required to properly reflect the intended semantics in the alignment information. OBJECTIVES To enhance existing RDF schema alignment techniques by providing a mechanism to properly represent elements with context-dependent semantics, thus enabling users to perform more expressive alignments, including scenarios that cannot be adequately addressed by the existing approaches. METHODS Instead of establishing 1:1 correspondences between single primitive elements of the schemas, we propose adopting a view-based approach. This approach is targeted at establishing mapping relationships between RDF subgraphs, which can be regarded as the equivalent of views in traditional databases, rather than between single schema elements. It enables users to represent scenarios defined by context-dependent RDF elements that cannot be properly represented when adopting the currently existing approaches. RESULTS We developed a software tool implementing our view-based strategy. Our tool is currently being used in the context of the European Commission funded p-medicine project, targeted at creating a technological framework to integrate clinical and genomic data to facilitate the development of personalized drugs and therapies for cancer, based on the genetic profile of the patient. We used our tool to integrate different RDF-based databases, including different repositories of clinical trials and DICOM images, using the Health Data Ontology Trunk (HDOT) ontology as the target schema. CONCLUSIONS The importance of database integration methods and tools in the context of biomedical research has been widely recognized. Modern research in this area (e.g. identification of disease biomarkers, or design of personalized therapies) heavily relies on the availability of a technical framework to enable researchers to uniformly access disparate repositories. We present a method and a tool that implement a novel alignment method specifically designed to support and enhance the integration of RDF-based data sources at schema (metadata) level. This approach provides an increased level of expressiveness compared to other existing solutions, and allows solving heterogeneity scenarios that cannot be properly represented using other state-of-the-art techniques.
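The contrast between element-to-element mappings and view-based mappings can be made concrete with a small sketch. It assumes that a "view" is a conjunction of triple patterns with `?`-prefixed variables and that a mapping rewrites each match into one target triple; the data, predicate names, and the `bloodPressure` target property are hypothetical, and this is not the authors' tool:

```python
def is_var(term):
    """Terms starting with '?' are treated as variables."""
    return term.startswith("?")


def unify(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def match_view(view, triples):
    """Find all bindings that satisfy every pattern of the view (a subgraph)."""
    bindings = [{}]
    for pattern in view:
        bindings = [
            b2
            for b in bindings
            for t in triples
            if (b2 := unify(pattern, t, b)) is not None
        ]
    return bindings


def apply_mapping(view, template, triples):
    """Rewrite every match of the source view into one target triple."""
    return {
        tuple(b.get(term, term) for term in template)
        for b in match_view(view, triples)
    }


# Hypothetical source data: "Measurement" alone is ambiguous; its meaning
# depends on the unit attached to the same node (its context).
SOURCE = [
    ("obs1", "type", "Measurement"),
    ("obs1", "unit", "mmHg"),
    ("obs1", "value", "120"),
    ("obs2", "type", "Measurement"),
    ("obs2", "unit", "kg"),
    ("obs2", "value", "70"),
]
VIEW = [("?m", "type", "Measurement"), ("?m", "unit", "mmHg"), ("?m", "value", "?v")]
TEMPLATE = ("?m", "bloodPressure", "?v")

mapped = apply_mapping(VIEW, TEMPLATE, SOURCE)
```

A 1:1 mapping from `Measurement` alone would also translate `obs2`, whereas the three-pattern view only matches the node whose unit gives it the intended meaning.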
Affiliation(s)
- A Anguita
  - Alberto Anguita, PhD, Group of Biomedical Informatics, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Spain
11
NCBI2RDF: enabling full RDF-based access to NCBI databases. BioMed Research International 2013; 2013:983805. [PMID: 23984425 PMCID: PMC3745940 DOI: 10.1155/2013/983805] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/03/2013] [Accepted: 06/30/2013] [Indexed: 12/11/2022]
Abstract
RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, Entrez does not provide RDF-based access to these databases, so they cannot be integrated with other RDF-compliant databases or accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting the query results to users in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves biomedical researchers time and effort. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.
12
13
Anguita A, Martin L, Garcia-Remesal M, Maojo V. RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources. Computer Methods and Programs in Biomedicine 2013; 111:220-7. [PMID: 23669178 DOI: 10.1016/j.cmpb.2013.04.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/11/2012] [Revised: 04/05/2013] [Accepted: 04/18/2013] [Indexed: 05/25/2023]
Abstract
This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database.
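The abstract states that response times are improved by caching and reusing information from previous queries, without saying how. One common way to get that effect, shown here purely as an assumption, is to memoise results by query string; `run_on_backend` is a hypothetical stand-in for the real query executor:

```python
from functools import lru_cache

# Records every query that actually reaches the (simulated) backend.
BACKEND_CALLS = []


def run_on_backend(query: str) -> tuple:
    """Hypothetical slow executor; here it only logs the call and echoes it."""
    BACKEND_CALLS.append(query)
    return (f"result for: {query}",)


@lru_cache(maxsize=128)
def cached_query(query: str) -> tuple:
    """Serve repeated queries from memory instead of hitting the backend."""
    return run_on_backend(query)


first = cached_query("SELECT ?s WHERE { ?s ?p ?o }")
second = cached_query("SELECT ?s WHERE { ?s ?p ?o }")  # served from the cache
```

Results are returned as tuples so that cached values cannot be mutated by callers; whole-query memoisation is the simplest scheme and says nothing about whether the original system reuses partial results.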
Affiliation(s)
- Alberto Anguita
  - Biomedical Informatics Group, Artificial Intelligence Laboratory, School of Computer Science, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, 28660 Boadilla del Monte, Madrid, Spain
14
Miyoshi NSB, Pinheiro DG, Silva WA, Felipe JC. Computational framework to support integration of biomolecular and clinical data within a translational approach. BMC Bioinformatics 2013; 14:180. [PMID: 23742129 PMCID: PMC3688149 DOI: 10.1186/1471-2105-14-180] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 12/11/2012] [Accepted: 05/24/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of the knowledge produced by sciences to promote human health is the main goal of translational medicine. To make it feasible we need computational methods to handle the large amount of information that arises from bench to bedside and to deal with its heterogeneity. A computational challenge that must be faced is to promote the integration of clinical, socio-demographic and biological data. In this effort, ontologies play an essential role as a powerful artifact for knowledge representation. Chado is a modular ontology-oriented database model that gained popularity due to its robustness and flexibility as a generic platform to store biological data; however it lacks supporting representation of clinical and socio-demographic information. RESULTS We have implemented an extension of Chado - the Clinical Module - to allow the representation of this kind of information. Our approach consists of a framework for data integration through the use of a common reference ontology. The design of this framework has four levels: data level, to store the data; semantic level, to integrate and standardize the data by the use of ontologies; application level, to manage clinical databases, ontologies and data integration process; and web interface level, to allow interaction between the user and the system. The clinical module was built based on the Entity-Attribute-Value (EAV) model. We also proposed a methodology to migrate data from legacy clinical databases to the integrative framework. A Chado instance was initialized using a relational database management system. The Clinical Module was implemented and the framework was loaded using data from a factual clinical research database. Clinical and demographic data as well as biomaterial data were obtained from patients with tumors of head and neck. 
We implemented the IPTrans tool, a complete environment for data migration, which comprises: the construction of a model to describe the legacy clinical data, based on an ontology; the Extraction, Transformation and Load (ETL) process to extract the data from the source clinical database and load it into the Clinical Module of Chado; the development of a web tool and a Bridge Layer to adapt the web tool to Chado, as well as other applications. CONCLUSIONS Open-source computational solutions currently available for translational science do not have a model to represent biomolecular information and are also not integrated with the existing bioinformatics tools. On the other hand, existing genomic data models do not represent clinical patient data. A framework was developed to support translational research by integrating biomolecular information coming from different "omics" technologies with patients' clinical and socio-demographic data. This framework should present some features: flexibility, compression and robustness. The experiments accomplished from a use case demonstrated that the proposed system meets the requirements of flexibility and robustness, leading to the desired integration. The Clinical Module can be accessed at http://dcm.ffclrp.usp.br/caib/pg=iptrans.
Affiliation(s)
- Newton Shydeo Brandão Miyoshi
- Department of Computing and Mathematics, Faculty of Philosophy, Sciences and Languages of Ribeirão Preto, University of São Paulo, São Paulo, Brazil
|
15
|
Eccher C, Scipioni A, Miller AA, Ferro A, Pisanelli DM. An ontology of cancer therapies supporting interoperability and data consistency in EPRs. Comput Biol Med 2013; 43:822-32. [PMID: 23746723 DOI: 10.1016/j.compbiomed.2013.04.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/27/2012] [Revised: 04/15/2013] [Accepted: 04/16/2013] [Indexed: 11/29/2022]
Abstract
Ontologies can formally describe the semantics of the medical domain in an unambiguous and machine processable form, acting as a conceptual interface between different applications that must interoperate. In this paper we present an ontology of cancer therapies originally developed to bridge the gap between an oncologic Electronic Patient Record (EPR) and a guideline-based decision support system. We show an application of the ontology complemented by rules to classify therapies recorded in the EPR. The results show how such an ontology can be used also to discover possible problems of data consistency in the EPR.
Affiliation(s)
- Claudio Eccher
- Fondazione Bruno Kessler-Center for Information Technology, via Sommarive 18, 38050 Povo, Trento, Italy.
|
16
|
Riaño D, Real F, López-Vallverdú JA, Campana F, Ercolani S, Mecocci P, Annicchiarico R, Caltagirone C. An ontology-based personalization of health-care knowledge to support clinical decisions for chronically ill patients. J Biomed Inform 2012; 45:429-46. [PMID: 22269224 DOI: 10.1016/j.jbi.2011.12.008] [Citation(s) in RCA: 127] [Impact Index Per Article: 10.6] [Received: 01/12/2011] [Revised: 12/16/2011] [Accepted: 12/25/2011] [Indexed: 02/04/2023]
Abstract
Chronically ill patients are complex health care cases that require the coordinated interaction of multiple professionals. Correct intervention for this sort of patient entails the accurate analysis of the conditions of each concrete patient and the adaptation of evidence-based standard intervention plans to these conditions. There are some other clinical circumstances, such as wrong diagnoses, unobserved comorbidities, missing information, unobserved related diseases or prevention, whose detection depends on the deductive capacities of the professionals involved. In this paper, we introduce an ontology for the care of chronically ill patients and implement two personalization processes and a decision support tool. The first personalization process adapts the contents of the ontology to the particularities observed in the health-care record of a given concrete patient, automatically providing a personalized ontology containing only the clinical information that is relevant for health-care professionals to manage that patient. The second personalization process uses the personalized ontology of a patient to automatically transform intervention plans describing general health-care treatments into individual intervention plans. For comorbid patients, this process concludes with the semi-automatic integration of several individual plans into a single personalized plan. Finally, the ontology is also used as the knowledge base of a decision support tool that helps health-care professionals to detect anomalous circumstances such as wrong diagnoses, unobserved comorbidities, missing information, unobserved related diseases, or preventive actions. Seven health-care centers participating in the K4CARE project, together with the group SAGESA and the Local Health System in the town of Pollenza, have served as the validation platform for these two processes and tool.
Health-care professionals participating in the evaluation agree on the average quality (84%, 5.9/7.0) and utility (90%, 6.3/7.0) of the tools, and also on the correct reasoning of the decision support tool according to clinical standards.
Affiliation(s)
- David Riaño
- Research Group on Artificial Intelligence, Universitat Rovira i Virgili, Tarragona, Spain
|
17
|
Tapia JL, Goldberg LJ. The challenges of defining oral cancer: analysis of an ontological approach. Head Neck Pathol 2011; 5:376-84. [PMID: 21915705 PMCID: PMC3210219 DOI: 10.1007/s12105-011-0300-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 08/04/2011] [Accepted: 09/02/2011] [Indexed: 11/30/2022]
Abstract
An important inconsistency currently exists in the literature on oral cancer. Reviewing this literature, one finds that the term oral cancer is defined and described with great variation. In a search in PubMed, at least 17 different terms were found for titles of papers reporting data on oral cancer. The variability of the terms used for designating anatomic regions and types of malignant neoplasms when reporting oral cancer has hampered the ability of researchers to effectively retrieve information concerning oral cancer. Therefore, it is sometimes extremely difficult to provide meaningful comparisons among various studies of oral cancer. Recently, a new ontological strategy that is rooted in consensus-based controlled vocabularies has been proposed to improve the consistency of data in dental research (Smith et al. in J Am Dent Assoc 141:1173-1175, 2010). In this paper, we analyzed the terminology dilemma on oral cancer and explained the current situation. We proposed a possible solution to the dilemma using an ontology-based approach. The advantages of applying this strategy are also discussed.
Affiliation(s)
- Jose Luis Tapia
- Department of Oral Diagnostic Sciences, School of Dental Medicine, University at Buffalo, 355 Squire Hall, 3435 Main Street, Buffalo, NY 14214 USA
- Louis J. Goldberg
- Department of Oral Diagnostic Sciences, School of Dental Medicine, University at Buffalo, 355 Squire Hall, 3435 Main Street, Buffalo, NY 14214 USA
|
18
|
Maojo V, Crespo J, García-Remesal M, de la Iglesia D, Perez-Rey D, Kulikowski C. Biomedical ontologies: toward scientific debate. Methods Inf Med 2011; 50:203-16. [PMID: 21431244 DOI: 10.3414/me10-05-0004] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 11/14/2010] [Accepted: 01/12/2011] [Indexed: 11/09/2022]
Abstract
OBJECTIVES Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. METHODS We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. RESULTS We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. CONCLUSIONS Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.
Affiliation(s)
- V Maojo
- Biomedical Informatics Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Boadilla del Monte, 28660 Madrid, Spain.
|
19
|
Smith B, Scheuermann RH. Ontologies for clinical and translational research: Introduction. J Biomed Inform 2011; 44:3-7. [PMID: 21241822 DOI: 10.1016/j.jbi.2011.01.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/30/2010] [Revised: 01/05/2011] [Accepted: 01/08/2011] [Indexed: 10/18/2022]
|