1
O. M. K. Biotechnical information systems for monitoring of chemicals in environment: biophysical approach. Biotechnologia Acta 2019. [DOI: 10.15407/biotech12.01.005]

2
Klyuchko OM. Electronic information systems for monitoring of populations and migrations of insects. Biotechnologia Acta 2018. [DOI: 10.15407/biotech11.05.005]

3
Klyuchko O. Technologies of brain images processing. Biotechnologia Acta 2017. [DOI: 10.15407/biotech10.06.005]

4

5
Legaz-García MDC, Miñarro-Giménez JA, Menárguez-Tortosa M, Fernández-Breis JT. Generation of open biomedical datasets through ontology-driven transformation and integration processes. J Biomed Semantics 2016; 7:32. [PMID: 27255189 PMCID: PMC4891880 DOI: 10.1186/s13326-016-0075-z]
Abstract
Background Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources, which makes the integrated exploitation of such data difficult. The Semantic Web paradigm offers a natural technological space for data integration and exploitation by generating content readable by machines. Linked Open Data is a Semantic Web initiative that promotes the publication and sharing of data in machine-readable semantic formats. Methods We present an approach for the transformation and integration of heterogeneous biomedical data with the objective of generating open biomedical datasets in Semantic Web formats. The transformation of the data is based on mappings between the entities of the data schema and the ontological infrastructure that provides meaning to the content. Our approach permits different types of mappings and includes the possibility of defining complex transformation patterns. Once the mappings are defined, they can be automatically applied to datasets to generate logically consistent content, and the mappings can be reused in further transformation processes. Results The results of our research are (1) a common transformation and integration process for heterogeneous biomedical data; (2) the application of Linked Open Data principles to generate interoperable, open biomedical datasets; and (3) a software tool, called SWIT, that implements the approach. In this paper we also describe how we have applied SWIT in different biomedical scenarios and some lessons learned. Conclusions We have presented an approach that is able to generate open biomedical repositories in Semantic Web formats. SWIT applies the Linked Open Data principles in the generation of the datasets, allowing their content to be linked to external repositories and thereby creating linked open datasets. SWIT datasets may contain data from multiple sources and schemas, thus becoming integrated datasets.
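As a rough illustration of the mapping-driven transformation described in this abstract (a sketch only, not the SWIT implementation; the namespaces, column names and class names are invented), a schema-to-ontology mapping can be applied row by row to emit RDF triples:

# A minimal sketch of ontology-driven transformation: entities of a tabular
# data schema are mapped to ontology classes/properties, and the mappings are
# applied to every row to generate RDF content. All names are illustrative.
from rdflib import Graph, Namespace, Literal, RDF, URIRef

EX = Namespace("http://example.org/onto#")      # hypothetical target ontology
DATA = Namespace("http://example.org/data/")    # hypothetical instance namespace

row_class = EX.Patient
column_mappings = {"age": EX.hasAge, "diagnosis": EX.hasDiagnosis}

def transform(rows):
    """Apply the schema-to-ontology mappings to every row, yielding an RDF graph."""
    g = Graph()
    for i, row in enumerate(rows):
        subject = URIRef(DATA[f"patient/{i}"])
        g.add((subject, RDF.type, row_class))
        for column, prop in column_mappings.items():
            if column in row:
                g.add((subject, prop, Literal(row[column])))
    return g

g = transform([{"age": 54, "diagnosis": "T2DM"}, {"age": 61, "diagnosis": "HTN"}])
print(g.serialize(format="turtle"))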
Affiliation(s)
- Marcos Menárguez-Tortosa, Departamento de Informática y Sistemas, Universidad de Murcia, IMIB-Arrixaca, Murcia, 30071, Spain

6
ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository. BMC Med Res Methodol 2016; 16:65. [PMID: 27245222 PMCID: PMC4888420 DOI: 10.1186/s12874-016-0164-9]
Abstract
BACKGROUND The volume and complexity of patient data - especially in personalised medicine - are steadily increasing, both regarding clinical data and genomic profiles: typically more than 1,000 items (e.g., laboratory values, vital signs, diagnostic tests etc.) are collected per patient in clinical trials. In oncology, hundreds of mutations can potentially be detected for each patient by genomic profiling. Therefore, data integration from multiple sources constitutes a key challenge for medical research and healthcare. METHODS Semantic annotation of data elements can facilitate the identification of matching data elements in different sources and thereby supports data integration. Millions of different annotations are required due to the semantic richness of patient data. These annotations should be uniform, i.e., two matching data elements shall contain the same annotations. However, large terminologies like SNOMED CT or UMLS do not provide uniform coding. It is proposed to develop semantic annotations of medical data elements based on a large-scale public metadata repository. To achieve uniform codes, semantic annotations shall be re-used if a matching data element is available in the metadata repository. RESULTS A web-based tool called ODMedit (https://odmeditor.uni-muenster.de/) was developed to create data models with uniform semantic annotations. It contains ~800,000 terms with semantic annotations which were derived from ~5,800 models from the portal of medical data models (MDM). The tool was successfully applied to manually annotate 22 forms with 292 data items from CDISC and to update 1,495 data models of the MDM portal. CONCLUSION Uniform manual semantic annotation of data models is feasible in principle, but requires a large-scale collaborative effort due to the semantic richness of patient data. A web-based tool for these annotations is available, which is linked to a public metadata repository.
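A minimal sketch of the re-use rule described above (not the ODMedit code; the repository content and UMLS codes below are illustrative assumptions): before annotating a data element, look for a matching element in the metadata repository and copy its codes so matching elements stay uniform.

# Reuse existing semantic annotations when a matching data element is found;
# register the new codes only for a first occurrence. All codes are examples.
repository = {
    "systolic blood pressure": ["UMLS:C0871470"],
    "body weight": ["UMLS:C0005910"],
}

def annotate(element_name, fallback_codes):
    """Return uniform codes: reuse stored ones if the element already exists."""
    key = element_name.strip().lower()
    if key in repository:
        return repository[key]            # uniform: same codes as the stored element
    repository[key] = fallback_codes      # first occurrence: register the new codes
    return fallback_codes

print(annotate("Systolic Blood Pressure", ["UMLS:C9999999"]))  # reused stored code
print(annotate("heart rate", ["UMLS:C0018810"]))                # newly registered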

7
Liaw ST, Taggart J, Yu H, de Lusignan S, Kuziemsky C, Hayen A. Integrating electronic health record information to support integrated care: practical application of ontologies to improve the accuracy of diabetes disease registers. J Biomed Inform 2014; 52:364-72. [PMID: 25089026 DOI: 10.1016/j.jbi.2014.07.016]
Abstract
BACKGROUND Information in Electronic Health Records (EHRs) is being promoted for use in clinical decision support, patient registers, measurement and improvement of integration and quality of care, and translational research. To do this, creators of EHR-derived data products need to logically integrate patient data with information and knowledge from diverse sources and contexts. OBJECTIVE To examine the accuracy of an ontological multi-attribute approach to create a Type 2 Diabetes Mellitus (T2DM) register to support integrated care. METHODS Guided by Australian best practice guidelines, the T2DM diagnosis and management ontology was conceptualized, contextualized and validated by clinicians; it was then specified, formalized and implemented. The algorithm was standardized against the domain ontology in SNOMED CT-AU. Accuracy of the implementation was measured in 4 datasets of varying sizes (927-12,057 patients) and an integrated dataset (23,793 patients). Results were cross-checked with sensitivity and specificity calculated with 95% confidence intervals. RESULTS Incrementally integrating Reason for Visit (RFV), medication (Rx), and pathology in the algorithm identified nearly 100% of T2DM cases. Incrementally integrating the four datasets improved accuracy, controlling for sample size, data incompleteness and duplicates. Manual validation confirmed the accuracy of the algorithm. CONCLUSION Integrating multiple data elements within an EHR using ontology-based case-finding algorithms can improve the accuracy of the diagnosis and compensate for suboptimal data quality, hence creating a dataset that is more fit-for-purpose. This clinical and pragmatic application of ontologies to EHR data improves the integration of data and the potential for better use of data to improve the quality of care.
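The multi-attribute case-finding idea can be sketched as follows (an illustration only, not the published algorithm; the code lists and HbA1c threshold are assumptions): a patient is flagged when the reason-for-visit, medication or pathology attribute meets a criterion.

# A toy multi-attribute T2DM case-finding rule combining RFV, Rx and pathology.
T2DM_RFV_CODES = {"44054006"}            # SNOMED CT: type 2 diabetes mellitus
T2DM_DRUGS = {"metformin", "gliclazide"}
HBA1C_THRESHOLD = 6.5                    # percent; commonly used diagnostic cut-off

def is_t2dm_case(patient):
    """Return True when any of the three attributes supports a T2DM diagnosis."""
    by_rfv = bool(T2DM_RFV_CODES & set(patient.get("rfv_codes", [])))
    by_rx = bool(T2DM_DRUGS & {d.lower() for d in patient.get("medications", [])})
    by_path = patient.get("hba1c", 0.0) >= HBA1C_THRESHOLD
    return by_rfv or by_rx or by_path

print(is_t2dm_case({"rfv_codes": [], "medications": ["Metformin"], "hba1c": 6.1}))  # True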
Affiliation(s)
- Siaw-Teng Liaw, School of Public Health and Community Medicine, UNSW Medicine, Sydney, Australia; Centre for PHC & Equity, UNSW Medicine, Sydney, Australia; Academic General Practice Unit, South Western Sydney Local Health District, NSW, Australia
- Jane Taggart, Centre for PHC & Equity, UNSW Medicine, Sydney, Australia
- Hairong Yu, Centre for PHC & Equity, UNSW Medicine, Sydney, Australia
- Craig Kuziemsky, Telfer School of Management, University of Ottawa, Ottawa, Canada
- Andrew Hayen, School of Public Health and Community Medicine, UNSW Medicine, Sydney, Australia

8
Rahimi A, Liaw ST, Taggart J, Ray P, Yu H. Validating an ontology-based algorithm to identify patients with type 2 diabetes mellitus in electronic health records. Int J Med Inform 2014; 83:768-78. [PMID: 25011429 DOI: 10.1016/j.ijmedinf.2014.06.002]
Abstract
BACKGROUND Improving healthcare for people with chronic conditions requires clinical information systems that support integrated care and information exchange, emphasizing a semantic approach to support multiple and disparate Electronic Health Records (EHRs). Using a literature review, the Australian National Guidelines for Type 2 Diabetes Mellitus (T2DM), SNOMED-CT-AU and input from health professionals, we developed a Diabetes Mellitus Ontology (DMO) to diagnose and manage patients with diabetes. This paper describes the manual validation of the DMO-based approach using real-world EHR data from a general practice (n=908 active patients) participating in the electronic Practice Based Research Network (ePBRN). METHOD The DMO-based algorithm to query, using the SPARQL Protocol and RDF Query Language (SPARQL), the structured fields in the ePBRN data repository was iteratively tested and refined. The accuracy of the final DMO-based algorithm was validated with a manual audit of the general practice EHR. Contingency tables were prepared, and the Sensitivity and Specificity (accuracy) of the algorithm to diagnose T2DM were measured, using the T2DM cases found by manual EHR audit as the gold standard. Accuracy was determined with three attributes - reason for visit (RFV), medication (Rx) and pathology (path) - singly and in combination. RESULTS The Sensitivity and Specificity of the algorithm were 100% and 99.88% with RFV; 96.55% and 98.97% with Rx; and 15.6% and 98.92% with Path. This suggests that Rx and Path data were not as complete or correct as the RFV for this general practice, which kept its RFV information complete and current for diabetes. However, the completeness is good enough for this purpose, as confirmed by the very small relative deterioration of the accuracy (Sensitivity and Specificity of 97.67% and 99.18%) when calculated for the combination of RFV, Rx and Path. The manual EHR audit suggested that the accuracy of the algorithm was influenced by data quality issues such as incorrect data due to mistaken units of measurement; data unavailable because it was not documented, or was documented in the wrong place or in progress notes; and problems with data extraction, encryption and data management. CONCLUSION This DMO-based algorithm is sufficiently accurate to support a semantic approach, using the RFV, Rx and Path to define patients with T2DM from EHR data. However, the accuracy can be compromised by incomplete or incorrect data. The extent of compromise requires further study, using ontology-based and other approaches.
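For reference, the sensitivity and specificity figures quoted above follow the standard contingency-table definitions (shown here for clarity; not reproduced from the paper), with the manual EHR audit as the gold standard:

% TP, FP, TN, FN denote true/false positives/negatives against the audit.
\[
\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\mathrm{Specificity} = \frac{TN}{TN + FP}
\]
% For example, with illustrative counts, an algorithm finding 84 of 87
% audit-confirmed cases has sensitivity 84/87 \approx 96.6\%.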
Affiliation(s)
- Alireza Rahimi, UNSW, School of Public Health & Community Medicine, Sydney, Australia; Isfahan University of Medical Sciences, Health Information Research Centre, Isfahan, Iran; UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia
- Siaw-Teng Liaw, UNSW, School of Public Health & Community Medicine, Sydney, Australia; UNSW, Centre for Primary Health Care & Equity, Sydney, Australia; General Practice Unit, South Western Sydney Local Health District
- Jane Taggart, UNSW, Centre for Primary Health Care & Equity, Sydney, Australia
- Pradeep Ray, UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia
- Hairong Yu, UNSW, Centre for Primary Health Care & Equity, Sydney, Australia

9
Piovesan L, Molino G, Terenziani P. An ontological knowledge and multiple abstraction level decision support system in healthcare. 2014. [DOI: 10.1186/2193-8636-1-8]

10
Anguita A, García-Remesal M, de la Iglesia D, Graf N, Maojo V. Toward a view-oriented approach for aligning RDF-based biomedical repositories. Methods Inf Med 2014; 54:50-5. [PMID: 24777240 DOI: 10.3414/me13-02-0020]
Abstract
INTRODUCTION This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". BACKGROUND The need for complementary access to multiple RDF databases has fostered new lines of research, but also entailed new challenges due to data representation disparities. While several approaches for RDF-based database integration have been proposed, those focused on schema alignment have become the most widely adopted. All state-of-the-art solutions for aligning RDF-based sources resort to a simple technique inherited from legacy relational database integration methods. This technique - known as element-to-element (e2e) mappings - is based on establishing 1:1 mappings between single primitive elements - e.g. concepts, attributes, relationships, etc. - belonging to the source and target schemas. However, due to the intrinsic nature of RDF - a representation language based on defining tuples < subject, predicate, object > -, one may find RDF elements whose semantics vary dramatically when combined into a view involving other RDF elements - i.e. they depend on their context. The latter cannot be adequately represented in the target schema by resorting to the traditional e2e approach. These approaches fail to properly address this issue without explicitly modifying the target ontology, thus lacking the required expressiveness for properly reflecting the intended semantics in the alignment information. OBJECTIVES To enhance existing RDF schema alignment techniques by providing a mechanism to properly represent elements with context-dependent semantics, thus enabling users to perform more expressive alignments, including scenarios that cannot be adequately addressed by the existing approaches. METHODS Instead of establishing 1:1 correspondences between single primitive elements of the schemas, we propose adopting a view-based approach. The latter is targeted at establishing mapping relationships between RDF subgraphs - that can be regarded as the equivalent of views in traditional databases -, rather than between single schema elements. This approach enables users to represent scenarios defined by context-dependent RDF elements that cannot be properly represented when adopting the currently existing approaches. RESULTS We developed a software tool implementing our view-based strategy. Our tool is currently being used in the context of the European Commission funded p-medicine project, targeted at creating a technological framework to integrate clinical and genomic data to facilitate the development of personalized drugs and therapies for cancer, based on the genetic profile of the patient. We used our tool to integrate different RDF-based databases - including different repositories of clinical trials and DICOM images - using the Health Data Ontology Trunk (HDOT) ontology as the target schema. CONCLUSIONS The importance of database integration methods and tools in the context of biomedical research has been widely recognized. Modern research in this area - e.g. identification of disease biomarkers, or design of personalized therapies - heavily relies on the availability of a technical framework to enable researchers to uniformly access disparate repositories. We present a method and a tool that implement a novel alignment method specifically designed to support and enhance the integration of RDF-based data sources at schema (metadata) level. 
This approach provides an increased level of expressiveness compared to other existing solutions, and allows solving heterogeneity scenarios that cannot be properly represented using other state-of-the-art techniques.
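A sketch of the view-based idea under invented namespaces and terms (this is not the authors' tool): an RDF subgraph whose meaning is context-dependent is mapped to a target-ontology element with a SPARQL CONSTRUCT query, rather than by a 1:1 element-to-element mapping.

# The pair (code "glucose", context Fasting) only means "fasting glucose test"
# when both triples are present, so the mapping is defined over the whole view.
from rdflib import Graph

source = Graph()
source.parse(data="""
@prefix src: <http://example.org/src#> .
src:obs1 a src:Observation ; src:code "glucose" ; src:context src:Fasting .
src:obs2 a src:Observation ; src:code "glucose" ; src:context src:Random .
""", format="turtle")

view_mapping = """
PREFIX src: <http://example.org/src#>
PREFIX tgt: <http://example.org/tgt#>
CONSTRUCT { ?obs a tgt:FastingGlucoseTest . }
WHERE     { ?obs a src:Observation ; src:code "glucose" ; src:context src:Fasting . }
"""

target = source.query(view_mapping).graph   # only obs1 matches the view
print(target.serialize(format="turtle"))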
Affiliation(s)
- Alberto Anguita, Group of Biomedical Informatics, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Spain

11
Rahimi A, Liaw ST, Ray P, Taggart J, Yu H. Ontological specification of quality of chronic disease data in EHRs to support decision analytics: a realist review. 2014. [DOI: 10.1186/2193-8636-1-5]
Abstract
This systematic review examined the current state of conceptualization and specification of data quality and the role of ontology-based approaches to develop data quality based on "fitness for purpose" within the health context. A literature review was conducted of all English language studies, from January 2000 to March 2013, which addressed data/information quality and fitness for purpose of data, and which used and implemented ontology-based approaches. Included papers were critically appraised with a "context-mechanism-impacts/outcomes" overlay. We screened 315 papers, excluded 36 duplicates, 182 on abstract review and 46 on full-text review, leaving 52 papers for critical appraisal. Six papers conceptualized data quality within the "fitness for purpose" definition. While most agree with a multidimensional definition of DQ, there is little consensus on a conceptual framework. We found no reports of systematic and comprehensive ontological approaches to DQ based on fitness for purpose or use. However, 16 papers used ontology-specified implementations in DQ improvement, with most of them focusing on some dimensions of DQ such as completeness, accuracy, correctness, consistency and timeliness. The majority of papers described the processes of the development of DQ in various information systems. There were few evaluative studies - including any comparing ontological with non-ontological approaches - of the assessment of clinical data quality and the performance of the application.

12
Miyazaki FA, Guardia GDA, Vêncio RZN, de Farias CRG. Semantic integration of gene expression analysis tools and data sources using software connectors. BMC Genomics 2013; 14 Suppl 6:S2. [PMID: 24341380 PMCID: PMC3908368 DOI: 10.1186/1471-2164-14-s6-s2]
Abstract
BACKGROUND The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. RESULTS We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. CONCLUSIONS The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and correct interpretation of the exchanged information.
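A minimal connector sketch in the spirit of the methodology described above (not the authors' code; the probe-to-gene mapping and record format are invented): the connector mediates between a data source and an analysis tool, applying a transformation rule so exchanged records carry terms the receiving tool understands.

# A toy software connector that rewrites source records for the target tool.
from dataclasses import dataclass

PROBE_TO_GENE = {"probe_0001": "TP53", "probe_0002": "BRCA1"}   # transformation rule

@dataclass
class ExpressionRecord:
    identifier: str
    value: float

class Connector:
    """Fetches records from a source and rewrites them for the target tool."""
    def __init__(self, source_records):
        self.source_records = source_records

    def deliver(self):
        for rec in self.source_records:
            gene = PROBE_TO_GENE.get(rec.identifier)
            if gene is not None:                      # drop unmappable records
                yield ExpressionRecord(gene, rec.value)

records = [ExpressionRecord("probe_0001", 7.2), ExpressionRecord("probe_9999", 1.1)]
print(list(Connector(records).deliver()))             # only the mappable record remains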
Affiliation(s)
- Flávia A Miyazaki, Gabriela DA Guardia, Ricardo ZN Vêncio, Cléver RG de Farias: Department of Computer Science and Mathematics (DCM/FFCLRP), University of São Paulo (USP), Av. Bandeirantes 3900, Monte Alegre, Ribeirão Preto, SP, 14040-901, Brazil

13
Anguita A, Martin L, Garcia-Remesal M, Maojo V. RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources. Comput Methods Programs Biomed 2013; 111:220-7. [PMID: 23669178 DOI: 10.1016/j.cmpb.2013.04.009]
Abstract
This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database.
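The query-result caching mentioned above could look roughly like this (a sketch under assumptions, not the RDFBuilder source; run_query is a stand-in backend): results of previously executed SPARQL queries are kept in a cache keyed by the normalized query text, so repeated queries are answered without touching the underlying database.

import hashlib

_cache = {}

def run_query(sparql):
    """Stand-in for the expensive call to the underlying data source."""
    return f"results for: {sparql}"

def cached_query(sparql):
    key = hashlib.sha256(" ".join(sparql.split()).encode()).hexdigest()
    if key not in _cache:                 # miss: execute and remember
        _cache[key] = run_query(sparql)
    return _cache[key]                    # hit: reuse the stored result

q = "SELECT ?e WHERE { ?e a <http://example.org/Experiment> }"
cached_query(q)          # executes the query
cached_query(q)          # served from the cache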
Affiliation(s)
- Alberto Anguita, Biomedical Informatics Group, Artificial Intelligence Laboratory, School of Computer Science, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, 28660 Boadilla del Monte, Madrid, Spain

14
Liaw ST, Rahimi A, Ray P, Taggart J, Dennis S, de Lusignan S, Jalaludin B, Yeo AET, Talaei-Khoei A. Towards an ontology for data quality in integrated chronic disease management: a realist review of the literature. Int J Med Inform 2012; 82:10-24. [PMID: 23122633 DOI: 10.1016/j.ijmedinf.2012.10.001]
Abstract
PURPOSE Effective use of routine data to support integrated chronic disease management (CDM) and population health is dependent on underlying data quality (DQ) and, for cross-system use of data, semantic interoperability. An ontological approach to DQ is a potential solution but research in this area is limited and fragmented. OBJECTIVE Identify mechanisms, including ontologies, to manage DQ in integrated CDM and whether improved DQ will better measure health outcomes. METHODS A realist review of English language studies (January 2001-March 2011) which addressed data quality, used ontology-based approaches and were relevant to CDM. RESULTS We screened 245 papers, excluded 26 duplicates, 135 on abstract review and 31 on full-text review, leaving 61 papers for critical appraisal. Of the 33 papers that examined ontologies in chronic disease management, 13 defined data quality and 15 used ontologies for DQ. Most saw DQ as a multidimensional construct, the most used dimensions being completeness, accuracy, correctness, consistency and timeliness. The majority of studies reported tool design and development (80%), implementation (23%), and descriptive evaluations (15%). Ontological approaches were used to address semantic interoperability, decision support, flexibility of information management and integration/linkage, and complexity of information models. CONCLUSION DQ lacks a consensus conceptual framework and definition. DQ and ontological research are relatively immature, with few rigorous evaluation studies published. Ontology-based applications could support automated processes to address DQ and semantic interoperability in repositories of routinely collected data to deliver integrated CDM. We advocate moving to ontology-based design of information systems to enable more reliable use of routine data to measure health mechanisms and impacts.
Affiliation(s)
- S T Liaw, University of NSW School of Public Health & Community Medicine, Sydney, Australia

15
Maojo V, Crespo J, García-Remesal M, de la Iglesia D, Perez-Rey D, Kulikowski C. Biomedical ontologies: toward scientific debate. Methods Inf Med 2011; 50:203-16. [PMID: 21431244 DOI: 10.3414/me10-05-0004]
Abstract
OBJECTIVES Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. METHODS We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. RESULTS We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. CONCLUSIONS Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.
Affiliation(s)
- V Maojo, Biomedical Informatics Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Boadilla del Monte, 28660 Madrid, Spain

16
Brochhausen M, Spear AD, Cocos C, Weiler G, Martín L, Anguita A, Stenzhorn H, Daskalaki E, Schera F, Schwarz U, Sfakianakis S, Kiefer S, Dörr M, Graf N, Tsiknakis M. The ACGT Master Ontology and its applications - towards an ontology-driven cancer research and management system. J Biomed Inform 2011; 44:8-25. [PMID: 20438862 PMCID: PMC5755590 DOI: 10.1016/j.jbi.2010.04.008]
Abstract
OBJECTIVE This paper introduces the objectives, methods and results of ontology development in the EU co-funded project Advancing Clinico-genomic Trials on Cancer-Open Grid Services for Improving Medical Knowledge Discovery (ACGT). While the available data in the life sciences has recently grown both in amount and quality, the full exploitation of it is being hindered by the use of different underlying technologies, coding systems, category schemes and reporting methods on the part of different research groups. The goal of the ACGT project is to contribute to the resolution of these problems by developing an ontology-driven, semantic grid services infrastructure that will enable efficient execution of discovery-driven scientific workflows in the context of multi-centric, post-genomic clinical trials. The focus of the present paper is the ACGT Master Ontology (MO). METHODS ACGT project researchers undertook a systematic review of existing domain and upper-level ontologies, as well as of existing ontology design software, implementation methods, and end-user interfaces. This included the careful study of best practices, design principles and evaluation methods for ontology design, maintenance, implementation, and versioning, as well as for use on the part of domain experts and clinicians. RESULTS To date, the results of the ACGT project include (i) the development of a master ontology (the ACGT-MO) based on clearly defined principles of ontology development and evaluation; (ii) the development of a technical infrastructure (the ACGT Platform) that implements the ACGT-MO utilizing independent tools, components and resources that have been developed based on open architectural standards, and which includes an application updating and evolving the ontology efficiently in response to end-user needs; and (iii) the development of an Ontology-based Trial Management Application (ObTiMA) that integrates the ACGT-MO into the design process of clinical trials in order to guarantee automatic semantic integration without the need to perform a separate mapping process.
Affiliation(s)
- Mathias Brochhausen, Institute of Formal Ontology and Medical Information Science, Saarland University, P.O. Box 15 11 50, 66041 Saarbrücken, Germany

17
Alonso-Calvo R, Crespo J, García-Remesal M, Anguita A, Maojo V. On distributing load in cloud computing: a real application for very-large image datasets. Procedia Computer Science 2010. [DOI: 10.1016/j.procs.2010.04.300]

18
Maojo V, Martin-Sanchez F, Kulikowski C, Rodriguez-Paton A, Fritts M. Nanoinformatics and DNA-based computing: catalyzing nanomedicine. Pediatr Res 2010; 67:481-9. [PMID: 20118825 DOI: 10.1203/pdr.0b013e3181d6245e]
Abstract
Five decades of research and practical application of computers in biomedicine has given rise to the discipline of medical informatics, which has made many advances in genomic and translational medicine possible. Developments in nanotechnology are opening up the prospects for nanomedicine and regenerative medicine where informatics and DNA computing can become the catalysts enabling health care applications at sub-molecular or atomic scales. Although nanomedicine promises a new exciting frontier for clinical practice and biomedical research, issues involving cost-effectiveness studies, clinical trials and toxicity assays, drug delivery methods, and the implementation of new personalized therapies still remain challenging. Nanoinformatics can accelerate the introduction of nano-related research and applications into clinical practice, leading to an area that could be called "translational nanoinformatics." At the same time, DNA and RNA computing presents an entirely novel paradigm for computation. Nanoinformatics and DNA-based computing are together likely to completely change the way we model and process information in biomedicine and impact the emerging field of nanomedicine most strongly. In this article, we review work in nanoinformatics and DNA (and RNA)-based computing, including applications in nanopediatrics. We analyze their scientific foundations, current research and projects, envisioned applications and potential problems that might arise from them.
Affiliation(s)
- Victor Maojo, Departamento de Inteligencia Artificial, Universidad Politecnica de Madrid, Madrid 28660, Spain

19
Colombo G, Merico D, Boncoraglio G, De Paoli F, Ellul J, Frisoni G, Nagy Z, van der Lugt A, Vassányi I, Antoniotti M. An ontological modeling approach to cerebrovascular disease studies: the NEUROWEB case. J Biomed Inform 2010; 43:469-84. [PMID: 20074662 DOI: 10.1016/j.jbi.2009.12.005]
Abstract
The NEUROWEB project supports cerebrovascular researchers' association studies, intended as the search for statistical correlations between a feature (e.g., a genotype) and a phenotype. In this project the phenotype refers to the patients' pathological state, and thus it is formulated on the basis of the clinical data collected during the diagnostic activity. In order to enhance the statistical robustness of the association inquiries, the project involves four European Union clinical institutions. Each institution provides its proprietary repository, storing patients' data. Although all sites comply with common diagnostic guidelines, they also adopt specific protocols, resulting in partially discrepant repository contents. Therefore, in order to effectively exploit NEUROWEB data for association studies, it is necessary to provide a framework for the phenotype formulation, grounded on the clinical repository content which explicitly addresses the inherent integration problem. To that end, we developed an ontological model for cerebrovascular phenotypes, the NEUROWEB Reference Ontology, composed of three layers. The top-layer (Top Phenotypes) is an expert-based cerebrovascular disease taxonomy. The middle-layer deconstructs the Top Phenotypes into more elementary phenotypes (Low Phenotypes) and general-use medical concepts such as anatomical parts and topological concepts. The bottom-layer (Core Data Set, or CDS) comprises the clinical indicators required for cerebrovascular disorder diagnosis. Low Phenotypes are connected to the bottom-layer (CDS) by specifying what combination of CDS values is required for their existence. Finally, CDS elements are mapped to the local repositories of clinical data. The NEUROWEB system exploits the Reference Ontology to query the different repositories and to retrieve patients characterized by a common phenotype.
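The layered definition of phenotypes in terms of Core Data Set values can be sketched as follows (illustrative only; the indicator names, values and threshold are assumptions, not taken from the NEUROWEB ontology): a Low Phenotype is a required combination of CDS indicator values, and patients are retrieved by checking that combination against repository data.

# A toy "Low Phenotype" defined as a combination of CDS indicator conditions.
LOW_PHENOTYPES = {
    "large_artery_atherosclerosis": {
        "carotid_stenosis_pct": lambda v: v >= 50,
        "infarct_topology": lambda v: v == "cortical",
    },
}

def has_phenotype(patient_cds, phenotype):
    """True when every CDS condition of the phenotype holds for the patient."""
    conditions = LOW_PHENOTYPES[phenotype]
    return all(name in patient_cds and test(patient_cds[name])
               for name, test in conditions.items())

patient = {"carotid_stenosis_pct": 70, "infarct_topology": "cortical"}
print(has_phenotype(patient, "large_artery_atherosclerosis"))   # True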
Affiliation(s)
- Gianluca Colombo, Dipartimento di Informatica, Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano Bicocca, U14 Viale Sarca 336, I-20126 Milan, Italy

20
Mesiti M, Jiménez-Ruiz E, Sanz I, Berlanga-Llavori R, Perlasca P, Valentini G, Manset D. XML-based approaches for the integration of heterogeneous bio-molecular data. BMC Bioinformatics 2009; 10 Suppl 12:S7. [PMID: 19828083 PMCID: PMC2762072 DOI: 10.1186/1471-2105-10-s12-s7]
Abstract
Background Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. Results In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new and interesting cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, thus making effective integration of bioinformatics data schemes difficult. The adoption of a few semantically rich standard formats is urgently needed to achieve seamless integration of the current biological resources.
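As a toy illustration of the kind of XML-based integration the survey covers (both formats and tag names are invented, not drawn from the paper), heterogeneous records can be parsed into one common representation:

import xml.etree.ElementTree as ET

source_a = "<protein><name>TP53</name><organism>Homo sapiens</organism></protein>"
source_b = "<entry id='BRCA1' species='Homo sapiens'/>"

def from_format_a(xml_text):
    root = ET.fromstring(xml_text)
    return {"symbol": root.findtext("name"), "organism": root.findtext("organism")}

def from_format_b(xml_text):
    root = ET.fromstring(xml_text)
    return {"symbol": root.get("id"), "organism": root.get("species")}

integrated = [from_format_a(source_a), from_format_b(source_b)]
print(integrated)   # both records now share one schema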
Affiliation(s)
- Marco Mesiti, Università degli Studi di Milano, Via Comelico 39, Milan, Italy

21
Wehbe FH, Brown SH, Massion PP, Gadd CS, Masys DR, Aliferis CF. A novel information retrieval model for high-throughput molecular medicine modalities. Cancer Inform 2009; 8:1-17. [PMID: 19458790 PMCID: PMC2664697 DOI: 10.4137/cin.s964]
Abstract
Significant research has been devoted to predicting diagnosis, prognosis, and response to treatment using high-throughput assays. Rapid translation into clinical results hinges upon efficient access to up-to-date and high-quality molecular medicine modalities. We first explain why this goal is inadequately supported by existing databases and portals and then introduce a novel semantic indexing and information retrieval model for clinical bioinformatics. The formalism provides the means for indexing a variety of relevant objects (e.g. papers, algorithms, signatures, datasets) and includes a model of the research processes that creates and validates these objects in order to support their systematic presentation once retrieved. We test the applicability of the model by constructing proof-of-concept encodings and visual presentations of evidence and modalities in molecular profiling and prognosis of: (a) diffuse large B-cell lymphoma (DLBCL) and (b) breast cancer.
Affiliation(s)
- Firas H Wehbe, Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA

22
23
Gómez-López G, Valencia A. Bioinformatics and cancer research: building bridges for translational research. Clin Transl Oncol 2008; 10:85-95. [DOI: 10.1007/s12094-008-0161-5]

24
Abstract
Ontologies play a key role in the advent of the Semantic Web. An important problem when dealing with ontologies is the modification of an existing ontology in response to a certain need for change. This problem is a complex and multifaceted one, because it can take several different forms and includes several related subproblems, like heterogeneity resolution or keeping track of ontology versions. As a result, it is being addressed by several different, but closely related and often overlapping research disciplines. Unfortunately, the boundaries of each such discipline are not clear, as the same term is often used with different meanings in the relevant literature, creating a certain amount of confusion. The purpose of this paper is to identify the exact relationships between these research areas and to determine the boundaries of each field, by performing a broad review of the relevant literature.

25
Masseroli M. Management and analysis of genomic functional and phenotypic controlled annotations to support biomedical investigation and practice. IEEE Trans Inf Technol Biomed 2007; 11:376-85. [PMID: 17674620 DOI: 10.1109/titb.2006.884367]
Abstract
The growing amount of available genomic information provides new opportunities for novel research approaches and original biomedical applications that can provide effective data management and analysis support. In fact, integration and comprehensive evaluation of available controlled data can highlight information patterns leading to unveil new biomedical knowledge. Here, we describe Genome Function INtegrated Discoverer (GFINDer), a Web-accessible three-tier multidatabase system we developed to automatically enrich lists of user-classified genes with several functional and phenotypic controlled annotations, and to statistically evaluate them in order to identify annotation categories significantly over- or underrepresented in each considered gene class. Genomic controlled annotations from Gene Ontology (GO), KEGG, Pfam, InterPro, and Online Mendelian Inheritance in Man (OMIM) were integrated in GFINDer and several categorical tests were implemented for their analysis. A controlled vocabulary of inherited disorder phenotypes was obtained by normalizing and hierarchically structuring disease accompanying signs and symptoms from OMIM Clinical Synopsis sections. The GFINDer modular architecture is well suited for further system expansion and for sustaining increasing workload. Testing results showed that GFINDer analyses can highlight gene functional and phenotypic characteristics and differences, demonstrating its value in supporting genomic biomedical approaches aiming at understanding the complex biomolecular mechanisms underlying patho-physiological phenotypes, and in helping the transfer of genomic results to medical practice.
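The over- or under-representation analysis described above can be sketched with a hypergeometric tail test (an illustration, not the GFINDer implementation; the counts are invented): it estimates whether an annotation category is over-represented in a user-defined gene class relative to all considered genes.

from math import comb

def hypergeom_over_pvalue(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / comb(N, n)

# 12 of the 40 genes in the class are annotated with the category,
# versus 50 annotated genes among 1,000 considered genes overall.
p = hypergeom_over_pvalue(k=12, K=50, n=40, N=1000)
print(f"over-representation p-value: {p:.3e}")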
Affiliation(s)
- Marco Masseroli, BioMedical Informatics Laboratory, Dipartimento di Bioingegneria, Politecnico di Milano, I-20133 Milan, Italy

26
Maojo V, Tsiknakis M. Biomedical informatics and healthGRIDs: a European perspective. IEEE Eng Med Biol Mag 2007; 26:34-41. [PMID: 17549918 DOI: 10.1109/memb.2007.364927]
Affiliation(s)
- Victor Maojo, Biomedical Informatics Group, Artificial Intelligence Laboratory, School of Computer Science, Universidad Politecnica de Madrid, Spain

27
Interface analysis between GSVML and HL7 version 3. J Biomed Inform 2007; 40:527-38. [PMID: 17293166 DOI: 10.1016/j.jbi.2006.12.006]
Abstract
In order to realize gene-based medicine, a number of key challenges must be overcome. Construction of infrastructure capable of integrating genetic and clinical information is one of those challenges. The Genomic Sequence Variation Markup Language (GSVML) and the Health Level Seven Version 3 (HL7v3) are important electronic data exchange standards for clinical genome infrastructure, and compatibility between these two standards will promote the above integration. In this study, we analyzed the interface between GSVML and HL7v3, primarily for the Clinical Genomics domain, from the viewpoint of GSVML, and were able to create a blueprint for a functional interface between GSVML and HL7v3. We expect that these analytical results will help accelerate the realization of gene-based medicine.

28
Abstract
In recent years, as a knowledge-based discipline, bioinformatics has been made more computationally amenable. After its beginnings as a technology advocated by computer scientists to overcome problems of heterogeneity, ontology has been taken up by biologists themselves as a means to consistently annotate features from genotype to phenotype. In medical informatics, artifacts called ontologies have been used for a longer period of time to produce controlled lexicons for coding schemes. In this article, we review the current position in ontologies and how they have become institutionalized within biomedicine. As the field has matured, the much older philosophical aspects of ontology have come into play. With this and the institutionalization of ontology has come greater formality. We review this trend and what benefits it might bring to ontologies and their use within biomedicine.