1
Yu C, Zong H, Chen Y, Zhou Y, Liu X, Lin Y, Li J, Zheng X, Min H, Shen B. PCAO2: an ontology for integration of prostate cancer associated genotypic, phenotypic and lifestyle data. Brief Bioinform 2024; 25:bbae136. [PMID: 38557678 PMCID: PMC10982949 DOI: 10.1093/bib/bbae136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 12/19/2023] [Accepted: 03/07/2024] [Indexed: 04/04/2024] Open
Abstract
Disease ontologies facilitate the semantic organization and representation of domain-specific knowledge. In the case of prostate cancer (PCa), large volumes of research results and clinical data have accumulated and need to be standardized for sharing and translational research. A formal representation of PCa-associated knowledge will be essential to standardizing and sharing diverse data and to future knowledge graph extraction, deep phenotyping and the development of explainable artificial intelligence. In this study, we constructed an updated PCa ontology (PCAO2) based on the ontology development life cycle. An online information retrieval system was designed to ensure the usability of the ontology. The PCAO2, with a subclass-based taxonomic hierarchy, covers the major biomedical concepts for PCa-associated genotypic, phenotypic and lifestyle data. The current version of the PCAO2 contains 633 concepts organized under three biomedical viewpoints, namely, epidemiology, diagnosis and treatment. These concepts are enriched by the addition of definitions, synonyms, relationships and references. For precision diagnosis and treatment, the PCa-associated genes and lifestyles are integrated in the epidemiological viewpoint of PCa. PCAO2 provides a standardized and systematized semantic framework for studying large amounts of heterogeneous PCa data and knowledge, which can be further edited and enriched by the scientific community. The PCAO2 is freely available at https://bioportal.bioontology.org/ontologies/PCAO, http://pcaontology.net/ and http://pcaontology.net/mobile/.
Affiliation(s)
- Chunjiang Yu
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
  - School of Artificial Intelligence, Suzhou Industrial Park Institute of Services Outsourcing, Suzhou, 215123, China
  - Center for Systems Biology, Soochow University, Suzhou, 215006, China
- Hui Zong
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
- Yalan Chen
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
  - Center for Systems Biology, Soochow University, Suzhou, 215006, China
  - Department of Medical Informatics, School of Medicine, Nantong University, Nantong, 226001, China
- Yibin Zhou
  - Department of Urology, The Second Affiliated Hospital of Soochow University, Suzhou, 215011, China
- Xingyun Liu
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
- Yuxin Lin
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215000, China
- Jiakun Li
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
- Xiaonan Zheng
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
- Hua Min
  - Department of Health Administration and Policy, George Mason University, Fairfax, VA, USA
- Bairong Shen
  - Department of Urology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610041, China
2
Hernández L, Estévez-Priego E, López-Pérez L, Fernanda Cabrera-Umpiérrez M, Arredondo MT, Fico G. HeNeCOn: An ontology for integrative research in Head and Neck cancer. Int J Med Inform 2024; 181:105284. [PMID: 37981440 DOI: 10.1016/j.ijmedinf.2023.105284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 07/14/2023] [Accepted: 11/01/2023] [Indexed: 11/21/2023]
Abstract
BACKGROUND: Head and Neck Cancer (HNC) has a high incidence and prevalence in the worldwide population. The broad terminology associated with these diseases and their multimodality treatments generates large amounts of heterogeneous clinical data, which motivates the construction of a high-quality harmonization model to standardize this multi-source clinical data in terms of format and semantics. The use of ontologies and semantic techniques is a well-known approach to address this challenge. OBJECTIVE: This work aims to provide a clinically reliable data model for HNC processes during all phases of the disease: prognosis, treatment, and follow-up. Therefore, we built the first ontology specifically focused on the HNC domain, named HeNeCOn (Head and Neck Cancer Ontology). METHODS: First, an annotated dataset was established to provide a formal reference description of HNC. Then, 170 clinical variables were organized into a taxonomy, and later expanded and mapped to formalize and integrate multiple databases into the HeNeCOn ontology. The outcomes of this iterative process were reviewed and validated by clinicians and statisticians. RESULTS: HeNeCOn is an ontology consisting of 502 classes, a taxonomy with a hierarchical structure, semantic definitions of 283 medical terms and detailed relations between them, which can be used as a tool for information extraction and knowledge management. CONCLUSION: HeNeCOn is a reusable, extendible and standardized ontology which establishes a reference data model for terminology structure and standard definitions in the Head and Neck Cancer domain. This ontology allows handling both current and newly generated knowledge in Head and Neck cancer research, by means of data linking and mapping with other public ontologies.
Affiliation(s)
- Liss Hernández
  - Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
- Estefanía Estévez-Priego
  - Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
- Laura López-Pérez
  - Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
- María Teresa Arredondo
  - Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
- Giuseppe Fico
  - Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
3
Kapoor R, Sleeman WC, Ghosh P, Palta J. Infrastructure tools to support an effective Radiation Oncology Learning Health System. J Appl Clin Med Phys 2023; 24:e14127. [PMID: 37624227 PMCID: PMC10562037 DOI: 10.1002/acm2.14127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 07/17/2023] [Accepted: 07/19/2023] [Indexed: 08/26/2023] Open
Abstract
PURPOSE: Radiation Oncology Learning Health System (RO-LHS) is a promising approach to improve the quality of care by integrating clinical, dosimetry, treatment delivery, and research data in real time. This paper describes a novel set of tools to support the development of an RO-LHS and the current challenges they can address. METHODS: We present a knowledge graph-based approach to map radiotherapy data from clinical databases to an ontology-based data repository using FAIR concepts. This strategy ensures that the data are easily discoverable, accessible, and can be used by other clinical decision support systems. It allows for visualization, presentation, and data analyses of valuable information to identify trends and patterns in patient outcomes. We designed a search engine that uses ontology-based keyword searching and synonym-based term matching, leverages the hierarchical nature of ontologies to retrieve patient records based on parent and child classes, and connects to the BioPortal database to retrieve relevant clinical attributes. To identify similar patients, a method involving text corpus creation and vector embedding models (Word2Vec, Doc2Vec, GloVe, and FastText) is employed, using cosine similarity and distance metrics. RESULTS: The data pipeline and tool were tested with 1660 patient clinical and dosimetry records, resulting in 504,180 RDF (Resource Description Framework) tuples, and data relationships were visualized using graph-based representations. Patient similarity analysis using embedding models showed that the Word2Vec model had the highest mean cosine similarity, while the GloVe model exhibited more compact embeddings with lower Euclidean and Manhattan distances. CONCLUSIONS: The framework and tools described support the development of an RO-LHS. By integrating diverse data sources and facilitating data discovery and analysis, they contribute to continuous learning and improvement in patient care. The tools enhance the quality of care by enabling the identification of cohorts, clinical decision support, and the development of clinical studies and machine learning programs in radiation oncology.
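The three comparison measures named in this abstract (cosine similarity, Euclidean distance and Manhattan distance) can be illustrated with a minimal sketch. It is not the authors' pipeline: the three-dimensional vectors, the patient identifiers and the helper names below are hypothetical, and real embeddings would come from a trained model such as Word2Vec.

```python
import math


def _check_same_length(a, b):
    # All three measures assume vectors of equal dimensionality.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    _check_same_length(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy embeddings; identifiers and values are made up for illustration.
patient_embeddings = {
    "patient_a": [1.0, 0.0, 2.0],
    "patient_b": [2.0, 0.0, 4.0],
    "patient_c": [0.0, 3.0, 0.0],
}


def most_similar(query_id, embeddings):
    """Return (patient_id, cosine similarity) of the closest other patient."""
    query = embeddings[query_id]
    scored = [
        (pid, cosine_similarity(query, vec))
        for pid, vec in embeddings.items()
        if pid != query_id
    ]
    return max(scored, key=lambda pair: pair[1])


best_match = most_similar("patient_a", patient_embeddings)
```

Cosine similarity ignores vector length, so `patient_a` and `patient_b` count as maximally similar even though their Euclidean and Manhattan distances are non-zero; this is one reason an evaluation may report both kinds of measure.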
Affiliation(s)
- Rishabh Kapoor
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia, USA
- William C Sleeman
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia, USA
- Preetam Ghosh
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia, USA
- Jatinder Palta
  - Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia, USA
4
Moreno A, Solanki AA, Xu T, Lin R, Palta J, Daugherty E, Hong D, Hong J, Kamran SC, Katsoulakis E, Brock K, Feng M, Fuller C, Mayo C, Consortium BDSCPC. Identification of Key Elements in Prostate Cancer for Ontology Building via a Multidisciplinary Consensus Agreement. Cancers (Basel) 2023; 15:3121. [PMID: 37370731 PMCID: PMC10295832 DOI: 10.3390/cancers15123121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Revised: 05/25/2023] [Accepted: 06/01/2023] [Indexed: 06/29/2023] Open
Abstract
BACKGROUND: Clinical data collection related to prostate cancer (PCa) care is often unstructured or heterogeneous among providers, resulting in a high risk for ambiguity in its meaning when sharing or analyzing data. Ontologies, which are shareable formal (i.e., computable) representations of knowledge, can address these challenges by enabling machine-readable semantic interoperability. The purpose of this study was to identify PCa-specific key data elements (KDEs) for standardization in clinic and research. METHODS: A modified Delphi method using iterative online surveys was performed to report a consensus agreement on KDEs by a multidisciplinary panel of 39 PCa specialists. Data elements were divided into three themes in PCa and included (1) treatment-related toxicities (TRT), (2) patient-reported outcome measures (PROM), and (3) disease control metrics (DCM). RESULTS: The panel reached consensus on a thirty-item, two-tiered list of KDEs focusing mainly on urinary and rectal symptoms. The Expanded Prostate Cancer Index Composite (EPIC-26) questionnaire was considered most robust for PROM multi-domain monitoring, and granular KDEs were defined for DCM. CONCLUSIONS: This expert consensus on PCa-specific KDEs has served as a foundation for a professional society-endorsed, publicly available operational ontology developed by the American Association of Physicists in Medicine (AAPM) Big Data Sub Committee (BDSC).
Affiliation(s)
- Amy Moreno
  - Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Abhishek A. Solanki
  - Department of Radiation Oncology, Loyola University Medical Center, Berwyn, IL 60402, USA
- Tianlin Xu
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Ruitao Lin
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Jatinder Palta
  - Department of Medical Physics, Virginia Commonwealth University, Richmond, VA 23284, USA
- Emily Daugherty
  - Department of Radiation Oncology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- David Hong
  - Department of Radiation Oncology, University of Southern California, Los Angeles, CA 90089, USA
- Julian Hong
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, CA 93701, USA
- Sophia C. Kamran
  - Department of Radiation Oncology, Massachusetts General Hospital, Boston, MA 02129, USA
- Evangelia Katsoulakis
  - Department of Radiation Oncology, James A Haley VA Medical Center, Tampa, FL 33612, USA
- Kristy Brock
  - Department of Imaging Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Mary Feng
  - Department of Radiation Oncology, University of California San Francisco, San Francisco, CA 93701, USA
- Clifton Fuller
  - Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Charles Mayo
  - Department of Radiation Physics, University of Michigan, Ann Arbor, MI 48109, USA
5
Zhang H, Lyu T, Yin P, Bost S, He X, Guo Y, Prosperi M, Hogan WR, Bian J. A scoping review of semantic integration of health data and information. Int J Med Inform 2022; 165:104834. [PMID: 35863206 DOI: 10.1016/j.ijmedinf.2022.104834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 07/06/2022] [Accepted: 07/13/2022] [Indexed: 11/25/2022]
Abstract
OBJECTIVE: We summarized a decade of new research focusing on semantic data integration (SDI) since 2009, and we aim to: (1) summarize the state-of-the-art approaches on integrating health data and information; and (2) identify the main gaps and challenges of integrating health data and information from multiple levels and domains. MATERIALS AND METHODS: We used PubMed because our focus is applications of SDI in biomedical domains, and followed the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) to search for and report relevant studies published between January 1, 2009 and December 31, 2021. We used Covidence (a systematic review management system) to carry out this scoping review. RESULTS: The initial search from PubMed resulted in 5,326 articles using the two sets of keywords. We then removed 44 duplicates and 5,282 articles were retained for abstract screening. After abstract screening, we included 246 articles for full-text screening, among which 87 articles were deemed eligible for full-text extraction. We summarized the 87 articles from four aspects: (1) methods for the global schema; (2) data integration strategies (i.e., federated system vs. data warehousing); (3) the sources of the data; and (4) downstream applications. CONCLUSION: The SDI approach can effectively resolve the semantic heterogeneities across different data sources. We identified two key gaps and challenges in existing SDI studies: (1) many of the existing SDI studies used data from only single-level data sources (e.g., integrating individual-level patient records from different hospital systems), and (2) documentation of the data integration processes is sparse, threatening the reproducibility of SDI studies.
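The screening counts reported in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the four figures stated above (5,326 identified, 44 duplicates, 246 full-text screened, 87 eligible); the two exclusion counts are derived here by subtraction and are not stated in the abstract, and the function and key names are illustrative.

```python
def screening_flow(identified, duplicates, full_text_screened, eligible):
    """Derive the intermediate counts of a PRISMA-style screening flow."""
    abstract_screened = identified - duplicates
    return {
        "abstract_screened": abstract_screened,
        # Derived value, not reported in the abstract.
        "excluded_at_abstract": abstract_screened - full_text_screened,
        # Derived value, not reported in the abstract.
        "excluded_at_full_text": full_text_screened - eligible,
        "eligible": eligible,
    }


flow = screening_flow(
    identified=5326,
    duplicates=44,
    full_text_screened=246,
    eligible=87,
)
```

The derived `abstract_screened` value matches the 5,282 articles the abstract reports as retained for abstract screening.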
Affiliation(s)
- Hansi Zhang
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Tianchen Lyu
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Pengfei Yin
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Sarah Bost
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Xing He
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Yi Guo
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Mattia Prosperi
  - Department of Epidemiology, College of Medicine, University of Florida, Gainesville, FL, United States
- William R Hogan
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Jiang Bian
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
6
Kalendralis P, Sloep M, van Soest J, Dekker A, Fijten R. Making radiotherapy more efficient with FAIR data. Phys Med 2021; 82:158-162. [PMID: 33639520 DOI: 10.1016/j.ejmp.2021.01.083] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/03/2020] [Revised: 01/26/2021] [Accepted: 01/29/2021] [Indexed: 12/13/2022] Open
Abstract
Given the rapid growth of artificial intelligence (AI) applications in radiotherapy and the related transformations toward the data-driven healthcare domain, this article summarizes the need and usage of the FAIR (Findable, Accessible, Interoperable, Reusable) data principles in radiotherapy. This work introduces the FAIR data concept, presents practical and relevant use cases and the future role of the different parties involved. The goal of this article is to provide guidance and potential applications of FAIR to various radiotherapy stakeholders, focusing on the central role of medical physicists.
Affiliation(s)
- Petros Kalendralis
  - Department of Radiation Oncology (Maastro), GROW School for Oncology, Maastricht University Medical Centre+, 6229 ET Maastricht, The Netherlands
- Matthijs Sloep
  - Department of Radiation Oncology (Maastro), GROW School for Oncology, Maastricht University Medical Centre+, 6229 ET Maastricht, The Netherlands
- Johan van Soest
  - Department of Radiation Oncology (Maastro), GROW School for Oncology, Maastricht University Medical Centre+, 6229 ET Maastricht, The Netherlands
- Andre Dekker
  - Department of Radiation Oncology (Maastro), GROW School for Oncology, Maastricht University Medical Centre+, 6229 ET Maastricht, The Netherlands
- Rianne Fijten
  - Department of Radiation Oncology (Maastro), GROW School for Oncology, Maastricht University Medical Centre+, 6229 ET Maastricht, The Netherlands
7
8
Vizza P, Tradigo G, Guzzi PH, Curia R, Sisca L, Aiello F, Fragomeni G, Cannataro M, Cascini GL, Veltri P. An Innovative Framework for Bioimage Annotation and Studies. Interdiscip Sci 2018; 10:544-557. [PMID: 29094319 DOI: 10.1007/s12539-017-0264-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/07/2017] [Revised: 09/11/2017] [Accepted: 09/13/2017] [Indexed: 06/07/2023]
Abstract
The collection and analysis of clinical data are needed to investigate diseases and to define medical protocols and treatments. Bioimages, medical annotations and patient history are clinical data acquired and studied to perform a correct diagnosis and to propose an appropriate therapy. Currently, hospital departments manage these data using legacy systems which often do not allow data integration among different departments or health structures. Thus, in many cases clinical information sharing and exchange are difficult to implement. This is also the case for biomedical images, for which data integration or data overlapping is usually not available. Image annotations and comparison can be crucial for physicians in many case studies. In this paper, a general purpose framework for bioimage management and annotations is proposed. Moreover, a simple-to-use information system has been developed to integrate clinical and diagnosis codes. The framework allows physicians (1) to integrate DICOM images from different platforms and (2) to report notes and highlights directly on images, thus making it possible, among other things, to query and compare similar clinical cases. This contribution is the result of a framework aimed to support oncologists in managing DICOM images and clinical data from different departments. Data integration is performed using an XML-based module proposed here, which is also used to trace temporal changes in image annotations.
Affiliation(s)
- Patrizia Vizza
  - Department of Surgical and Medical Science, Magna Graecia University, Catanzaro, Italy
- Giuseppe Tradigo
  - Department of Computer, Modeling, Electronics and Systems Engineering, University of Calabria, Cosenza, Italy
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Science, Magna Graecia University, Catanzaro, Italy
- Gionata Fragomeni
  - Department of Surgical and Medical Science, Magna Graecia University, Catanzaro, Italy
- Mario Cannataro
  - Department of Surgical and Medical Science, Magna Graecia University, Catanzaro, Italy
- Giuseppe Lucio Cascini
  - Department of Experimental and Clinical Medicine, Magna Graecia University, Catanzaro, Italy
- Pierangelo Veltri
  - Department of Surgical and Clinical Science, University Magna Graecia of Catanzaro, Catanzaro, Italy
9
Traverso A, van Soest J, Wee L, Dekker A. The radiation oncology ontology (ROO): Publishing linked data in radiation oncology using semantic web and ontology techniques. Med Phys 2018; 45:e854-e862. [DOI: 10.1002/mp.12879] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 10/26/2017] [Revised: 01/26/2018] [Accepted: 02/17/2018] [Indexed: 12/30/2022] Open
Affiliation(s)
- Alberto Traverso
  - Department of Radiation Oncology (MAASTRO); GROW School for Oncology and Developmental Biology; Maastricht University Medical Centre+; Maastricht 6062 NA the Netherlands
- Johan van Soest
  - Department of Radiation Oncology (MAASTRO); GROW School for Oncology and Developmental Biology; Maastricht University Medical Centre+; Maastricht 6062 NA the Netherlands
- Leonard Wee
  - Department of Radiation Oncology (MAASTRO); GROW School for Oncology and Developmental Biology; Maastricht University Medical Centre+; Maastricht 6062 NA the Netherlands
- Andre Dekker
  - Department of Radiation Oncology (MAASTRO); GROW School for Oncology and Developmental Biology; Maastricht University Medical Centre+; Maastricht 6062 NA the Netherlands
10
Esteban-Gil A, Fernández-Breis JT, Boeker M. Analysis and visualization of disease courses in a semantically-enabled cancer registry. J Biomed Semantics 2017; 8:46. [PMID: 28962670 PMCID: PMC5622544 DOI: 10.1186/s13326-017-0154-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/07/2016] [Accepted: 09/19/2017] [Indexed: 12/20/2022] Open
Abstract
Background: Regional and epidemiological cancer registries are important for cancer research and the quality management of cancer treatment. Many technological solutions are available to collect and analyse data for cancer registries nowadays. However, the lack of a well-defined common semantic model is a problem when user-defined analyses and data linking to external resources are required. The objectives of this study are: (1) design of a semantic model for local cancer registries; (2) development of a semantically-enabled cancer registry based on this model; and (3) semantic exploitation of the cancer registry for analysing and visualising disease courses. Results: Our proposal is based on our previous results and experience working with semantic technologies. Data stored in a cancer registry database were transformed into RDF employing a process driven by OWL ontologies. The semantic representation of the data was then processed to extract semantic patient profiles, which were exploited by means of SPARQL queries to identify groups of similar patients and to analyse the disease timelines of patients. Based on the requirements analysis, we have produced a draft of an ontology that models the semantics of a local cancer registry in a pragmatic extensible way. We have implemented a Semantic Web platform that allows transforming and storing data from cancer registries in RDF. This platform also permits users to formulate incremental user-defined queries through a graphical user interface. The query results can be displayed in several customisable ways. The complex disease timelines of individual patients can be clearly represented. Different events, e.g. different therapies and disease courses, are presented according to their temporal and causal relations. Conclusion: The presented platform is an example of the parallel development of ontologies and applications that take advantage of semantic web technologies in the medical field. The semantic structure of the representation renders it easy to analyse key figures of the patients and their evolution at different granularity levels.
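The general idea described in this abstract, turning registry rows into subject-predicate-object triples and then selecting patient groups with graph patterns, can be sketched as follows. This is a toy stand-in rather than the authors' platform: it uses plain Python tuples instead of a real RDF store, a small pattern matcher instead of SPARQL, and the `ex:` identifiers, column names and rows are all hypothetical.

```python
def row_to_triples(row):
    """Turn one (hypothetical) registry row into subject-predicate-object triples."""
    subject = f"ex:patient{row['id']}"
    return [
        (subject, "ex:hasDiagnosis", f"ex:{row['diagnosis']}"),
        (subject, "ex:receivedTherapy", f"ex:{row['therapy']}"),
    ]


def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; terms starting with '?' are variables."""
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or fail if it is already bound to another value.
            if new_bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_bindings


def query(triples, patterns):
    """Return every variable binding that satisfies all patterns (a conjunctive query)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


# Made-up registry rows used only to exercise the sketch.
registry_rows = [
    {"id": 1, "diagnosis": "ProstateCancer", "therapy": "Radiotherapy"},
    {"id": 2, "diagnosis": "ProstateCancer", "therapy": "Surgery"},
    {"id": 3, "diagnosis": "BreastCancer", "therapy": "Radiotherapy"},
]
triples = [t for row in registry_rows for t in row_to_triples(row)]

# Patients with a prostate cancer diagnosis who also received radiotherapy.
cohort_patterns = [
    ("?patient", "ex:hasDiagnosis", "ex:ProstateCancer"),
    ("?patient", "ex:receivedTherapy", "ex:Radiotherapy"),
]
cohort = query(triples, cohort_patterns)
```

Sharing the variable `?patient` across both patterns is what joins the two conditions, which is the same mechanism a SPARQL basic graph pattern relies on.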
Affiliation(s)
- Angel Esteban-Gil
  - Fundación para la Formación e Investigación Sanitarias de la Región de Murcia, Biomedical Informatics & Bioinformatics Platform, IMIB-Arrixaca, C/ Luis Fontes Pagán, n° 9, Murcia, 30003, Spain
- Jesualdo Tomás Fernández-Breis
  - Dpto. Informática y Sistemas, Facultad de Informática, Universidad de Murcia, IMIB-Arrixaca, Facultad de Informática, Campus de Espinardo, Murcia, 30100, Spain
- Martin Boeker
  - Institute for Medical Biometry and Statistics, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Stefan-Meier-Str. 26, Freiburg, 79104, Germany
11
Yang C, Pinart M, Kolsteren P, Van Camp J, De Cock N, Nimptsch K, Pischon T, Laird E, Perozzi G, Canali R, Hoge A, Stelmach-Mardas M, Dragsted LO, Palombi SM, Dobre I, Bouwman J, Clarys P, Minervini F, De Angelis M, Gobbetti M, Tafforeau J, Coltell O, Corella D, De Ruyck H, Walton J, Kehoe L, Matthys C, De Baets B, De Tré G, Bronselaer A, Rivellese A, Giacco R, Lombardo R, De Clercq S, Hulstaert N, Lachat C. Perspective: Essential Study Quality Descriptors for Data from Nutritional Epidemiologic Research. Adv Nutr 2017; 8:639-651. [PMID: 28916566 PMCID: PMC5593109 DOI: 10.3945/an.117.015651] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022] Open
Abstract
Pooled analysis of secondary data increases the power of research and enables scientific discovery in nutritional epidemiology. Information on study characteristics that determine data quality is needed to enable correct reuse and interpretation of data. This study aims to define essential quality characteristics for data from observational studies in nutrition. First, a literature review was performed to gain insight into existing instruments that assess the quality of cohort, case-control, and cross-sectional studies and dietary measurement. Second, 2 face-to-face workshops were organized to determine the study characteristics that affect data quality. Third, consensus on the data descriptors and controlled vocabulary was obtained. From 4884 papers retrieved, 26 relevant instruments, containing 164 characteristics for study design and 93 characteristics for measurements, were selected. The workshop and consensus process resulted in 10 descriptors allocated to "study design" and 22 to "measurement" domains. Data descriptors were organized as an ordinal scale of items to facilitate the identification, storage, and querying of nutrition data. Further integration of an Ontology for Nutrition Studies will facilitate interoperability of data repositories.
Affiliation(s)
- Chen Yang
  - Departments of Food Safety and Food Quality, Ghent University, Ghent, Belgium
- Mariona Pinart
  - Molecular Epidemiology Research Group, Max Delbrück Centre for Molecular Medicine, Berlin, Germany
- Patrick Kolsteren
  - Departments of Food Safety and Food Quality, Ghent University, Ghent, Belgium
- John Van Camp
  - Departments of Food Safety and Food Quality, Ghent University, Ghent, Belgium
- Nathalie De Cock
  - Departments of Food Safety and Food Quality, Ghent University, Ghent, Belgium
- Katharina Nimptsch
  - Molecular Epidemiology Research Group, Max Delbrück Centre for Molecular Medicine, Berlin, Germany
- Tobias Pischon
  - Molecular Epidemiology Research Group, Max Delbrück Centre for Molecular Medicine, Berlin, Germany
  - Charité - Berlin University of Medicine, Berlin, Germany
  - Max Delbrück Center for Molecular Medicine and Berlin Institute of Health, Berlin, Germany
  - German Centre for Cardiovascular Research, partner site, Berlin, Germany
- Eamon Laird
  - Vitamin Research Group, Trinity College Dublin, Dublin, Ireland
- Axelle Hoge
  - Department of Public Health, University of Liège, Liège, Belgium
- Marta Stelmach-Mardas
  - Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
  - Department of Pediatric Gastroenterology and Metabolic Diseases, Poznan University of Medical Sciences, Poznan, Poland
- Lars Ove Dragsted
  - Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg, Denmark
- Stéphanie Maria Palombi
  - Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg, Denmark
- Irina Dobre
  - Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg, Denmark
- Jildau Bouwman
  - Netherlands Organisation for Applied Scientific Research, Zeist, Netherlands
- Fabio Minervini
  - Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Bari, Italy
- Maria De Angelis
  - Department of Soil, Plant and Food Science, University of Bari Aldo Moro, Bari, Italy
- Marco Gobbetti
  - Faculty of Science and Technology, Free University of Bozen-Bolzano, Bolzano, Italy
- Jean Tafforeau
  - Department of Public Health and Surveillance, Scientific Institute of Public Health, Brussels, Belgium
- Oscar Coltell
  - Department of Computer Languages and Systems, University Jaume I, Castellón, Spain
  - Department of Preventive Medicine and Public Health, University of Valencia, Valencia, Spain
- Dolores Corella
  - Department of Preventive Medicine and Public Health, University of Valencia, Valencia, Spain
  - Biomedical Research Centre in Physiopathology of Obesity and Nutrition, Institute of Health Carlos III, Madrid, Spain
- Hendrik De Ruyck
  - Flanders research institute for agriculture, fisheries and food, Technology and Food Science Unit, Food Safety and Product Innovation, Melle, Belgium
- Janette Walton
  - School of Food and Nutritional Sciences, University College Cork, Cork, Ireland
- Laura Kehoe
  - School of Food and Nutritional Sciences, University College Cork, Cork, Ireland
- Christophe Matthys
  - KU Leuven, Clinical and Experimental Endocrinology and University Hospitals Leuven/KU Leuven, Department of Endocrinology, Campus Gasthuisberg, Leuven, Belgium
- Bernard De Baets
  - Mathematical Modelling, Statistics and Bioinformatic, Ghent University, Ghent, Belgium
- Guy De Tré
  - Telecommunications and Information Processing, Ghent University, Ghent, Belgium
- Antoon Bronselaer
  - Telecommunications and Information Processing, Ghent University, Ghent, Belgium
- Angela Rivellese
  - Department of Clinical Medicine and Surgery, School of Medicine, University Federico II, Naples, Italy
- Rosalba Giacco
  - Institute of Food Sciences of National Research Council, Avellino, Italy
- Rosario Lombardo
  - The Microsoft Research, University of Trento Centre for Computational and Systems Biology, Trento, Italy
- Sofian De Clercq
  - Department of Biochemistry, Ghent University, Faculty of Medicine and Health Sciences, Ghent, Belgium
- Niels Hulstaert
  - Biochemistry, Ghent University, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Carl Lachat
  - Departments of Food Safety and Food Quality, Ghent University, Ghent, Belgium
12
|
Alitto AR, Gatta R, Vanneste B, Vallati M, Meldolesi E, Damiani A, Lanzotti V, Mattiucci GC, Frascino V, Masciocchi C, Catucci F, Dekker A, Lambin P, Valentini V, Mantini G. PRODIGE: PRediction models in prOstate cancer for personalized meDIcine challenGE. Future Oncol 2017; 13:2171-2181. [PMID: 28758431 DOI: 10.2217/fon-2017-0142] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/22/2023] Open
Abstract
AIM: Identifying the best care for a patient can be extremely challenging. To support the creation of multifactorial Decision Support Systems (DSSs), we propose an Umbrella Protocol, focusing on prostate cancer. MATERIALS & METHODS: The PRODIGE project consisted of a workflow for standardizing data and procedures to create a consistent dataset useful for developing DSSs. Techniques from classical statistics and machine learning will be adopted. The general protocol accepted by our Ethical Committee can be downloaded from cancerdata.org. RESULTS: A standardized knowledge-sharing process has been implemented by using a semi-formal ontology for the representation of relevant clinical variables. CONCLUSION: The development of DSSs based on standardized knowledge could be a tool for achieving personalized decision-making.
Affiliation(s)
- A R Alitto
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- R Gatta
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- B G L Vanneste
  - Department of Radiation Oncology (MAASTRO), GROW - School for Oncology & Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- M Vallati
  - School of Computing & Engineering, University of Huddersfield, Huddersfield, UK
- E Meldolesi
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- A Damiani
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- V Lanzotti
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- G C Mattiucci
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- V Frascino
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- C Masciocchi
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- F Catucci
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- A Dekker
  - Department of Radiation Oncology (MAASTRO), GROW - School for Oncology & Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- P Lambin
  - Department of Radiation Oncology (MAASTRO), GROW - School for Oncology & Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- V Valentini
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
- G Mantini
  - Radiation Oncology Area, Gemelli-ART, Catholic University of the Sacred Heart, Rome, Italy
|
13
|
Savonnet M, Leclercq E, Naubourg P. eClims: An Extensible and Dynamic Integration Framework for Biomedical Information Systems. IEEE J Biomed Health Inform 2016; 20:1640-1649. [DOI: 10.1109/jbhi.2015.2464353] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
|
14
|
Hochheiser H, Castine M, Harris D, Savova G, Jacobson RS. An information model for computable cancer phenotypes. BMC Med Inform Decis Mak 2016; 16:121. [PMID: 27629872 PMCID: PMC5024416 DOI: 10.1186/s12911-016-0358-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 06/06/2016] [Accepted: 09/01/2016] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND: Standards, methods, and tools supporting the integration of clinical data and genomic information are an area of significant need and rapid growth in biomedical informatics. Integration of cancer clinical data and cancer genomic information poses unique challenges, because of the high volume and complexity of clinical data, as well as the heterogeneity and instability of cancer genome data when compared with germline data. Current information models of clinical and genomic data are not sufficiently expressive to represent individual observations and to aggregate those observations into longitudinal summaries over the course of cancer care. These models are acutely needed to support the development of systems and tools for generating the so-called clinical "deep phenotype" of individual cancer patients, a process which remains almost entirely manual in cancer research and precision medicine. METHODS: Reviews of existing ontologies and interviews with cancer researchers were used to inform iterative development of a cancer phenotype information model. We translated a subset of the Fast Healthcare Interoperability Resources (FHIR) models into the OWL 2 Description Logic (DL) representation, and added extensions as needed for modeling cancer phenotypes with terms derived from the NCI Thesaurus. Models were validated with domain experts and evaluated against competency questions. RESULTS: The DeepPhe Information model represents cancer phenotype data at increasing levels of abstraction, from mention level in clinical documents to summaries of key events and findings. We describe the model using breast cancer as an example, depicting methods to represent phenotypic features of cancers, tumors, treatment regimens, and specific biologic behaviors that span the entire course of a patient's disease. CONCLUSIONS: We present a multi-scale information model for representing individual document mentions, document-level classifications, episodes along a disease course, and phenotype summarization, linking individual observations to high-level summaries in support of subsequent integration and analysis.
Affiliation(s)
- Harry Hochheiser
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
- Melissa Castine
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA
- David Harris
  - Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Guergana Savova
  - Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Rebecca S Jacobson
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
  - University of Pittsburgh Cancer Institute, Pittsburgh, PA, USA
|
15
|
Nyholm T, Olsson C, Agrup M, Björk P, Björk-Eriksson T, Gagliardi G, Grinaker H, Gunnlaugsson A, Gustafsson A, Gustafsson M, Johansson B, Johnsson S, Karlsson M, Kristensen I, Nilsson P, Nyström L, Onjukka E, Reizenstein J, Skönevik J, Söderström K, Valdman A, Zackrisson B, Montelius A. A national approach for automated collection of standardized and population-based radiation therapy data in Sweden. Radiother Oncol 2016; 119:344-50. [DOI: 10.1016/j.radonc.2016.04.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/25/2015] [Revised: 03/30/2016] [Accepted: 04/02/2016] [Indexed: 10/21/2022]
|
16
|
|
17
|
Hsu W, Gonzalez NR, Chien A, Pablo Villablanca J, Pajukanta P, Viñuela F, Bui AAT. An integrated, ontology-driven approach to constructing observational databases for research. J Biomed Inform 2015; 55:132-42. [PMID: 25817919 DOI: 10.1016/j.jbi.2015.03.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/21/2014] [Revised: 02/14/2015] [Accepted: 03/19/2015] [Indexed: 11/28/2022]
Abstract
The electronic health record (EHR) contains a diverse set of clinical observations that are captured as part of routine care, but the incomplete, inconsistent, and sometimes incorrect nature of clinical data poses significant impediments for its secondary use in retrospective studies or comparative effectiveness research. In this work, we describe an ontology-driven approach for extracting and analyzing data from the patient record in a longitudinal and continuous manner. We demonstrate how the ontology helps enforce consistent data representation, integrates phenotypes generated through analyses of available clinical data sources, and facilitates subsequent studies to identify clinical predictors for an outcome of interest. Development and evaluation of our approach are described in the context of studying factors that influence intracranial aneurysm (ICA) growth and rupture. We report our experiences in capturing information on 78 individuals with a total of 120 aneurysms. Two example applications related to assessing the relationship between aneurysm size, growth, gene expression modules, and rupture are described. Our work highlights the challenges with respect to data quality, workflow, and analysis of data and its implications toward a learning health system paradigm.
Affiliation(s)
- William Hsu
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- Nestor R Gonzalez
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States; Department of Neurosurgery, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- Aichi Chien
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- J Pablo Villablanca
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- Päivi Pajukanta
  - Department of Human Genetics, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- Fernando Viñuela
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
- Alex A T Bui
  - Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
|
18
|
Rahimi A, Parameswaran N, Ray PK, Taggart J, Yu H, Liaw ST. Development of a Methodological Approach for Data Quality Ontology in Diabetes Management. International Journal of E-Health and Medical Communications 2014. [DOI: 10.4018/ijehmc.2014070105] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Abstract
The role of ontologies in chronic disease management and associated challenges, such as defining data quality (DQ) and its specification, is a current topic of interest. In domains such as Diabetes Management, a robust Data Quality Ontology (DQO) is required to support the automation of semantic data extraction from Electronic Health Records (EHRs) and to access and manage DQ, so that the data set is fit for purpose. A five-step strategy is proposed in this paper to create the DQO, which captures the semantics of clinical data. It consists of: (1) Knowledge acquisition; (2) Conceptualization; (3) Semantic modeling; (4) Knowledge representation; and (5) Validation. The DQO was applied to the identification of patients with Type 2 Diabetes Mellitus (T2DM) in EHRs, which included an assessment of the DQ of the EHR. The five-step methodology is generalizable and reusable in other domains.
Affiliation(s)
- Alireza Rahimi
  - UNSW School of Public Health and Community Medicine, Sydney, Australia & Isfahan University of Medical Sciences, Health information Technology Research Centre, Iran & UNSW Asia-Pacific ubiquitous Healthcare Research Centre, Sydney, Australia & SWSLHD General Practice Unit, Sydney, Australia
- Nandan Parameswaran
  - UNSW, School of Computer Science and Engineering, Sydney, Australia & UNSW Asia-Pacific ubiquitous Healthcare Research Centre, Sydney, Australia
- Pradeep Kumar Ray
  - UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia & UNSW, Australian School of Business, Sydney, Australia
- Jane Taggart
  - UNSW, Centre for Primary Health Care & Equity, Sydney, Australia & SWSLHD General Practice Unit, Fairfield, Sydney, Australia
- Hairong Yu
  - UNSW, Centre for Primary Health Care and Equity, Sydney, Australia
- Siaw-Teng Liaw
  - UNSW, School of Public Health and Community Medicine, Sydney & UNSW, Centre for Primary Health Care and Equity, Sydney, Australia & UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia & SWSLHD General Practice Unit, Sydney, Australia
|
19
|
Rahimi A, Liaw ST, Taggart J, Ray P, Yu H. Validating an ontology-based algorithm to identify patients with type 2 diabetes mellitus in electronic health records. Int J Med Inform 2014; 83:768-78. [PMID: 25011429 DOI: 10.1016/j.ijmedinf.2014.06.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 02/12/2014] [Revised: 06/02/2014] [Accepted: 06/02/2014] [Indexed: 11/19/2022]
Abstract
BACKGROUND: Improving healthcare for people with chronic conditions requires clinical information systems that support integrated care and information exchange, emphasizing a semantic approach to support multiple and disparate Electronic Health Records (EHRs). Using a literature review, the Australian National Guidelines for Type 2 Diabetes Mellitus (T2DM), SNOMED-CT-AU and input from health professionals, we developed a Diabetes Mellitus Ontology (DMO) to diagnose and manage patients with diabetes. This paper describes the manual validation of the DMO-based approach using real-world EHR data from a general practice (n=908 active patients) participating in the electronic Practice Based Research Network (ePBRN). METHOD: The DMO-based algorithm, which uses the SPARQL Protocol and RDF Query Language (SPARQL) to query the structured fields in the ePBRN data repository, was iteratively tested and refined. The accuracy of the final DMO-based algorithm was validated with a manual audit of the general practice EHR. Contingency tables were prepared, and the Sensitivity and Specificity (accuracy) of the algorithm in diagnosing T2DM were measured, using the T2DM cases found by manual EHR audit as the gold standard. Accuracy was determined with three attributes - reason for visit (RFV), medication (Rx) and pathology (Path) - singly and in combination. RESULTS: The Sensitivity and Specificity of the algorithm were 100% and 99.88% with RFV; 96.55% and 98.97% with Rx; and 15.6% and 98.92% with Path. This suggests that Rx and Path data were not as complete or correct as the RFV for this general practice, which kept its RFV information complete and current for diabetes. However, the completeness is good enough for this purpose, as confirmed by the very small relative deterioration of the accuracy (Sensitivity and Specificity of 97.67% and 99.18%) when calculated for the combination of RFV, Rx and Path. The manual EHR audit suggested that the accuracy of the algorithm was influenced by data quality problems, such as incorrect data due to mistaken units of measurement; unavailable data due to non-documentation or documentation in the wrong place or in progress notes; and problems with data extraction, encryption and data management errors. CONCLUSION: This DMO-based algorithm is sufficiently accurate to support a semantic approach, using the RFV, Rx and Path to define patients with T2DM from EHR data. However, the accuracy can be compromised by incomplete or incorrect data. The extent of compromise requires further study, using ontology-based and other approaches.
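The sensitivity and specificity reported in this abstract are derived from 2x2 contingency tables that compare the algorithm's output with the manual audit. As a minimal sketch of that arithmetic only, the Python below computes both measures from four cell counts. The class name and the counts are hypothetical and chosen purely for illustration; they are not the study's data, which the abstract does not report at cell level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2x2 counts of algorithm output versus a gold standard (here, a manual audit)."""

    true_pos: int   # algorithm says T2DM, audit says T2DM
    false_neg: int  # algorithm says no T2DM, audit says T2DM
    false_pos: int  # algorithm says T2DM, audit says no T2DM
    true_neg: int   # algorithm says no T2DM, audit says no T2DM

    def sensitivity(self) -> float:
        # Share of gold-standard cases that the algorithm found.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of gold-standard non-cases that the algorithm correctly excluded.
        return self.true_neg / (self.true_neg + self.false_pos)


# Hypothetical counts for illustration only (not taken from the paper).
table = ContingencyTable(true_pos=45, false_neg=5, false_pos=2, true_neg=48)

print(f"Sensitivity: {table.sensitivity():.2%}")
print(f"Specificity: {table.specificity():.2%}")
```

Computing the two measures separately per attribute (RFV, Rx, Path) and again for their combination would mirror the comparison the abstract describes.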
Affiliation(s)
- Alireza Rahimi
  - UNSW, School of Public Health & Community Medicine, Sydney, Australia; Isfahan University of Medical Sciences, Health Information Research Centre, Isfahan, Iran; UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia
- Siaw-Teng Liaw
  - UNSW, School of Public Health & Community Medicine, Sydney, Australia; UNSW, Centre for Primary Health Care & Equity, Sydney, Australia; General Practice Unit, South Western Sydney Local Health District
- Jane Taggart
  - UNSW, Centre for Primary Health Care & Equity, Sydney, Australia
- Pradeep Ray
  - UNSW, Asia-Pacific Ubiquitous Healthcare Research Centre, Sydney, Australia
- Hairong Yu
  - UNSW, Centre for Primary Health Care & Equity, Sydney, Australia
|
20
|
Goh WWB, Wong L. Computational proteomics: designing a comprehensive analytical strategy. Drug Discov Today 2014; 19:266-74. [DOI: 10.1016/j.drudis.2013.07.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/04/2013] [Revised: 06/28/2013] [Accepted: 07/11/2013] [Indexed: 02/02/2023]
|
21
|
Rahimi A, Liaw ST, Ray P, Taggart J, Yu H. Ontological specification of quality of chronic disease data in EHRs to support decision analytics: a realist review. 2014. [DOI: 10.1186/2193-8636-1-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
This systematic review examined the current state of conceptualization and specification of data quality and the role of ontology based approaches to develop data quality based on "fitness for purpose" within the health context. A literature review was conducted of all English language studies, from January 2000-March 2013, which addressed data/information quality, fitness for purpose of data, used and implemented ontology-based approaches. Included papers were critically appraised with a "context-mechanism-impacts/outcomes" overlay. We screened 315 papers, excluded 36 duplicates, 182 on abstract review and 46 on full-text review; leaving 52 papers for critical appraisal. Six papers conceptualized data quality within the "fitness for purpose" definition. While most agree with a multidimensional definition of DQ, there is little consensus on a conceptual framework. We found no reports of systematic and comprehensive ontological approaches to DQ based on fitness for purpose or use. However, 16 papers used ontology-specified implementations in DQ improvement, with most of them focusing on some dimensions of DQ such as completeness, accuracy, correctness, consistency and timeliness. The majority of papers described the processes of the development of DQ in various information systems. There were few evaluative studies, including any comparing ontological with non-ontological approaches, on the assessment of clinical data quality and the performance of the application.
|
22
|
Torshizi AD, Zarandi MHF, Torshizi GD, Eghbali K. A hybrid fuzzy-ontology based intelligent system to determine level of severity and treatment recommendation for Benign Prostatic Hyperplasia. Comput Methods Programs Biomed 2013; 113:301-313. [PMID: 24184111 DOI: 10.1016/j.cmpb.2013.09.021] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 03/17/2013] [Revised: 09/06/2013] [Accepted: 09/30/2013] [Indexed: 06/02/2023]
Abstract
This paper deals with the application of fuzzy intelligent systems to diagnosing severity level and recommending appropriate therapies for patients with Benign Prostatic Hyperplasia. Such an intelligent system can have a remarkable impact on correct diagnosis of the disease and on reducing the risk of mortality. The system captures various factors from the patients using two modules. The first module determines the severity level of the Benign Prostatic Hyperplasia. The second module, which is a decision-making unit, takes the output of the first module together with some external knowledge and makes an appropriate treatment decision based on its ontology model and a type-1 fuzzy system. To validate the efficiency and accuracy of the developed system, a case study was conducted with 44 participants. The results were compared with the recommendations of a panel of experts on the experimental data, and the precision and accuracy of the results were then investigated with a statistical analysis.
Affiliation(s)
- Abolfazl Doostparast Torshizi
  - Department of Industrial Engineering, Amirkabir University of Technology (Tehran Polytechnic), 15875-4413 Tehran, Iran
|
23
|
Eccher C, Scipioni A, Miller AA, Ferro A, Pisanelli DM. An ontology of cancer therapies supporting interoperability and data consistency in EPRs. Comput Biol Med 2013; 43:822-32. [PMID: 23746723 DOI: 10.1016/j.compbiomed.2013.04.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/27/2012] [Revised: 04/15/2013] [Accepted: 04/16/2013] [Indexed: 11/29/2022]
Abstract
Ontologies can formally describe the semantics of the medical domain in an unambiguous and machine processable form, acting as a conceptual interface between different applications that must interoperate. In this paper we present an ontology of cancer therapies originally developed to bridge the gap between an oncologic Electronic Patient Record (EPR) and a guideline-based decision support system. We show an application of the ontology complemented by rules to classify therapies recorded in the EPR. The results show how such an ontology can be used also to discover possible problems of data consistency in the EPR.
Affiliation(s)
- Claudio Eccher
  - Fondazione Bruno Kessler-Center for Information Technology, via Sommarive 18, 38050 Povo, Trento, Italy
|
24
|
Pathak J, Kiefer RC, Bielinski SJ, Chute CG. Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank. J Biomed Semantics 2012; 3:10. [PMID: 23244446 PMCID: PMC3554594 DOI: 10.1186/2041-1480-3-10] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 04/02/2012] [Accepted: 08/22/2012] [Indexed: 01/12/2023] Open
Abstract
Background: The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form “biobanks” where biospecimens linked to personal health information, typically in electronic health records (EHRs), are collected and stored on a large number of subjects. This provides tremendous opportunities to discover novel genotype-phenotype associations and foster hypothesis generation. Results: In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical and genotype data stored at the Mayo Clinic Biobank to mine the phenotype data for genetic associations. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR diagnoses and procedure data, and enable federated querying via standardized Web protocols to identify subjects genotyped for Type 2 Diabetes and Hypothyroidism to discover gene-disease associations. Our study highlights the potential of Web-scale data federation techniques to execute complex queries. Conclusions: This study demonstrates how Semantic Web technologies can be applied in conjunction with clinical data stored in EHRs to accurately identify subjects with specific diseases and phenotypes, and identify genotype-phenotype associations.
Affiliation(s)
- Jyotishman Pathak
  - Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
|
25
|
Liaw ST, Rahimi A, Ray P, Taggart J, Dennis S, de Lusignan S, Jalaludin B, Yeo AET, Talaei-Khoei A. Towards an ontology for data quality in integrated chronic disease management: a realist review of the literature. Int J Med Inform 2012; 82:10-24. [PMID: 23122633 DOI: 10.1016/j.ijmedinf.2012.10.001] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 04/14/2012] [Revised: 10/03/2012] [Accepted: 10/05/2012] [Indexed: 11/25/2022]
Abstract
PURPOSE: Effective use of routine data to support integrated chronic disease management (CDM) and population health is dependent on underlying data quality (DQ) and, for cross-system use of data, semantic interoperability. An ontological approach to DQ is a potential solution, but research in this area is limited and fragmented. OBJECTIVE: Identify mechanisms, including ontologies, to manage DQ in integrated CDM and whether improved DQ will better measure health outcomes. METHODS: A realist review of English language studies (January 2001-March 2011) that addressed data quality, used ontology-based approaches and were relevant to CDM. RESULTS: We screened 245 papers, excluded 26 duplicates, 135 on abstract review and 31 on full-text review; leaving 61 papers for critical appraisal. Of the 33 papers that examined ontologies in chronic disease management, 13 defined data quality and 15 used ontologies for DQ. Most saw DQ as a multidimensional construct, the most used dimensions being completeness, accuracy, correctness, consistency and timeliness. The majority of studies reported tool design and development (80%), implementation (23%), and descriptive evaluations (15%). Ontological approaches were used to address semantic interoperability, decision support, flexibility of information management and integration/linkage, and complexity of information models. CONCLUSION: DQ lacks a consensus conceptual framework and definition. DQ and ontological research is relatively immature, with few rigorous evaluation studies published. Ontology-based applications could support automated processes to address DQ and semantic interoperability in repositories of routinely collected data to deliver integrated CDM. We advocate moving to ontology-based design of information systems to enable more reliable use of routine data to measure health mechanisms and impacts.
Affiliation(s)
- S T Liaw
  - University of NSW School of Public Health & Community Medicine, Sydney, Australia
|
26
|
Pathak J, Kiefer RC, Chute CG. Using semantic web technologies for cohort identification from electronic health records for clinical research. AMIA Jt Summits Transl Sci Proc 2012; 2012:10-9. [PMID: 22779040 PMCID: PMC3392057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
Abstract
The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. One of the key requirements to perform GWAS is the identification of subject cohorts with accurate classification of disease phenotypes. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical data stored in electronic health records (EHRs) to accurately identify subjects with specific diseases for inclusion in cohort studies. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR data and enabling federated querying and inferencing via standardized Web protocols for identifying subjects with Diabetes Mellitus. Our study highlights the potential of using Web-scale data federation approaches to execute complex queries.
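To make the idea of RDF-based cohort identification more concrete, the sketch below stores EHR facts as subject-predicate-object triples and selects the subjects that match a diagnosis pattern. It is a toy, in-memory illustration in plain Python, not the authors' system: the paper describes federated querying over standardized Web protocols, whereas this example uses a hand-written pattern matcher. Every identifier here (the `ex:` prefix, `ex:hasDiagnosis`, `ex:hasProcedure`, the patient names) is a hypothetical placeholder.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical EHR facts expressed as triples; all names are illustrative.
EHR_TRIPLES: List[Triple] = [
    ("ex:patient1", "ex:hasDiagnosis", "ex:DiabetesMellitus"),
    ("ex:patient1", "ex:hasProcedure", "ex:HbA1cTest"),
    ("ex:patient2", "ex:hasDiagnosis", "ex:Hypothyroidism"),
    ("ex:patient3", "ex:hasDiagnosis", "ex:DiabetesMellitus"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard for that position."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def cohort(triples: Iterable[Triple], diagnosis: str) -> List[str]:
    """Return the sorted, de-duplicated subjects that carry the given diagnosis."""
    return sorted({s for s, _, _ in match(triples, p="ex:hasDiagnosis", o=diagnosis)})


print(cohort(EHR_TRIPLES, "ex:DiabetesMellitus"))
```

In an RDF store the same selection would normally be written as a SPARQL query with a single triple pattern; the wildcard matcher above plays the role of that pattern under the stated simplifications.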
|
27
|
Hu H, Correll M, Kvecher L, Osmond M, Clark J, Bekhash A, Schwab G, Gao D, Gao J, Kubatin V, Shriver CD, Hooke JA, Maxwell LG, Kovatich AJ, Sheldon JG, Liebman MN, Mural RJ. DW4TR: A Data Warehouse for Translational Research. J Biomed Inform 2011; 44:1004-19. [PMID: 21872681 DOI: 10.1016/j.jbi.2011.08.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 09/29/2010] [Revised: 07/05/2011] [Accepted: 08/04/2011] [Indexed: 10/17/2022]
Abstract
The linkage between the clinical and laboratory research domains is a key issue in translational research. Integration of clinicopathologic data alone is a major task given the number of data elements involved. For a translational research environment, it is critical to make these data usable at the point-of-need. Individual systems have been developed to meet the needs of particular projects though the need for a generalizable system has been recognized. Increased use of Electronic Medical Record data in translational research will demand generalizing the system for integrating clinical data to support the study of a broad range of human diseases. To ultimately satisfy these needs, we have developed a system to support multiple translational research projects. This system, the Data Warehouse for Translational Research (DW4TR), is based on a light-weight, patient-centric modularly-structured clinical data model and a specimen-centric molecular data model. The temporal relationships of the data are also part of the model. The data are accessed through an interface composed of an Aggregated Biomedical-Information Browser (ABB) and an Individual Subject Information Viewer (ISIV) which target general users. The system was developed to support a breast cancer translational research program and has been extended to support a gynecological disease program. Further extensions of the DW4TR are underway. We believe that the DW4TR will play an important role in translational research across multiple disease types.
Affiliation(s)
- Hai Hu
- Windber Research Institute, Windber, PA 15963, USA.
|
28
|
Hsu W, Taira RK, Viñuela F, Bui AA. A Case-based Retrieval System using Natural Language Processing and Population-based Visualization. Proceedings. IEEE International Conference on Healthcare Informatics, Imaging and Systems Biology 2011; 2011:221-228. [PMID: 27570833 PMCID: PMC5001495 DOI: 10.1109/hisb.2011.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Electronic medical records capture large quantities of patient data generated as a result of routine care. Secondary use of this data for clinical research could provide new insights into the evolution of diseases and help assess the effectiveness of available interventions. Unfortunately, the unstructured nature of clinical data hinders a user's ability to understand this data: tools are needed to structure, model, and visualize the data to elucidate patterns in a patient population. We present a case-based retrieval framework that incorporates an extraction tool to identify concepts from clinical reports, a disease model to capture necessary context for interpreting extracted concepts, and a model-driven visualization to facilitate querying and interpretation of the results. We describe how the model is used to group, filter, and retrieve similar cases. We present an application of the framework that aids users in exploring a population of intracranial aneurysm patients.
Affiliation(s)
- William Hsu
- Medical Imaging Informatics Group, Department of Radiological Sciences, University of California, Los Angeles, CA
- Ricky K. Taira
- Medical Imaging Informatics Group, Department of Radiological Sciences, University of California, Los Angeles, CA
- Fernando Viñuela
- Division of Interventional Neuroradiology, Department of Radiological Sciences, University of California, Los Angeles, CA
- Alex A.T. Bui
- Medical Imaging Informatics Group, Department of Radiological Sciences, University of California, Los Angeles, CA
|
29
|
Song YS, Park CH, Chung HJ, Shin H, Kim J, Kim JH. Semantically enabled and statistically supported biological hypothesis testing with tissue microarray databases. BMC Bioinformatics 2011; 12 Suppl 1:S51. [PMID: 21342584 PMCID: PMC3044309 DOI: 10.1186/1471-2105-12-s1-s51] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022] Open
Abstract
Background: Although many biological databases apply semantic web technologies, meaningful biological hypothesis testing is not easily achieved. Database-driven, high-throughput genomic hypothesis testing requires the ability both to obtain semantically relevant experimental data and to perform relevant statistical tests on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses awaiting high-throughput conclusions. Methods: An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented in Resource Description Framework (RDF) according to the framework of the ontology. An application for hypothesis testing on TMA data (Xperanto-RDF) was designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQL queries to reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries. Results: When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis is tested against the TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all four were supported by Xperanto-RDF. Conclusions: We demonstrated the utility of high-throughput biological hypothesis testing. We believe that it can benefit preliminary investigation before highly controlled experiments are performed.
Affiliation(s)
- Young Soo Song
- Department of Industrial & Information Systems Engineering, Ajou University, Suwon 443-749, Korea
|