1
Silva MC, Eugénio P, Faria D, Pesquita C. Ontologies and Knowledge Graphs in Oncology Research. Cancers (Basel) 2022; 14:cancers14081906. [PMID: 35454813 PMCID: PMC9029532 DOI: 10.3390/cancers14081906] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/08/2022] [Revised: 03/25/2022] [Accepted: 04/07/2022] [Indexed: 11/16/2022] Open
Abstract
The complexity of cancer research stems from leaning on several biomedical disciplines for relevant sources of data, many of which are complex in their own right. A holistic view of cancer—which is critical for precision medicine approaches—hinges on integrating a variety of heterogeneous data sources under a cohesive knowledge model, a role which biomedical ontologies can fill. This study reviews the application of ontologies and knowledge graphs in cancer research. In total, our review encompasses 141 published works, which we categorized under 14 hierarchical categories according to their usage of ontologies and knowledge graphs. We also review the most commonly used ontologies and newly developed ones. Our review highlights the growing traction of ontologies in biomedical research in general, and cancer research in particular. Ontologies enable data accessibility, interoperability and integration, support data analysis, facilitate data interpretation and data mining, and more recently, with the emergence of the knowledge graph paradigm, support the application of Artificial Intelligence methods to unlock new knowledge from a holistic view of the available large volumes of heterogeneous data.

2
Gazzotti R, Faron C, Gandon F, Lacroix-Hugues V, Darmon D. Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction. J Biomed Semantics 2022; 13:6. [PMID: 35193692 PMCID: PMC8861628 DOI: 10.1186/s13326-022-00261-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2021] [Accepted: 12/23/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Artificial intelligence methods applied to electronic medical records (EMRs) hold the potential to help physicians save time by sharpening their analysis and decisions, thereby improving the health of patients. On the one hand, machine learning algorithms have proven their effectiveness in extracting information and exploiting knowledge extracted from data. On the other hand, knowledge graphs capture human knowledge by relying on conceptual schemas and formalization and supporting reasoning. Leveraging knowledge graphs that are legion in the medical field, it is possible to pre-process and enrich data representation used by machine learning algorithms. Medical data standardization is an opportunity to jointly exploit the richness of knowledge graphs and the capabilities of machine learning algorithms.
METHODS We propose to address the problem of hospitalization prediction for patients with an approach that enriches vector representation of EMRs with information extracted from different knowledge graphs before learning and predicting. In addition, we performed an automatic selection of features resulting from knowledge graphs to distinguish noisy ones from those that can benefit the decision making. We report the results of our experiments on the PRIMEGE PACA database that contains more than 600,000 consultations carried out by 17 general practitioners (GPs).
RESULTS A statistical evaluation shows that our proposed approach improves hospitalization prediction. More precisely, injecting features extracted from cross-domain knowledge graphs in the vector representation of EMRs given as input to the prediction algorithm significantly increases the F1 score of the prediction.
CONCLUSIONS By injecting knowledge from recognized reference sources into the representation of EMRs, it is possible to significantly improve the prediction of medical events. Future work would be to evaluate the impact of a feature selection step coupled with a combination of features extracted from several knowledge graphs. A possible avenue is to study more hierarchical levels and properties related to concepts, as well as to integrate more semantic annotators to exploit unstructured data.
Affiliation(s)
- Raphaël Gazzotti
  - Université Côte d'Azur, Inria, CNRS, I3S, 2004, route des Lucioles, Sophia-Antipolis, BP 93 06902, France
- Catherine Faron
  - Université Côte d'Azur, Inria, CNRS, I3S, 2004, route des Lucioles, Sophia-Antipolis, BP 93 06902, France
- Fabien Gandon
  - Université Côte d'Azur, Inria, CNRS, I3S, 2004, route des Lucioles, Sophia-Antipolis, BP 93 06902, France
- Virginie Lacroix-Hugues
  - Université Côte d'Azur, RETINES, Département de Médecine Générale, 28, Avenue de Valombrose, Nice, 06107, France
- David Darmon
  - Université Côte d'Azur, RETINES, Département de Médecine Générale, 28, Avenue de Valombrose, Nice, 06107, France

3
Pendleton SC, Slater K, Karwath A, Gilbert RM, Davis N, Pesudovs K, Liu X, Denniston AK, Gkoutos GV, Braithwaite T. Development and application of the ocular immune-mediated inflammatory diseases ontology enhanced with synonyms from online patient support forum conversation. Comput Biol Med 2021; 135:104542. [PMID: 34139439 PMCID: PMC8404035 DOI: 10.1016/j.compbiomed.2021.104542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Revised: 05/27/2021] [Accepted: 05/30/2021] [Indexed: 11/28/2022]
Abstract
BACKGROUND Unstructured text created by patients represents a rich, but relatively inaccessible resource for advancing patient-centred care. This study aimed to develop an ontology for ocular immune-mediated inflammatory diseases (OcIMIDo), as a tool to facilitate data extraction and analysis, illustrating its application to online patient support forum data.
METHODS We developed OcIMIDo using clinical guidelines, domain expertise, and cross-references to classes from other biomedical ontologies. We developed an approach to add patient-preferred synonyms text-mined from oliviasvision.org online forum, using statistical ranking. We validated the approach with split-sampling and comparison to manual extraction. Using OcIMIDo, we then explored the frequency of OcIMIDo classes and synonyms, and their potential association with natural language sentiment expressed in each online forum post.
FINDINGS OcIMIDo (version 1.2) includes 661 classes, describing anatomy, clinical phenotype, disease activity status, complications, investigations, interventions and functional impacts. It contains 1661 relationships and axioms, 2851 annotations, including 1131 database cross-references, and 187 patient-preferred synonyms. To illustrate OcIMIDo's potential applications, we explored 9031 forum posts, revealing frequent mention of different clinical phenotypes, treatments, and complications. Language sentiment analysis of each post was generally positive (median 0.12, IQR 0.01-0.24). In multivariable logistic regression, the odds of a post expressing negative sentiment were significantly associated with first posts as compared to replies (OR 3.3, 95% CI 2.8 to 3.9, p < 0.001).
CONCLUSION We report the development and validation of a new ontology for inflammatory eye diseases, which includes patient-preferred synonyms, and can be used to explore unstructured patient or physician-reported text data, with many potential applications.
Affiliation(s)
- Samantha C Pendleton
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK
- Karin Slater
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK
- Andreas Karwath
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK
- Rose M Gilbert
  - Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, UK
- Nicola Davis
  - Olivia's Vision, Southampton Buildings, London, UK
- Konrad Pesudovs
  - School of Optometry and Vision Science, University of New South Wales, Australia
- Xiaoxuan Liu
  - University Hospitals Birmingham NHS Foundation Trust, UK; Institute of Inflammation and Ageing, University of Birmingham, UK
- Alastair K Denniston
  - University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK; Institute of Inflammation and Ageing, University of Birmingham, UK
- Georgios V Gkoutos
  - Institute of Cancer and Genomic Sciences, University of Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, UK; Health Data Research, UK
- Tasanee Braithwaite
  - University Hospitals Birmingham NHS Foundation Trust, UK; Institute of Applied Health Research, University of Birmingham, UK; The Medical Eye Unit, St Thomas' Hospital NHS Foundation Trust, London, UK

4
Wojtusiak J, Asadzadehzanjani N, Levy C, Alemi F, Williams AE. Computational Barthel Index: an automated tool for assessing and predicting activities of daily living among nursing home patients. BMC Med Inform Decis Mak 2021; 21:17. [PMID: 33422059 PMCID: PMC7796534 DOI: 10.1186/s12911-020-01368-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/27/2020] [Accepted: 12/08/2020] [Indexed: 12/12/2022] Open
Abstract
Background
Assessment of functional ability, including activities of daily living (ADLs), is a manual process completed by skilled health professionals. In the presented research, an automated decision support tool, the Computational Barthel Index Tool (CBIT), was constructed that can automatically assess and predict probabilities of current and future ADLs based on patients’ medical history.
Methods
The data used to construct the tool include the demographic information, inpatient and outpatient diagnosis codes, and reported disabilities of 181,213 residents of the Department of Veterans Affairs’ (VA) Community Living Centers. Supervised machine learning methods were applied to construct the CBIT. Temporal information about times from the first and the most recent occurrence of diagnoses was encoded. Ten-fold cross-validation was used to tune hyperparameters, and independent test sets were used to evaluate models using AUC, accuracy, recall and precision. Random forest achieved the best model quality. Models were calibrated using isotonic regression.
Results
The unabridged version of CBIT uses 578 patient characteristics and achieved average AUC of 0.94 (0.93–0.95), accuracy of 0.90 (0.89–0.91), precision of 0.91 (0.89–0.92), and recall of 0.90 (0.84–0.95) when re-evaluating patients. CBIT is also capable of predicting ADLs up to one year ahead, with accuracy decreasing over time, giving average AUC of 0.77 (0.73–0.79), accuracy of 0.73 (0.69–0.80), precision of 0.74 (0.66–0.81), and recall of 0.69 (0.34–0.96). A simplified version of CBIT with 50 top patient characteristics reached performance that does not significantly differ from full CBIT.
Conclusion
Discharge planners, disability application reviewers and clinicians evaluating comparative effectiveness of treatments can use CBIT to assess and predict information on functional status of patients.
Affiliation(s)
- Janusz Wojtusiak
  - Health Informatics Program, Department of Health Administration and Policy, George Mason University, Fairfax, VA, USA
- Negin Asadzadehzanjani
  - Health Informatics Program, Department of Health Administration and Policy, George Mason University, Fairfax, VA, USA
- Cari Levy
  - Department of Veterans Affairs, Denver, CO, USA
- Farrokh Alemi
  - Health Informatics Program, Department of Health Administration and Policy, George Mason University, Fairfax, VA, USA

5
Haendel MA, McMurry JA, Relevo R, Mungall CJ, Robinson PN, Chute CG. A Census of Disease Ontologies. Annu Rev Biomed Data Sci 2018. [DOI: 10.1146/annurev-biodatasci-080917-013459] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]
Abstract
For centuries, humans have sought to classify diseases based on phenotypic presentation and available treatments. Today, a wide landscape of strategies, resources, and tools exist to classify patients and diseases. Ontologies can provide a robust foundation of logic for precise stratification and classification along diverse axes such as etiology, development, treatment, and genetics. Disease and phenotype ontologies are used in four primary ways: (a) search, retrieval, and annotation of knowledge; (b) data integration and analysis; (c) clinical decision support; and (d) knowledge discovery. Computational inference can connect existing knowledge and generate new insights and hypotheses about drug targets, prognosis prediction, or diagnosis. In this review, we examine the rise of disease and phenotype ontologies and the diverse ways they are represented and applied in biomedicine.
Affiliation(s)
- Melissa A. Haendel
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
  - Linus Pauling Institute, Oregon State University, Corvallis, Oregon 97331, USA
- Julie A. McMurry
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
- Rose Relevo
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon 97239, USA
- Christopher J. Mungall
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Christopher G. Chute
  - School of Medicine, School of Public Health, and School of Nursing, Johns Hopkins University, Baltimore, Maryland 21205, USA