1
|
Gouareb R, Bornet A, Proios D, Pereira SG, Teodoro D. Detection of Patients at Risk of Multidrug-Resistant Enterobacteriaceae Infection Using Graph Neural Networks: A Retrospective Study. HEALTH DATA SCIENCE 2023; 3:0099. [PMID: 38487204 PMCID: PMC10904075 DOI: 10.34133/hds.0099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 10/23/2023] [Indexed: 03/17/2024]
Abstract
Background: While Enterobacteriaceae bacteria are commonly found in the healthy human gut, their colonization of other body parts can potentially evolve into serious infections and health threats. We investigate a graph-based machine learning model to predict risks of inpatient colonization by multidrug-resistant (MDR) Enterobacteriaceae. Methods: Colonization prediction was defined as a binary task, where the goal is to predict whether a patient is colonized by MDR Enterobacteriaceae in an undesirable body part during their hospital stay. To capture topological features, interactions among patients and healthcare workers were modeled using a graph structure, where patients are described by nodes and their interactions are described by edges. Then, a graph neural network (GNN) model was trained to learn colonization patterns from the patient network enriched with clinical and spatiotemporal features. Results: The GNN model achieves performance between 0.91 and 0.96 area under the receiver operating characteristic curve (AUROC) when trained in inductive and transductive settings, respectively, up to 8% above a logistic regression baseline (0.88). Comparing network topologies, the configuration considering ward-related edges (0.91 inductive, 0.96 transductive) outperforms the configurations considering caregiver-related edges (0.88, 0.89) and both types of edges (0.90, 0.94). For the top 3 most prevalent MDR Enterobacteriaceae, the AUROC varies from 0.94 for Citrobacter freundii up to 0.98 for Enterobacter cloacae using the best-performing GNN model. Conclusion: Topological features via graph modeling improve the performance of machine learning models for Enterobacteriaceae colonization prediction. GNNs could be used to support infection prevention and control programs to detect patients at risk of colonization by MDR Enterobacteriaceae and other bacteria families.
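The ward-related edge construction mentioned in the Methods can be illustrated with a minimal sketch. It assumes, hypothetically, that each stay is a `(patient_id, ward, start_day, end_day)` record and that two patients are linked when their stays in the same ward overlap in time. The record layout and the function name `ward_edges` are illustrative assumptions, not the study's actual schema or code.

```python
from itertools import combinations


def ward_edges(stays):
    """Return undirected patient-patient edges for overlapping stays in one ward.

    stays: iterable of (patient_id, ward, start_day, end_day) tuples.
    This layout is a hypothetical simplification, not the study's data model.
    """
    edges = set()
    for (p1, w1, s1, e1), (p2, w2, s2, e2) in combinations(stays, 2):
        same_ward = w1 == w2
        overlap = s1 <= e2 and s2 <= e1  # closed intervals intersect
        if p1 != p2 and same_ward and overlap:
            # Sort the pair so (A, B) and (B, A) collapse to a single edge.
            edges.add(tuple(sorted((p1, p2))))
    return edges


# Toy data: A and B overlap in the ICU, B and C overlap in the ICU,
# A and C do not overlap, and D is alone in another ward.
stays = [
    ("A", "ICU", 1, 5),
    ("B", "ICU", 4, 9),
    ("C", "ICU", 6, 7),
    ("D", "Surgery", 1, 9),
]
edges = ward_edges(stays)
```

An edge list of this kind is the usual input to a graph neural network library; node features (clinical and spatiotemporal variables in the abstract's terms) would be attached separately.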
Affiliation(s)
- Racha Gouareb
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
- Alban Bornet
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Dimitrios Proios
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
- Douglas Teodoro
  - Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
  - HES-SO University of Applied Sciences and Arts of Western Switzerland, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
2
|
Eysenbach G, Ulrich H, Bergh B, Schreiweis B. Functional Requirements for Medical Data Integration into Knowledge Management Environments: Requirements Elicitation Approach Based on Systematic Literature Analysis. J Med Internet Res 2023; 25:e41344. [PMID: 36757764 PMCID: PMC9951079 DOI: 10.2196/41344] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/22/2022] [Revised: 10/24/2022] [Accepted: 11/17/2022] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND In patient care, data are historically generated and stored in heterogeneous databases that are domain specific and often noninteroperable or isolated. As the amount of health data increases, the number of isolated data silos is also expected to grow, limiting the accessibility of the collected data. Medical informatics is developing ways to move from siloed data to a more harmonized arrangement in information architectures. This paradigm shift will allow future research to integrate medical data at various levels and from various sources. Currently, comprehensive requirements engineering is working on data integration projects in both patient care- and research-oriented contexts, and it is significantly contributing to the success of such projects. In addition to various stakeholder-based methods, document-based requirement elicitation is a valid method for improving the scope and quality of requirements. OBJECTIVE Our main objective was to provide a general catalog of functional requirements for integrating medical data into knowledge management environments. We aimed to identify where integration projects intersect to derive consistent and representative functional requirements from the literature. On the basis of these findings, we identified which functional requirements for data integration exist in the literature and thus provide a general catalog of requirements. METHODS This work began by conducting a literature-based requirement elicitation based on a broad requirement engineering approach. Thus, in the first step, we performed a web-based systematic literature review to identify published articles that dealt with the requirements for medical data integration. We identified and analyzed the available literature by applying the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. 
In the second step, we screened the results for functional requirements using the requirements engineering method of document analysis and derived the requirements into a uniform requirement syntax. Finally, we classified the elicited requirements into a category scheme that represents the data life cycle. RESULTS Our 2-step requirements elicitation approach yielded 821 articles, of which 61 (7.4%) were included in the requirement elicitation process. There, we identified 220 requirements, which were covered by 314 references. We assigned the requirements to different data life cycle categories as follows: 25% (55/220) to data acquisition, 35.9% (79/220) to data processing, 12.7% (28/220) to data storage, 9.1% (20/220) to data analysis, 6.4% (14/220) to metadata management, 2.3% (5/220) to data lineage, 3.2% (7/220) to data traceability, and 5.5% (12/220) to data security. CONCLUSIONS The aim of this study was to present a cross-section of functional data integration-related requirements defined in the literature by other researchers. The aim was achieved with 220 distinct requirements from 61 publications. We concluded that scientific publications are, in principle, a reliable source of information for functional requirements with respect to medical data integration. Finally, we provide a broad catalog to support other scientists in the requirement elicitation phase.
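The category shares reported in the Results follow directly from the counts. The short check below recomputes them; the counts are copied from the abstract, while the variable names are illustrative.

```python
# Requirement counts per data life cycle category, as reported in the abstract.
counts = {
    "data acquisition": 55,
    "data processing": 79,
    "data storage": 28,
    "data analysis": 20,
    "metadata management": 14,
    "data lineage": 5,
    "data traceability": 7,
    "data security": 12,
}

total = sum(counts.values())

# Share of each category in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Share of screened articles that entered the elicitation process (61 of 821).
included_share = round(100 * 61 / 821, 1)

for name, share in shares.items():
    print(f"{name}: {counts[name]}/{total} = {share}%")
print(f"included articles: 61/821 = {included_share}%")
```

The eight categories sum to the 220 requirements stated in the abstract, so the scheme assigns each requirement to exactly one category.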
Affiliation(s)
- G Eysenbach
  - Institute for Medical Informatics and Statistics, Kiel University and University Hospital Schleswig-Holstein, Kiel, Germany
- Hannes Ulrich
  - Institute for Medical Informatics and Statistics, Kiel University and University Hospital Schleswig-Holstein, Kiel, Germany
- Björn Bergh
  - Institute for Medical Informatics and Statistics, Kiel University and University Hospital Schleswig-Holstein, Kiel, Germany
- Björn Schreiweis
  - Institute for Medical Informatics and Statistics, Kiel University and University Hospital Schleswig-Holstein, Kiel, Germany
|
3
|
Zhang H, Lyu T, Yin P, Bost S, He X, Guo Y, Prosperi M, Hogan WR, Bian J. A scoping review of semantic integration of health data and information. Int J Med Inform 2022; 165:104834. [PMID: 35863206 DOI: 10.1016/j.ijmedinf.2022.104834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 07/06/2022] [Accepted: 07/13/2022] [Indexed: 11/25/2022]
Abstract
OBJECTIVE We summarized a decade of new research on semantic data integration (SDI) since 2009, and we aim to: (1) summarize the state-of-the-art approaches to integrating health data and information; and (2) identify the main gaps and challenges of integrating health data and information from multiple levels and domains. MATERIALS AND METHODS Because our focus is applications of SDI in biomedical domains, we used PubMed and followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to search for and report relevant studies published between January 1, 2009 and December 31, 2021. We used Covidence, a systematic review management system, to carry out this scoping review. RESULTS The initial PubMed search using the two sets of keywords returned 5,326 articles. We removed 44 duplicates and retained 5,282 articles for abstract screening. After abstract screening, we included 246 articles for full-text screening, among which 87 articles were deemed eligible for full-text extraction. We summarized the 87 articles from four aspects: (1) methods for the global schema; (2) data integration strategies (i.e., federated system vs. data warehousing); (3) the sources of the data; and (4) downstream applications. CONCLUSION The SDI approach can effectively resolve semantic heterogeneities across different data sources. We identified two key gaps and challenges in existing SDI studies: (1) many of them used data from only single-level data sources (e.g., integrating individual-level patient records from different hospital systems), and (2) documentation of the data integration processes is sparse, threatening the reproducibility of SDI studies.
Affiliation(s)
- Hansi Zhang
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Tianchen Lyu
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Pengfei Yin
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Sarah Bost
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Xing He
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Yi Guo
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Mattia Prosperi
  - Department of Epidemiology, College of Medicine, University of Florida, Gainesville, FL, United States
- William R Hogan
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
- Jiang Bian
  - Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States
|
4
|
On Graph Construction for Classification of Clinical Trials Protocols Using Graph Neural Networks. Artif Intell Med 2022. [DOI: 10.1007/978-3-031-09342-5_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
5
|
Lv J, Deng S, Zhang L. A review of artificial intelligence applications for antimicrobial resistance. BIOSAFETY AND HEALTH 2021. [DOI: 10.1016/j.bsheal.2020.08.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/20/2022] Open
|
6
|
Torres Silva EA, Uribe S, Smith J, Luna Gomez IF, Florez-Arango JF. XML Data and Knowledge-Encoding Structure for a Web-Based and Mobile Antenatal Clinical Decision Support System: Development Study. JMIR Form Res 2020; 4:e17512. [PMID: 33064087 PMCID: PMC7600017 DOI: 10.2196/17512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/17/2019] [Revised: 06/24/2020] [Accepted: 08/16/2020] [Indexed: 01/30/2023] Open
Abstract
Background Displeasure with the functionality of clinical decision support systems (CDSSs) is considered the primary challenge in CDSS development. A major difficulty in CDSS design is matching the functionality to the desired and actual clinical workflow. Computer-interpretable guidelines (CIGs) are used to formalize medical knowledge in clinical practice guidelines (CPGs) in a computable language. However, existing CIG frameworks require a specific interpreter for each CIG language, hindering the ease of implementation and interoperability. Objective This paper aims to describe a different approach to the representation of clinical knowledge and data. We intended to change the clinician’s perception of a CDSS with sufficient expressivity of the representation while maintaining a small communication and software footprint for both a web application and a mobile app. This approach was originally intended to create a readable and minimal syntax for a web CDSS and future mobile app for antenatal care guidelines with improved human-computer interaction and enhanced usability by aligning the system behavior with clinical workflow. Methods We designed and implemented an architecture design for our CDSS, which uses the model-view-controller (MVC) architecture and a knowledge engine in the MVC architecture based on XML. The knowledge engine design also integrated the requirement of matching clinical care workflow that was desired in the CDSS. For this component of the design task, we used a work ontology analysis of the CPGs for antenatal care in our particular target clinical settings. Results In comparison to other common CIGs used for CDSSs, our XML approach can be used to take advantage of the flexible format of XML to facilitate the electronic sharing of structured data. 
More importantly, we can take advantage of its flexibility to standardize CIG structure design in a low-level specification language that is ubiquitous, universal, computationally efficient, integrable with web technologies, and human readable. Conclusions Our knowledge representation framework incorporates fundamental elements of other CIGs used in CDSSs in medicine and proved adequate to encode a number of antenatal health care CPGs and their associated clinical workflows. The framework appears general enough to be used with other CPGs in medicine. XML proved to be a language expressive enough to describe planning problems in a computable form and restrictive and expressive enough to implement in a clinical system. It can also be effective for mobile apps, where intermittent communication requires a small footprint and an autonomous app. This approach can be used to incorporate overlapping capabilities of more specialized CIGs in medicine.
Affiliation(s)
- Sebastian Uribe
  - Bioengineering Research Group, Universidad Pontificia Bolivariana, Medellin, Colombia
- Jack Smith
  - Department of Microbial Pathogenesis and Immunology, Texas A&M University, College Station, TX, United States
|
7
|
George J, Häsler B, Mremi I, Sindato C, Mboera L, Rweyemamu M, Mlangwa J. A systematic review on integration mechanisms in human and animal health surveillance systems with a view to addressing global health security threats. ONE HEALTH OUTLOOK 2020; 2:11. [PMID: 33829132 PMCID: PMC7993536 DOI: 10.1186/s42522-020-00017-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/08/2019] [Accepted: 05/05/2020] [Indexed: 05/20/2023]
Abstract
BACKGROUND Health surveillance is an important element of disease prevention, control, and management. During the past two decades, there have been several initiatives to integrate health surveillance systems using various mechanisms ranging from the integration of data sources to changing organizational structures and responses. The need for integration is caused by an increasing demand for joint data collection, use and preparedness for emerging infectious diseases. OBJECTIVE To review the integration mechanisms in human and animal health surveillance systems and identify their contributions in strengthening surveillance systems attributes. METHOD The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols (PRISMA-P) 2015 checklist. Peer-reviewed articles were searched from PubMed, HINARI, Web of Science, Science Direct and advanced Google search engines. The review included articles published in English from 1900 to 2018. The study selection considered all articles that used quantitative, qualitative or mixed research methods. Eligible articles were assessed independently for quality by two authors using the QualSyst Tool and relevant information including year of publication, field, continent, addressed attributes and integration mechanism were extracted. RESULTS A total of 102 publications were identified and categorized into four pre-set integration mechanisms: interoperability (35), convergent integration (27), semantic consistency (21) and interconnectivity (19). Most integration mechanisms focused on sensitivity (44.1%), timeliness (41.2%), data quality (23.5%) and acceptability (17.6%) of the surveillance systems. Generally, the majority of the surveillance system integrations were centered on addressing infectious diseases and all hazards. 
The sensitivity of the integrated systems reported in these studies ranged from 63.9 to 100% (median = 79.6%, n = 16) and the rate of data quality improvement ranged from 73 to 95.4% (median = 87%, n = 4). The integrated systems were also shown to improve timeliness, with reported changes ranging from 10 to 91% (median = 67.3%, n = 8). CONCLUSION Interoperability and semantic consistency are the common integration mechanisms in human and animal health surveillance systems. Surveillance system integration is a relatively new concept but has already been shown to enhance surveillance performance. More studies are needed to gain information on further surveillance attributes.
Affiliation(s)
- Janeth George
  - Department of Veterinary Medicine and Public Health, Sokoine University of Agriculture, P.O. Box 3021, Morogoro, Tanzania
  - SACIDS Foundation for One Health, Sokoine University of Agriculture, P.O. Box 3297, Morogoro, Tanzania
- Barbara Häsler
  - Department of Pathobiology and Population Sciences, Veterinary Epidemiology, Economics, and Public Health Group, Royal Veterinary College, Hawkshead Lane, North Mymms, Hatfield, Hertfordshire, AL9 7TA, UK
- Irene Mremi
  - Department of Veterinary Medicine and Public Health, Sokoine University of Agriculture, P.O. Box 3021, Morogoro, Tanzania
  - SACIDS Foundation for One Health, Sokoine University of Agriculture, P.O. Box 3297, Morogoro, Tanzania
- Calvin Sindato
  - SACIDS Foundation for One Health, Sokoine University of Agriculture, P.O. Box 3297, Morogoro, Tanzania
  - National Institute for Medical Research, Tabora Research Centre, Tabora, Tanzania
- Leonard Mboera
  - SACIDS Foundation for One Health, Sokoine University of Agriculture, P.O. Box 3297, Morogoro, Tanzania
- Mark Rweyemamu
  - SACIDS Foundation for One Health, Sokoine University of Agriculture, P.O. Box 3297, Morogoro, Tanzania
- James Mlangwa
  - Department of Veterinary Medicine and Public Health, Sokoine University of Agriculture, P.O. Box 3021, Morogoro, Tanzania
|
8
|
Abstract
Surveillance of antibiotic resistance involves the collection of antibiotic susceptibility patterns undertaken by clinical microbiology laboratories on bacteria isolated from clinical specimens. Global surveillance programs have shown that antibiotic resistance is a major threat to the public at large and play a crucial role in the development of enhanced diagnostics as well as potential vaccines and novel antibiotics with activity against antimicrobial-resistant organisms. This review focuses primarily on examples of global surveillance systems. Local, national, and global integrated surveillance programs with sufficient data linkage between these schemes, accompanied by enhanced genomics and user-friendly bioinformatics systems, promise to overcome some of the stumbling blocks encountered in the understanding, emergence, and transmission of antimicrobial-resistant organisms.
|
9
|
Teodoro D, Mottin L, Gobeill J, Gaudinat A, Vachon T, Ruch P. Improving average ranking precision in user searches for biomedical research datasets. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2017:4600047. [PMID: 29220475 PMCID: PMC5714153 DOI: 10.1093/database/bax083] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/21/2017] [Accepted: 10/12/2017] [Indexed: 11/15/2022]
Abstract
Availability of research datasets is a keystone for health and life science study reproducibility and scientific progress. Due to the heterogeneity and complexity of these data, a main challenge to be overcome by research data management systems is to provide users with the best answers for their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorization method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus with 800k datasets and 21 annotated user queries, and provided competitive results when compared to the other challenge participants. In the official run, it achieved the highest infAP, 22.3% higher than the median infAP of the participants' best submissions. Overall, it ranks in the top 2 if an aggregated metric using the best official measures per participant is considered. The query expansion method showed a positive impact on the system's performance, increasing our baseline by up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively. The similarity measure algorithm showed robust performance in different training conditions, with small performance variations compared to the Divergence from Randomness framework. Finally, the result categorization did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. The use of data-driven expansion methods, such as those based on word embeddings, could be an alternative to the complexity of biomedical terminologies. Nevertheless, due to the limited size of the assessment set, further experiments need to be performed to draw conclusive results.
Database URL: https://biocaddie.org/benchmark-data
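Query expansion based on word embeddings can be sketched as a nearest-neighbour lookup. The abstract does not describe the authors' expansion procedure in detail, so the following is a generic illustration rather than their pipeline: the toy three-dimensional vectors, the function names `cosine` and `expand_query`, and the parameters `k` and `min_sim` are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, embeddings, k=1, min_sim=0.8):
    """Append to the query the k nearest neighbours of each term.

    Only neighbours with cosine similarity >= min_sim are added.
    Both k and min_sim are hypothetical tuning parameters.
    """
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        scored = sorted(
            (
                (cosine(embeddings[term], vec), other)
                for other, vec in embeddings.items()
                if other not in expanded
            ),
            reverse=True,
        )
        expanded.extend(other for sim, other in scored[:k] if sim >= min_sim)
    return expanded


# Toy embeddings invented for illustration; real vectors would come from a
# model trained on a biomedical corpus.
embeddings = {
    "cancer": (0.9, 0.1, 0.0),
    "tumor": (0.8, 0.2, 0.1),
    "mouse": (0.0, 0.1, 0.9),
    "murine": (0.1, 0.0, 0.95),
    "protein": (0.3, 0.9, 0.1),
}
expanded = expand_query(["cancer", "mouse"], embeddings)
```

A similarity threshold such as `min_sim` is one simple way to keep expansion from adding loosely related terms, which would otherwise dilute ranking precision.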
Affiliation(s)
- Douglas Teodoro
  - Text Mining Group, SIB Swiss Institute of Bioinformatics, 1227 Geneva, Switzerland
  - Department of Information Science, HEG Geneva HES-SO, 1227 Geneva, Switzerland
- Luc Mottin
  - Text Mining Group, SIB Swiss Institute of Bioinformatics, 1227 Geneva, Switzerland
  - Department of Information Science, HEG Geneva HES-SO, 1227 Geneva, Switzerland
- Julien Gobeill
  - Text Mining Group, SIB Swiss Institute of Bioinformatics, 1227 Geneva, Switzerland
  - Department of Information Science, HEG Geneva HES-SO, 1227 Geneva, Switzerland
- Arnaud Gaudinat
  - Department of Information Science, HEG Geneva HES-SO, 1227 Geneva, Switzerland
- Thérèse Vachon
  - Novartis Institutes for BioMedical Research-Text Mining Services (NIBR Informatics/TMS), Novartis Pharma AG Postfach, 4002 Basel, Switzerland
- Patrick Ruch
  - Text Mining Group, SIB Swiss Institute of Bioinformatics, 1227 Geneva, Switzerland
  - Department of Information Science, HEG Geneva HES-SO, 1227 Geneva, Switzerland
|
10
|
Teodoro D, Sundvall E, João Junior M, Ruch P, Miranda Freire S. ORBDA: An openEHR benchmark dataset for performance assessment of electronic health record servers. PLoS One 2018; 13:e0190028. [PMID: 29293556 PMCID: PMC5749730 DOI: 10.1371/journal.pone.0190028] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/27/2017] [Accepted: 12/06/2017] [Indexed: 11/19/2022] Open
Abstract
The openEHR specifications are designed to support implementation of flexible and interoperable Electronic Health Record (EHR) systems. Despite the increasing number of solutions based on the openEHR specifications, it is difficult to find publicly available healthcare datasets in the openEHR format that can be used to test, compare and validate different data persistence mechanisms for openEHR. To foster research on openEHR servers, we present the openEHR Benchmark Dataset, ORBDA, a very large healthcare benchmark dataset encoded using the openEHR formalism. To construct ORBDA, we extracted and cleaned a de-identified dataset from the Brazilian National Healthcare System (SUS) containing hospitalisation and high complexity procedures information and formalised it using a set of openEHR archetypes and templates. Then, we implemented a tool to enrich the raw relational data and convert it into the openEHR model using the openEHR Java reference model library. The ORBDA dataset is available in composition, versioned composition and EHR openEHR representations in XML and JSON formats. In total, the dataset contains more than 150 million composition records. We describe the dataset and provide means to access it. Additionally, we demonstrate the usage of ORBDA for evaluating inserting throughput and query latency performances of some NoSQL database management systems. We believe that ORBDA is a valuable asset for assessing storage models for openEHR-based information systems during the software engineering process. It may also be a suitable component in future standardised benchmarking of available openEHR storage platforms.
Affiliation(s)
- Douglas Teodoro
  - Departamento de Tecnologia da Informação e Educação em Saúde, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil
  - SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
  - Department of Information Science, HEG-Geneva, HES-SO, Geneva, Switzerland
- Erik Sundvall
  - Department of Biomedical Engineering, Linköping University, Linköping, Sweden
  - Region Östergötland, Linköping, Sweden
- Mario João Junior
  - Departamento de Tecnologia da Informação e Educação em Saúde, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil
- Patrick Ruch
  - SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
  - Department of Information Science, HEG-Geneva, HES-SO, Geneva, Switzerland
- Sergio Miranda Freire
  - Departamento de Tecnologia da Informação e Educação em Saúde, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil
|
11
|
Meystre SM, Lovis C, Bürkle T, Tognola G, Budrionis A, Lehmann CU. Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress. Yearb Med Inform 2017; 26:38-52. [PMID: 28480475 PMCID: PMC6239225 DOI: 10.15265/iy-2017-007] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 05/08/2017] [Indexed: 12/30/2022] Open
Abstract
Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realize the potentials for high quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research.
Affiliation(s)
- S. M. Meystre
  - Medical University of South Carolina, Charleston, SC, USA
- C. Lovis
  - Division of Medical Information Sciences, University Hospitals of Geneva, Switzerland
- T. Bürkle
  - University of Applied Sciences, Bern, Switzerland
- G. Tognola
  - Institute of Electronics, Computer and Telecommunication Engineering, Italian Natl. Research Council IEIIT-CNR, Milan, Italy
- A. Budrionis
  - Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, Norway
- C. U. Lehmann
  - Departments of Biomedical Informatics and Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
|
12
|
Elkin PL, Brown SH. ICD9-CM Claims Data are Insufficient for Influenza Surveillance. Int Arch Med 2016; 9. [PMID: 32346398 PMCID: PMC7188306 DOI: 10.3823/2075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
Background: Influenza and influenza-like illness are representative of a class of epidemic infectious diseases that have important public health implications. Early detection via biosurveillance can speed lifesaving public health responses. In the United States, biosurveillance is typically conducted using ICD9 coded visit diagnoses and uncoded chief complaint data. Objective: To determine the accuracy of ICD9 diagnoses using laboratory confirmed cases as the gold standard. Design: A six-year retrospective cohort study. Setting: A tertiary referral center. Patients: All 3,825 patients with an ICD9-CM diagnosis of Influenza and all 1,455 patients with laboratory confirmed Influenza. Results: Of the 3,828 patients assigned ICD9-CM visit codes indicating a diagnosis of Influenza, 2,825 were not confirmed by laboratory testing and 1,003 patients underwent laboratory testing. Only 664 (66.2%) tested positive for Influenza. Of the 1,455 patients who tested positive for Influenza, 45.6% were identified by ICD9-CM code. Conclusion: ICD9-CM had a low 66.2% Positive Predictive Value (precision) for Influenza and a low 45.6% Sensitivity (recall) for Influenza in patients tested for Influenza. ICD9 coded visit diagnoses/claims data are insufficient alone to serve as the basis for Influenza Surveillance. Primary Funding Source: CDC grants PH00022 and HK00014.
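The two headline figures can be recomputed from the counts in the Results. The sketch below treats the 664 coded patients who tested positive as the true positives for both measures; the abstract gives the sensitivity only as a percentage, so using 664 as its numerator is an assumption, and the helper name `percent` is illustrative.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


coded_and_tested = 1003   # patients with an ICD9-CM influenza code who were tested
true_positives = 664      # of those, laboratory-confirmed influenza
lab_confirmed = 1455      # all laboratory-confirmed influenza patients

# Positive predictive value (precision): confirmed cases among coded, tested patients.
ppv = percent(true_positives, coded_and_tested)

# Sensitivity (recall): coded cases among all laboratory-confirmed patients.
# Assumption: the numerator equals the 664 coded-and-confirmed patients.
sensitivity = percent(true_positives, lab_confirmed)

print(f"PPV = {ppv}%, sensitivity = {sensitivity}%")
```

Under that assumption both values match the percentages stated in the Conclusion.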
Affiliation(s)
- Peter L Elkin
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, SUNY
- Steven H Brown
- Department of Biomedical Informatics, Vanderbilt University; Department of Veterans Affairs
|
13
|
Simões AS, Couto I, Toscano C, Gonçalves E, Póvoa P, Viveiros M, Lapão LV. Prevention and Control of Antimicrobial Resistant Healthcare-Associated Infections: The Microbiology Laboratory Rocks! Front Microbiol 2016; 7:855. [PMID: 27375577 PMCID: PMC4895126 DOI: 10.3389/fmicb.2016.00855] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 12/22/2015] [Accepted: 05/23/2016] [Indexed: 12/30/2022]
Abstract
In Europe, each year, more than four million patients acquire a healthcare-associated infection (HAI) and almost 40 thousand die as a direct consequence of it. Despite many strategies to prevent and control HAIs, they remain an important cause of morbidity and mortality worldwide with a significant economic impact: a recent estimate places it at ten billion dollars per year. The control of HAIs requires a prompt and efficient identification of the etiological agent and a rapid communication with the clinician. The Microbiology Laboratory has a significant role in the prevention and control of these infections and is a key element of any Infection Control Program. The work of the Microbiology Laboratory covers microbial isolation and identification, determination of antimicrobial susceptibility patterns, epidemiological surveillance and outbreak detection, education, and reporting of quality-assured results. In this paper we address the role and importance of the Microbiology Laboratory in the prevention and control of HAI and in Antibiotic Stewardship Programs and how it can be leveraged when combined with the use of information systems. Additionally, we critically review some challenges that the Microbiology Laboratory has to deal with, including the selection of analytic methods and the proper use of communication channels with other healthcare services.
Affiliation(s)
- Alexandra S. Simões
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
- Isabel Couto
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
- Cristina Toscano
- Laboratório de Microbiologia Clínica e Biologia Molecular, Serviço de Patologia Clínica, Hospital de Egas Moniz, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
- Centro de Estudos de Doenças Crónicas, NOVA Medical School/Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Lisbon, Portugal
- Elsa Gonçalves
- Laboratório de Microbiologia Clínica e Biologia Molecular, Serviço de Patologia Clínica, Hospital de Egas Moniz, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
- Centro de Estudos de Doenças Crónicas, NOVA Medical School/Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Lisbon, Portugal
- Pedro Póvoa
- Centro de Estudos de Doenças Crónicas, NOVA Medical School/Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Lisbon, Portugal
- Unidade de Cuidados Intensivos Polivalente, Hospital de São Francisco Xavier, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
- Miguel Viveiros
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
- Luís V. Lapão
- Global Health and Tropical Medicine, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
- WHO Collaborating Center for Health Workforce Policy and Planning, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
|
14
|
Kawazoe Y, Imai T, Ohe K. A Querying Method over RDF-ized Health Level Seven v2.5 Messages Using Life Science Knowledge Resources. JMIR Med Inform 2016; 4:e12. [PMID: 27050304 PMCID: PMC4837294 DOI: 10.2196/medinform.5275] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/27/2015] [Revised: 01/11/2016] [Accepted: 02/19/2016] [Indexed: 11/23/2022]
Abstract
Background Health level seven version 2.5 (HL7 v2.5) is a widespread messaging standard for information exchange between clinical information systems. By applying Semantic Web technologies for handling HL7 v2.5 messages, it is possible to integrate large-scale clinical data with life science knowledge resources. Objective To show the feasibility of a querying method over large-scale resource description framework (RDF)-ized HL7 v2.5 messages using publicly available drug databases. Methods We developed a method to convert HL7 v2.5 messages into the RDF. We also converted five kinds of drug databases into RDF and provided explicit links between the corresponding items among them. With those linked drug data, we then developed a method for query expansion to search the clinical data using semantic information on drug classes along with four types of temporal patterns. For evaluation purposes, medication orders and laboratory test results for a 3-year period at the University of Tokyo Hospital were used, and the query execution times were measured. Results Approximately 650 million RDF triples for medication orders and 790 million RDF triples for laboratory test results were converted. Taking three types of query in use cases for detecting adverse events of drugs as an example, we confirmed these queries were represented in SPARQL Protocol and RDF Query Language (SPARQL) using our methods, and comparisons with conventional query expressions were performed. The measurement results confirm that the query time is feasible and increases logarithmically or linearly with the amount of data and without diverging. Conclusions The proposed methods enabled query expressions that separate knowledge resources and clinical data, thereby suggesting the feasibility of improving the usability of clinical data by enhancing the knowledge resources.
We also demonstrate that when HL7 v2.5 messages are automatically converted into RDF, searches are still possible through SPARQL without modifying the structure. As such, the proposed method benefits not only our hospitals, but also numerous hospitals that handle HL7 v2.5 messages. Our approach highlights the potential of large-scale data federation techniques to retrieve clinical information, which could be applied as applications of clinical intelligence to improve clinical practices, such as adverse drug event monitoring and cohort selection for a clinical study as well as discovering new knowledge from clinical information.
Affiliation(s)
- Yoshimasa Kawazoe
- Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan.
|
15
|
Development of a clinical decision support system for antibiotic management in a hospital environment. PROGRESS IN ARTIFICIAL INTELLIGENCE 2016. [DOI: 10.1007/s13748-016-0089-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 10/22/2022]
|
16
|
Perez F, Villegas MV. The role of surveillance systems in confronting the global crisis of antibiotic-resistant bacteria. Curr Opin Infect Dis 2015; 28:375-83. [PMID: 26098505 PMCID: PMC4707665 DOI: 10.1097/qco.0000000000000182] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Indexed: 11/25/2022]
Abstract
PURPOSE OF REVIEW It is widely accepted that infection control, advanced diagnostics, and novel therapeutics are crucial to mitigate the impact of antibiotic-resistant bacteria. The role of global, national, and regional surveillance systems as part of the response to the challenge posed by antibiotic resistance is not sufficiently highlighted. We provide an overview of contemporary surveillance programs, with emphasis on gram-negative bacteria. RECENT FINDINGS The WHO and public health agencies in Europe and the United States recently published comprehensive surveillance reports. These highlight the emergence and dissemination of carbapenem-resistant Enterobacteriaceae and other multidrug-resistant gram-negative bacteria. In Israel, public health action to control carbapenem-resistant Enterobacteriaceae, especially Klebsiella pneumoniae carbapenemase producing K. pneumoniae, has advanced together with a better understanding of its epidemiology. Surveillance models adapted to the requirements and capacities of each country are in development. SUMMARY Robust surveillance systems are essential to combat antibiotic resistance, and need to emphasize a 'one health' approach. Refinements in surveillance will come from advances in bioinformatics and genomics that permit the integration of global and local information about antibiotic consumption in humans and animals, molecular mechanisms of resistance, and bacterial genotyping.
Affiliation(s)
- Federico Perez
- Louis Stokes Cleveland Department of Veterans Affairs Medical Center and Case Western Reserve University School of Medicine; Cleveland, Ohio, United States
|
17
|
Ping XO, Chung Y, Tseng YJ, Liang JD, Yang PM, Huang GT, Lai F. A web-based data-querying tool based on ontology-driven methodology and flowchart-based model. JMIR Med Inform 2013; 1:e2. [PMID: 25600078 PMCID: PMC4288233 DOI: 10.2196/medinform.2519] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/06/2013] [Revised: 06/20/2013] [Accepted: 08/17/2013] [Indexed: 11/29/2022]
Abstract
Background Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. Objective The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. Methods The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. Results In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. 
The accuracy of the three queries (ie, “degree of liver damage,” “degree of liver damage when applying a mutually exclusive setting,” and “treatments for liver cancer”) was 100% for all four experiments (10 patients, 100 patients, 1000 patients, and 10,000 patients). Among the three measured query phases, (1) structured query language operations, (2) criteria verification, and (3) other, the first two had the longest execution time. Conclusions The ontology-driven FBDQM-based approach enriched the capabilities of the data-querying system. The adoption of the GLIF3.5 increased the potential for interoperability, shareability, and reusability of the query tasks.
Affiliation(s)
- Xiao-Ou Ping
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
|
18
|
Chen WH, Hsieh SL, Hsu KP, Chen HP, Su XY, Tseng YJ, Chien YH, Hwu WL, Lai F. Web-based newborn screening system for metabolic diseases: machine learning versus clinicians. J Med Internet Res 2013; 15:e98. [PMID: 23702487 PMCID: PMC3668606 DOI: 10.2196/jmir.2495] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 12/15/2012] [Revised: 03/04/2013] [Accepted: 04/07/2013] [Indexed: 01/06/2023]
Abstract
Background A hospital information system (HIS) that integrates screening data and interpretation of the data is routinely requested by hospitals and parents. However, the accuracy of disease classification may be low because of the disease characteristics and the analytes used for classification. Objective The objective of this study is to describe a system that enhanced the neonatal screening system of the Newborn Screening Center at the National Taiwan University Hospital. The system was designed and deployed according to a service-oriented architecture (SOA) framework under the Web services .NET environment. The system consists of sample collection, testing, diagnosis, evaluation, treatment, and follow-up services among collaborating hospitals. To improve the accuracy of newborn screening, machine learning and optimal feature selection mechanisms were investigated for screening newborns for inborn errors of metabolism. Methods The framework of the Newborn Screening Hospital Information System (NSHIS) used the embedded Health Level Seven (HL7) standards for data exchanges among heterogeneous platforms integrated by Web services in the C# language. In this study, machine learning classification was used to predict phenylketonuria (PKU), hypermethioninemia, and 3-methylcrotonyl-CoA-carboxylase (3-MCC) deficiency. The classification methods used 347,312 newborn dried blood samples collected at the Center between 2006 and 2011. Of these, 220 newborns had values over the diagnostic cutoffs (positive cases) and 1557 had values that were over the screening cutoffs but did not meet the diagnostic cutoffs (suspected cases). The original 35 analytes and the manifested features were ranked based on F score, then combinations of the top 20 ranked features were selected as input features to support vector machine (SVM) classifiers to obtain optimal feature sets. These feature sets were tested using 5-fold cross-validation and optimal models were generated. 
The datasets collected in 2011 were used as predicting cases. Results The feature selection strategies were implemented and the optimal markers for PKU, hypermethioninemia, and 3-MCC deficiency were obtained. The results of the machine learning approach were compared with the cutoff scheme. The number of false-positive cases was reduced from 21 to 2 for PKU, from 30 to 10 for hypermethioninemia, and from 209 to 46 for 3-MCC deficiency. Conclusions This SOA Web service–based newborn screening system can accelerate screening procedures effectively and efficiently. An SVM learning methodology for PKU, hypermethioninemia, and 3-MCC deficiency metabolic diseases classification, including optimal feature selection strategies, is presented. By adopting the results of this study, the number of suspected cases could be reduced dramatically.
Affiliation(s)
- Wei-Hsin Chen
- National Taiwan University, Graduate Institute of Biomedical Electronics and Bioinformatics, Taipei, Taiwan
|
19
|
Teodoro D, Lovis C. Empirical mode decomposition and k-nearest embedding vectors for timely analyses of antibiotic resistance trends. PLoS One 2013; 8:e61180. [PMID: 23637796 PMCID: PMC3636283 DOI: 10.1371/journal.pone.0061180] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/26/2012] [Accepted: 03/07/2013] [Indexed: 12/03/2022]
Abstract
Background Antibiotic resistance is a major worldwide public health concern. In clinical settings, timely antibiotic resistance information is key for care providers as it allows appropriate targeted treatment or improved empirical treatment when the specific results of the patient are not yet available. Objective To improve antibiotic resistance trend analysis algorithms by building a novel, fully data-driven forecasting method from the combination of trend extraction and machine learning models for enhanced biosurveillance systems. Methods We investigate a robust model for extraction and forecasting of antibiotic resistance trends using a decade of microbiology data. Our method consists of breaking down the resistance time series into independent oscillatory components via the empirical mode decomposition technique. The resulting waveforms describing intrinsic resistance trends serve as the input for the forecasting algorithm. The algorithm applies the delay coordinate embedding theorem together with the k-nearest neighbor framework to project mappings from past events into the future dimension and estimate the resistance levels. Results The algorithms that decompose the resistance time series and filter out high frequency components showed statistically significant performance improvements in comparison with a benchmark random walk model. We present further qualitative use-cases of antibiotic resistance trend extraction, where empirical mode decomposition was applied to highlight the specificities of the resistance trends. Conclusion The decomposition of the raw signal was found not only to yield valuable insight into the resistance evolution, but also to produce novel models of resistance forecasters with boosted prediction performance, which could be utilized as a complementary method in the analysis of antibiotic resistance trends.
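The forecasting step described in this abstract (delay coordinate embedding combined with k-nearest neighbors) can be illustrated with a small sketch. This is not the authors' implementation: it omits the empirical mode decomposition stage entirely, and the function names, the Euclidean distance, the unweighted averaging of successors, and the default parameters `dim`, `tau`, and `k` are all assumptions made for illustration.

```python
import math


def delay_vectors(series, dim, tau=1):
    """Return (t, [x[t], x[t - tau], ..., x[t - (dim - 1) * tau]]) for every valid t."""
    start = (dim - 1) * tau
    return [
        (t, [series[t - j * tau] for j in range(dim)])
        for t in range(start, len(series))
    ]


def knn_forecast(series, dim=3, tau=1, k=3):
    """One-step-ahead forecast: average the successors of the k past states
    whose delay vectors lie closest to the most recent delay vector."""
    embedded = delay_vectors(series, dim, tau)
    _, query = embedded[-1]  # most recent state; its successor is unknown
    scored = [
        (math.dist(vec, query), series[t + 1])  # (distance, value that followed)
        for t, vec in embedded[:-1]
    ]
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Toy periodic series standing in for a resistance-rate signal.
toy_series = [0, 1, 2, 3] * 5
print(knn_forecast(toy_series, dim=3, k=3))
```

On the toy series the most recent state has exact matches in the past, so the forecast equals the value that followed each of those matches. The series must be long enough to yield at least one past delay vector.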
Affiliation(s)
- Douglas Teodoro
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland.
|
20
|
Assisted knowledge discovery for the maintenance of clinical guidelines. PLoS One 2013; 8:e62874. [PMID: 23646153 PMCID: PMC3639894 DOI: 10.1371/journal.pone.0062874] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 09/14/2012] [Accepted: 03/28/2013] [Indexed: 11/19/2022]
Abstract
Background Improving antibiotic prescribing practices is an important public-health priority given the widespread antimicrobial resistance. Establishing clinical practice guidelines is crucial to this effort, but their development is a complex task and their quality is directly related to the methodology and source of knowledge used. Objective We present the design and the evaluation of a tool (KART) that aims to facilitate the creation and maintenance of clinical practice guidelines based on information retrieval techniques. Methods KART consists of three main modules: 1) a literature-based medical knowledge extraction module, which is built upon a specialized question-answering engine; 2) a module to normalize clinical recommendations based on automatic text categorizers; and 3) a module to manage clinical knowledge, which formalizes and stores clinical recommendations for further use. The evaluation of the usability and utility of KART followed the methodology of the cognitive walkthrough. Results KART was designed and implemented as a standalone web application. The quantitative evaluation of the medical knowledge extraction module showed that 53% of the clinical recommendations generated by KART are consistent with existing clinical guidelines. The user-based evaluation confirmed this result by showing that KART was able to find a relevant antibiotic for half of the clinical scenarios tested. The automatic normalization of the recommendation produced mixed results among end-users. Conclusions We have developed an innovative approach for the process of clinical guidelines development and maintenance in a context where available knowledge is increasing at a rate that cannot be sustained by humans. In contrast to existing knowledge authoring tools, KART not only provides assistance to normalize, formalize and store clinical recommendations, but also aims to facilitate knowledge building.
|