76
|
Burgun A, Seka LP, Delamarre D, Le Beux P. Automated Coding of Patient Discharge Summaries Using Conceptual Graphs. Methods Inf Med 2018. [DOI: 10.1055/s-0038-1634611] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/17/2022]
Abstract
In medicine, as in other domains, indexing and classification are natural human tasks used for information retrieval and representation. In the medical field, encoding of patient discharge summaries is still a manual, time-consuming task. This paper describes a system for the automated coding of patient discharge summaries from the field of coronary diseases into the ICD-9-CM classification. The system is developed in the context of the European AIM MENELAS project, a natural-language understanding system which uses the conceptual-graph formalism. Indexing is performed using a two-step processing scheme: a first recognition stage is implemented by a matching procedure, and a second selection stage is made according to the coding priorities. We show the general features of the necessary translation of the classification terms into the conceptual-graph model, and of compliance with the coding rules. An advantage of the system is that it provides an objective evaluation and assessment procedure for natural-language understanding.
|
77
|
Brochhausen M, Burgun A, Ceusters W, Hasman A, Leong TY, Musen M, Oliveira JL, Peleg M, Rector A, Schulz S. Discussion of “Biomedical Ontologies: Toward Scientific Debate”. Methods Inf Med 2018. [DOI: 10.1055/s-0038-1625243] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/28/2022]
|
78
|
Bibault JE, Zapletal E, Rance B, Giraud P, Burgun A. Labeling for Big Data in radiation oncology: The Radiation Oncology Structures ontology. PLoS One 2018; 13:e0191263. [PMID: 29351341 PMCID: PMC5774757 DOI: 10.1371/journal.pone.0191263] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/12/2017] [Accepted: 01/01/2018] [Indexed: 12/25/2022] Open
Abstract
Purpose: Leveraging Electronic Health Records (EHR) and Oncology Information Systems (OIS) has great potential to generate hypotheses for cancer treatment, since they directly provide medical data on a large scale. In order to gather a significant number of patients with a high level of clinical detail, multicenter studies are necessary. A challenge in creating high-quality Big Data studies involving several treatment centers is the lack of semantic interoperability between data sources. We present the ontology we developed to address this issue. Methods: Radiation oncology anatomical and target volumes were categorized into anatomical and treatment planning classes. International delineation guidelines specific to radiation oncology were used for lymph node areas and target volumes. Hierarchical classes were created to generate the Radiation Oncology Structures (ROS) Ontology. The ROS was then applied to the data from our institution. Results: Four hundred and seventeen classes were created with a maximum of 14 child classes (average = 5). The ontology was then converted into Web Ontology Language (.owl) format and made available online on Bioportal and GitHub under an Apache 2.0 License. We extracted all structures delineated in our department since its opening in 2001. 20,758 structures were exported from our “record-and-verify” system, demonstrating significant heterogeneity within a single center. All structures were matched to the ROS ontology before integration into our clinical data warehouse (CDW). Conclusion: In this study we describe a new ontology, specific to radiation oncology, that reports all anatomical and treatment planning structures that can be delineated. This ontology will be used to integrate dosimetric data into the Assistance Publique—Hôpitaux de Paris CDW, which stores data from 6.5 million patients (as of February 2017).
|
79
|
Cohen S, Gilutz H, Marelli A, Iserin L, Bonnet D, Burgun A. Administrative Health Databases for addressing emerging issues in adults with congenital heart diseases. Archives of Cardiovascular Diseases Supplements 2018. [DOI: 10.1016/j.acvdsp.2017.11.160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
80
|
Jantzen R, Rance B, Katsahian S, Burgun A, Looten V. The Need of an Open Data Quality Policy: The Case of the "Transparency - Health" Database in the Prevention of Conflict of Interest. Stud Health Technol Inform 2018; 247:611-615. [PMID: 29678033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Open data that are widely available, with minimal constraints, to the general public and journalists are needed to help rebuild trust between citizens and the health system. By opening data, we can expect to increase democratic accountability and the self-empowerment of citizens. This article aims at assessing the quality and reusability of the Transparency - Health database (Transp-db) with regard to the FAIR principles. More specifically, we examine the quality of the identity data of French medical doctors in the Transp-db. This study shows that the quality of the data in the Transp-db does not allow those who benefit from an advantage or remuneration to be identified with certainty, noticeably reducing the impact of the open data effort.
|
81
|
Garcelon N, Neuraz A, Benoit V, Salomon R, Burgun A. Improving a full-text search engine: the importance of negation detection and family history context to identify cases in a biomedical data warehouse. J Am Med Inform Assoc 2017; 24:607-613. [PMID: 28339516 DOI: 10.1093/jamia/ocw144] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 01/25/2016] [Accepted: 08/31/2016] [Indexed: 12/19/2022] Open
Abstract
Objective The repurposing of electronic health records (EHRs) can improve clinical and genetic research for rare diseases. However, significant information in rare disease EHRs is embedded in the narrative reports, which contain many negated clinical signs and family medical history. This paper presents a method to detect family history and negation in narrative reports and evaluates its impact on selecting populations from a clinical data warehouse (CDW). Materials and Methods We developed a pipeline to process 1.6 million reports from multiple sources. This pipeline is part of the load process of the Necker Hospital CDW. Results We identified patients with "Lupus and diarrhea," "Crohn's and diabetes," and "NPHP1" from the CDW. The overall precision, recall, specificity, and F-measure were 0.85, 0.98, 0.93, and 0.91, respectively. Conclusion The proposed method generates a highly accurate identification of cases from a CDW of rare disease EHRs.
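As a quick consistency check on the figures in this abstract, the sketch below recomputes the F-measure from the reported precision and recall. It assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state. The function name `f_measure` and its `beta` parameter are illustrative and are not taken from the authors' pipeline.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both terms are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract: precision 0.85, recall 0.98.
print(round(f_measure(0.85, 0.98), 2))  # prints 0.91
```

Under that assumption the result agrees with the reported F-measure of 0.91.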
|
82
|
Dunn W, Burgun A, Krebs MO, Rance B. Exploring and visualizing multidimensional data in translational research platforms. Brief Bioinform 2017; 18:1044-1056. [PMID: 27585944 PMCID: PMC5862238 DOI: 10.1093/bib/bbw080] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/16/2016] [Revised: 07/30/2016] [Accepted: 08/03/2016] [Indexed: 01/20/2023] Open
Abstract
The unprecedented advances in technology and scientific research over the past few years have provided the scientific community with new and more complex forms of data. Large data sets collected from single groups or cross-institution consortiums, containing hundreds of omic and clinical variables corresponding to thousands of patients, are becoming increasingly commonplace in the research setting. Before any core analyses are performed, visualization often plays a key role in the initial phases of research, especially for projects where no initial hypotheses are dominant. Proper visualization of data at a high level facilitates researchers' ability to find trends, identify outliers and perform quality checks. In addition, research has uncovered the important role of visualization in data analysis and its implied benefits in facilitating our understanding of disease and ultimately improving patient care. In this work, we present a review of the current landscape of existing tools designed to facilitate the visualization of multidimensional data in translational research platforms. Specifically, we reviewed the biomedical literature for translational platforms allowing the visualization and exploration of clinical and omics data, and identified 11 platforms: cBioPortal, interactive genomics patient stratification explorer, Igloo-Plot, The Georgetown Database of Cancer Plus, tranSMART, an unnamed data-cube-based model supporting heterogeneous data, Papilio, Caleydo Domino, Qlucore Omics, Oracle Health Sciences Translational Research Center and OmicsOffice® powered by TIBCO Spotfire. In a health sector continuously witnessing an increase in data from multifarious sources, visualization tools used to better grasp these data will grow in importance, and we believe our work will be useful in guiding investigators in similar situations.
|
83
|
Escudié JB, Rance B, Malamut G, Khater S, Burgun A, Cellier C, Jannot AS. A novel data-driven workflow combining literature and electronic health records to estimate comorbidities burden for a specific disease: a case study on autoimmune comorbidities in patients with celiac disease. BMC Med Inform Decis Mak 2017; 17:140. [PMID: 28962565 PMCID: PMC5622531 DOI: 10.1186/s12911-017-0537-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/18/2016] [Accepted: 09/12/2017] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Data collected in EHRs have been widely used to identify specific conditions; however, there is still a need for methods to define comorbidities and for sources to identify comorbidity burden. We propose an approach to assess comorbidity burden for a specific disease using the literature and EHR data sources, applied to autoimmune diseases in celiac disease (CD). METHODS We generated a restricted set of comorbidities using the literature (via the MeSH® co-occurrence file). We extracted the 15 autoimmune diseases that co-occur most often with CD. We used mappings of the comorbidities to EHR terminologies: ICD-10 (billing codes), ATC (drugs) and UMLS (clinical reports). Finally, we extracted the concepts from the different data sources. We evaluated our approach using the correlation between prevalence estimates in our cohort and co-occurrence ranking in the literature. RESULTS We retrieved the comorbidities for 741 patients with CD. 18.1% of patients had at least one of the 15 studied autoimmune disorders. Overall, 79.3% of the mapped concepts were detected only in text, 5.3% only in ICD codes and/or drug prescriptions, and 15.4% could be found in both sources. Prevalences in our cohort were correlated with the literature (Spearman's coefficient 0.789, p = 0.0005). The three most prevalent comorbidities were thyroiditis 12.6% (95% CI 10.1-14.9), type 1 diabetes 2.3% (95% CI 1.2-3.4) and dermatitis herpetiformis 2.0% (95% CI 1.0-3.0). CONCLUSION We introduced a process that leveraged the MeSH terminology to identify relevant autoimmune comorbidities of CD and several data sources from EHRs to phenotype a large population of CD patients. We achieved prevalence estimates comparable to the literature.
|
84
|
Burgun A, Bernal-Delgado E, Kuchinke W, van Staa T, Cunningham J, Lettieri E, Mazzali C, Oksen D, Estupiñan F, Barone A, Chène G. Health Data for Public Health: Towards New Ways of Combining Data Sources to Support Research Efforts in Europe. Yearb Med Inform 2017; 26:235-240. [PMID: 29063571 PMCID: PMC6239221 DOI: 10.15265/iy-2017-034] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/18/2017] [Indexed: 12/21/2022] Open
Abstract
Objectives: To present the European landscape regarding the re-use of health administrative data for research. Methods: We present some collaborative projects and solutions that have been developed by Nordic countries, Italy, Spain, France, Germany, and the UK to facilitate access to their health data for research purposes. Results: Research in public health is transitioning from siloed systems to more accessible and re-usable data resources. Following the example of the Nordic countries, several European countries aim at facilitating the re-use of their health administrative databases for research purposes. However, the ecosystem is still a complex patchwork, with different rules, policies, and processes for data provision. Conclusion: Given the abundance of health administrative data, the challenges are such that only an overarching European public health research infrastructure is able to efficiently facilitate access to these data and accelerate research based on these highly valuable resources.
|
85
|
Ethier JF, Curcin V, McGilchrist MM, Choi Keung SNL, Zhao L, Andreasson A, Bródka P, Michalski R, Arvanitis TN, Mastellos N, Burgun A, Delaney BC. eSource for clinical trials: Implementation and evaluation of a standards-based approach in a real world trial. Int J Med Inform 2017; 106:17-24. [PMID: 28870379 DOI: 10.1016/j.ijmedinf.2017.06.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/18/2017] [Revised: 06/20/2017] [Accepted: 06/24/2017] [Indexed: 01/24/2023]
Abstract
OBJECTIVE The Learning Health System (LHS) requires integration of research into routine practice. 'eSource', or embedding clinical trial functionalities into routine electronic health record (EHR) systems, has long been put forward as a solution to the rising costs of research. We aimed to create and validate an eSource solution that would be readily extensible as part of an LHS. MATERIALS AND METHODS The EU FP7 TRANSFoRm project's approach is based on dual modelling, using the Clinical Research Information Model (CRIM) and the Clinical Data Integration Model of meaning (CDIM) to bridge the gap between clinical and research data structures, using the CDISC Operational Data Model (ODM) standard. Validation against GCP requirements was conducted in a clinical site, and a cluster randomised evaluation by site was nested into a live clinical trial. RESULTS Using the form definition element of ODM, we linked precisely modelled data queries to data elements, constrained against CDIM concepts, to enable automated patient identification for specific protocols and pre-population of electronic case report forms (e-CRF). Both control and eSource sites recruited better than expected, with no significant difference. Completeness of clinical forms was significantly improved by eSource, but Patient Related Outcome Measures (PROMs) were less well completed on smartphones than on paper in this population. DISCUSSION The TRANSFoRm approach provides an ontologically-based approach to eSource in a low-resource, heterogeneous, highly distributed environment that allows precise prospective mapping of data elements in the EHR. CONCLUSION Further studies using this approach to CDISC should optimise the delivery of PROMs, whilst building a sustainable infrastructure for eSource with research networks, trials units and EHR vendors.
|
86
|
Abdellaoui R, Schück S, Texier N, Burgun A. Filtering Entities to Optimize Identification of Adverse Drug Reaction From Social Media: How Can the Number of Words Between Entities in the Messages Help? JMIR Public Health Surveill 2017. [PMID: 28642212 PMCID: PMC5500778 DOI: 10.2196/publichealth.6577] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND With the increasing popularity of Web 2.0 applications, social media has made it possible for individuals to post messages on adverse drug reactions. In such online conversations, patients discuss their symptoms, medical history, and diseases. These disorders may correspond to adverse drug reactions (ADRs) or any other medical condition. Therefore, methods must be developed to distinguish between false positives and true ADR declarations. OBJECTIVE The aim of this study was to investigate a method for filtering out disorder terms that did not correspond to adverse events by using the distance (as number of words) between the drug term and the disorder or symptom term in the post. We hypothesized that the shorter the distance between the disorder name and the drug, the higher the probability of it being an ADR. METHODS We analyzed a corpus of 648 messages corresponding to a total of 1654 (drug and disorder) pairs from 5 French forums using Gaussian mixture models and an expectation-maximization (EM) algorithm. RESULTS The distribution of the distances between the drug term and the disorder term enabled the filtering of 50.03% (733/1465) of the disorders that were not ADRs. Our filtering strategy achieved a precision of 95.8% and a recall of 50.0%. CONCLUSIONS This study suggests that such distance between terms can be used for identifying false positives, thereby improving ADR detection in social media.
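The general technique named in the methods, a Gaussian mixture fitted by expectation-maximization, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-component choice, the initialisation, the function names, and the toy word distances are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(data, iterations=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    n = len(data)
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n or 1.0
    weights = [0.5, 0.5]
    means = [min(data), max(data)]  # start the components at the extremes
    variances = [spread, spread]
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            if mass < 1e-12:
                continue
            weights[k] = mass / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / mass
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


def prob_short_component(x, weights, means, variances):
    """Posterior probability that distance x belongs to the short-distance component."""
    joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
    total = sum(joint) or 1e-300
    short = 0 if means[0] <= means[1] else 1
    return joint[short] / total


# Invented toy distances (in words) between a drug term and a disorder term.
toy_distances = [1, 2, 2, 3, 3, 4, 18, 20, 22, 25, 27, 30]
w, m, v = fit_two_gaussians(toy_distances)
print(m, prob_short_component(2, w, m, v), prob_short_component(25, w, m, v))
```

A pair would then be kept as a candidate ADR when its posterior probability for the short-distance component exceeds a chosen threshold. The threshold, like the data here, would have to come from the actual corpus.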
|
87
|
Jannot AS, Burgun A, Thervet E, Pallet N. The Diagnosis-Wide Landscape of Hospital-Acquired AKI. Clin J Am Soc Nephrol 2017; 12:874-884. [PMID: 28495862 PMCID: PMC5460713 DOI: 10.2215/cjn.10981016] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 10/20/2016] [Accepted: 03/01/2017] [Indexed: 11/23/2022]
Abstract
BACKGROUND AND OBJECTIVES The exploration of electronic hospital records offers a unique opportunity to describe in-depth the prevalence of conditions associated with diagnoses at an unprecedented level of comprehensiveness. We used a diagnosis-wide approach, adapted from phenome-wide association studies (PheWAS), to perform an exhaustive analysis of all diagnoses associated with hospital-acquired AKI (HA-AKI) in a French urban tertiary academic hospital over a period of 10 years. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS We retrospectively extracted all diagnoses from an i2b2 (Informatics for Integrating Biology and the Bedside) clinical data warehouse for patients who stayed in this hospital between 2006 and 2015 and had at least two plasma creatinine measurements performed during the first week of their stay. We then analyzed the association between HA-AKI and each International Classification of Diseases (ICD)-10 diagnostic category to draw a comprehensive picture of diagnoses associated with AKI. Hospital stays for 126,736 unique individuals were extracted. RESULTS Hemodynamic impairment and surgical procedures are the main factors associated with HA-AKI and five clusters of diagnoses were identified: sepsis, heart diseases, polytrauma, liver disease, and cardiovascular surgery. The ICD-10 code corresponding to AKI (N17) was recorded in 30% of the cases with HA-AKI identified, and in this situation, 20% of the diagnoses associated with HA-AKI corresponded to kidney diseases such as tubulointerstitial nephritis, necrotizing vasculitis, or myeloma cast nephropathy. Codes associated with HA-AKI that demonstrated the greatest increase in prevalence with time were related to influenza, polytrauma, and surgery of neoplasms of the genitourinary system. CONCLUSIONS Our approach, derived from PheWAS, is a valuable way to comprehensively identify and classify all of the diagnoses and clusters of diagnoses associated with HA-AKI. 
Our analysis delivers insights into how diagnoses associated with HA-AKI evolved over time. On the basis of ICD-10 codes, HA-AKI appears largely underestimated in this academic hospital.
|
88
|
Bibault JE, Burgun A, Giraud P. Intelligence artificielle appliquée à la radiothérapie. Cancer Radiother 2017. [DOI: 10.1016/j.canrad.2017.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
89
|
Bibault JE, Burgun A, Giraud P. Intelligence artificielle appliquée à la radiothérapie. Cancer Radiother 2017. [DOI: 10.1016/j.canrad.2017.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
90
|
Bibault JE, Burgun A, Giraud P. Intelligence artificielle appliquée à la radiothérapie. Cancer Radiother 2017; 21:239-243. [DOI: 10.1016/j.canrad.2016.09.021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 08/31/2016] [Revised: 09/21/2016] [Accepted: 09/28/2016] [Indexed: 02/04/2023]
|
91
|
Mamzer MF, Duchange N, Darquy S, Marvanne P, Rambaud C, Marsico G, Cerisey C, Scotté F, Burgun A, Badoual C, Laurent-Puig P, Hervé C. Erratum to: Partnering with patients in translational oncology research: ethical approach. J Transl Med 2017; 15:80. [PMID: 28433049 PMCID: PMC5401763 DOI: 10.1186/s12967-017-1181-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
|
92
|
Mamzer MF, Duchange N, Darquy S, Marvanne P, Rambaud C, Marsico G, Cerisey C, Scotté F, Burgun A, Badoual C, Laurent-Puig P, Hervé C. Partnering with patients in translational oncology research: ethical approach. J Transl Med 2017; 15:74. [PMID: 28390420 PMCID: PMC5385033 DOI: 10.1186/s12967-017-1177-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/02/2017] [Accepted: 04/01/2017] [Indexed: 11/10/2022] Open
Abstract
Background The research program CARPEM (cancer research and personalized medicine) brings together the expertise of researchers and hospital-based oncologists to develop translational research in the context of personalized or “precision” medicine for cancer. There is recognition that patient involvement can help to take patients' needs and priorities into account in the development of this emerging practice, but there is currently no consensus about how this can be achieved. In this study, we developed an empirical ethical research action aiming to improve patient representatives’ involvement in the development of the translational research program together with health professionals. The aim is to promote common understanding and sharing of knowledge between all parties and to establish a long-term partnership integrating patients’ expectations. Methods Two distinct committees were set up in CARPEM: an “Expert Committee”, gathering healthcare and research professionals, and a “Patient Committee”, gathering patients and patient representatives. A multidisciplinary team trained in medical ethics research ensured communication between the two committees as well as analysis of discussions, minutes and outputs from all stakeholders. Results The results highlight the efficiency of the transfer of knowledge between interested parties. Patient representatives and professionals were able to identify new ethical challenges and co-elaborate new procedures for gathering information and consent forms, adapting them to the practices and recommendations developed during the process. Moreover, the patient representatives included became full partners and participated in the transfer of knowledge to the public via conferences and publications.
Conclusions Empirical ethical research based on a patient-centered approach could help in establishing a fair model for coordination and support actions during cancer research, striking a balance between the regulatory framework, researcher needs and patient expectations. Our approach addresses the concept of translational ethics as a way to handle the main remaining gap between combining care and research activities in the medical pathway and the existing framework.
|
93
|
Girardeau Y, Doods J, Zapletal E, Chatellier G, Daniel C, Burgun A, Dugas M, Rance B. Leveraging the EHR4CR platform to support patient inclusion in academic studies: challenges and lessons learned. BMC Med Res Methodol 2017; 17:36. [PMID: 28241798 PMCID: PMC5329914 DOI: 10.1186/s12874-017-0299-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/02/2016] [Accepted: 01/23/2017] [Indexed: 11/10/2022] Open
Abstract
Background The development of Electronic Health Records (EHRs) in hospitals offers the ability to reuse data from patient care activities for clinical research. EHR4CR is a European public-private partnership aiming to develop a computerized platform that enables the re-use of data collected from EHRs over its network. However, the reproducibility of queries may depend on attributes of the local data. Our objective was (1) to describe the different steps that were carried out in order to use the EHR4CR platform and (2) to identify the specific issues that could impact the final performance of the platform. Methods We selected three institutional studies covering various medical domains. The studies included a total of 67 inclusion and exclusion criteria and ran in two University Hospitals. We described the steps required to use the EHR4CR platform for a feasibility study. We also defined metrics to assess each of the steps (including criteria complexity, normalization quality, and data completeness of EHRs). Results We identified 114 distinct medical concepts from a total of 67 eligibility criteria. Among the 114 concepts, 23 (20.2%) corresponded to non-structured data (i.e. for which transformation is needed before analysis), 92 (81%) could be mapped to terminologies used in EHR4CR, and 86 (75%) could be mapped to local terminologies. We identified 51 computable criteria following the normalization process. The normalization was considered by experts to be satisfactory or higher for 64.2% (43/67) of the computable criteria. All of the computable criteria could be expressed using the EHR4CR platform. Conclusions We identified a set of issues that could affect the future results of the platform: (a) the normalization of free-text criteria, (b) the translation into computer-friendly criteria and (c) issues related to the execution of the query against clinical data warehouses. We developed and evaluated metrics to better describe the platforms and their results. These metrics could be used for future reports of Clinical Trial Recruitment Support Systems assessment studies, and provide experts and readers with tools to ensure the quality of the constructed dataset.
|
94
|
Jannot AS, Zapletal E, Avillach P, Mamzer MF, Burgun A, Degoulet P. The Georges Pompidou University Hospital Clinical Data Warehouse: A 8-years follow-up experience. Int J Med Inform 2017; 102:21-28. [PMID: 28495345 DOI: 10.1016/j.ijmedinf.2017.02.006] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/10/2017] [Accepted: 02/11/2017] [Indexed: 12/25/2022]
Abstract
BACKGROUND When developed jointly with clinical information systems, clinical data warehouses (CDWs) facilitate the reuse of healthcare data and leverage clinical research. OBJECTIVE To describe both data access and use for clinical research, epidemiology and health service research of the "Hôpital Européen Georges Pompidou" (HEGP) CDW. METHODS The CDW has been developed since 2008 using an i2b2 platform. It was made available to health professionals and researchers in October 2010. Procedures to access data have been implemented and different access levels have been distinguished according to the nature of queries. RESULTS As of July 2016, the CDW contained the consolidated data of over 860,000 patients followed since the opening of the HEGP hospital in July 2000. These data correspond to more than 122 million clinical item values, 124 million biological item values, and 3.7 million free text reports. The ethics committee of the hospital evaluates all CDW projects that generate secondary data marts. Characteristics of the 74 research projects validated between January 2011 and December 2015 are described. CONCLUSION The use of the HEGP CDW is a key facilitator for clinical research studies. However, it required substantial methodological and organizational support efforts from a biomedical informatics department.
|
95
|
Barton A, Ethier JF, Duvauferrier R, Burgun A. An ontological analysis of medical Bayesian indicators of performance. J Biomed Semantics 2017; 8:1. [PMID: 28049518 PMCID: PMC5209884 DOI: 10.1186/s13326-016-0099-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/02/2016] [Accepted: 09/06/2016] [Indexed: 11/25/2022] Open
Abstract
Background: Biomedical ontologies aim at providing the most exhaustive and rigorous representation of reality as described by biomedical sciences. A large part of medical reasoning deals with diagnosis and is essentially probabilistic. It would be an asset for biomedical ontologies to be able to support such probabilistic reasoning and formalize Bayesian indicators of performance: sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). In doing so, one has to consider that not only the positive and negative predictive values, but also sensitivity and specificity depend upon the group under consideration: this is the “spectrum effect”. Methods: The sensitivity value of an index test IT for a disease M in a group g is identified with the proportion of people in g who have M who would get a positive result to IT if the test IT were performed on them. This value can be estimated by selecting a reference test RT for M and a sample s of g, and measuring the proportion, among members of s having a positive result to RT, of those who got a positive result to IT. Similar approximation strategies hold for prevalence, specificity, PPV and NPV. Indicators of diagnostic performance and their estimations are formalized in the context of the OBO Foundry, built on the realist upper ontology Basic Formal Ontology (BFO). Results: Entities and relations from the Ontology for Biomedical Investigations (OBI) and the Information Artifact Ontology (IAO) are used and complemented to represent reference tests and index tests, test executions, test results and the relations involving those entities, as well as the values of indicators of performance and their estimates. The computations taking as input several estimates of an indicator of performance to produce a finer estimate are also represented. The value of, e.g., sensitivity estimates should be dissociated from the real sensitivity value, which involves possible, non-actual conditions, namely the result a person would get if a medical test were performed on her. Such conditions cannot be directly represented in a realist ontology, but a representation is proposed that introduces only actual entities by considering a disposition whose probability value is the real sensitivity value. A sensitivity estimate is a data item which is about such a disposition. Conclusions: This model provides a theoretical basis for the representation of entities supporting Bayesian reasoning in ontologies.
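The estimation strategy described in the Methods can be illustrated with a minimal sketch. It assumes a hypothetical sample s in which each person has both a reference-test (RT) and an index-test (IT) result, with the RT result standing in for disease status; the function name `estimate_indicators` and the toy counts are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def estimate_indicators(sample):
    """Estimate indicators from (rt_positive, it_positive) pairs.

    rt_positive: result of the reference test RT (proxy for having disease M)
    it_positive: result of the index test IT
    """
    tp = sum(1 for rt, it in sample if rt and it)          # RT+, IT+
    fn = sum(1 for rt, it in sample if rt and not it)      # RT+, IT-
    fp = sum(1 for rt, it in sample if not rt and it)      # RT-, IT+
    tn = sum(1 for rt, it in sample if not rt and not it)  # RT-, IT-

    def ratio(numerator, denominator):
        # An estimate is undefined when the relevant subgroup is empty.
        return Fraction(numerator, denominator) if denominator else None

    return {
        "prevalence": ratio(tp + fn, len(sample)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical sample s of a group g (made-up counts).
sample = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 3
    + [(False, False)] * 7
)
estimates = estimate_indicators(sample)
```

Running the same function on a sample drawn from a different group may yield different sensitivity and specificity estimates, which is one way to picture the spectrum effect mentioned in the abstract.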
|
96
|
Chen X, Deldossi M, Aboukhamis R, Faviez C, Dahamna B, Karapetiantz P, Guenegou-Arnoux A, Girardeau Y, Guillemin-Lanne S, Lillo-Le-Louët A, Texier N, Burgun A, Katsahian S. Mining Adverse Drug Reactions in Social Media with Named Entity Recognition and Semantic Methods. Stud Health Technol Inform 2017; 245:322-326. [PMID: 29295108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Suspected adverse drug reactions (ADRs) reported by patients through social media can be a complementary source to current pharmacovigilance systems. However, the performance of text mining tools applied to social media text data to discover ADRs needs to be evaluated. In this paper, we introduce the approach developed to mine ADRs from French social media. An evaluation protocol is presented, which includes a detailed sample size determination and the constitution of an evaluation corpus. Our text mining approach provided very encouraging preliminary results, with F-measures of 0.94 and 0.81 for recognition of drugs and symptoms respectively, and an F-measure of 0.70 for ADR detection. Therefore, this approach is promising for downstream pharmacovigilance analysis.
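For readers unfamiliar with the metric, the following sketch shows how an F-measure is computed from annotation counts. It assumes the balanced F-measure (F1), which the abstract does not specify; the function name `f_measure` and the counts are hypothetical and are not the paper's data.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives  # everything the tool flagged
    actual = true_positives + false_negatives     # everything in the gold standard
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for an ADR detection run (not taken from the paper).
score = f_measure(true_positives=70, false_positives=30, false_negatives=30)
```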
|
97
|
Burgun A, Oksen DV, Kuchinke W, Prokosch HU, Ganslandt T, Buchan I, van Staa T, Cunningham J, Gjerstorff ML, Dufour JC, Gibrat JF, Nikolski M, Verger P, Cambon-Thomsen A, Masella C, Lettieri E, Bertele P, Salokannel M, Thiebaut R, Persoz C, Chêne G, Ohmann C. Linking health and administrative data for maternal, child and young adult health. Eur J Public Health 2016. [DOI: 10.1093/eurpub/ckw168.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
98
|
Hristovski D, Kastrin A, Dinevski D, Burgun A, Žiberna L, Rindflesch TC. Using Literature-Based Discovery to Explain Adverse Drug Effects. J Med Syst 2016; 40:185. [PMID: 27318993 DOI: 10.1007/s10916-016-0544-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2015] [Accepted: 06/09/2016] [Indexed: 01/29/2023]
Abstract
We report on our research in using literature-based discovery (LBD) to provide pharmacological and/or pharmacogenomic explanations for reported adverse drug effects. The goal of LBD is to generate novel and potentially useful hypotheses by analyzing the scientific literature and optionally some additional resources. Our assumption is that drugs have effects on some genes or proteins and that these genes or proteins are associated with the observed adverse effects. Therefore, by using LBD we try to find genes or proteins that link the drugs with the reported adverse effects. These genes or proteins can be used to provide insight into the processes causing the adverse effects. Initial results show that our method has the potential to assist in explaining reported adverse drug effects.
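The linking idea described above can be sketched as a lookup across two sets of relations. This is a toy illustration only: the dictionaries, the identifiers such as `drug_X` and `GENE_A`, and the function `linking_concepts` are hypothetical, and the authors' actual method for extracting and ranking relations from the literature is not reproduced here.

```python
def linking_concepts(drug_to_genes, gene_to_effects, drug, effect):
    """Return genes or proteins that connect `drug` to `effect`.

    A gene qualifies when the drug is reported to affect it and it is
    reported to be associated with the adverse effect.
    """
    genes_for_drug = drug_to_genes.get(drug, set())
    return sorted(
        gene
        for gene in genes_for_drug
        if effect in gene_to_effects.get(gene, set())
    )


# Toy relations standing in for relations mined from the literature.
drug_to_genes = {"drug_X": {"GENE_A", "GENE_B", "GENE_C"}}
gene_to_effects = {
    "GENE_A": {"effect_Y"},
    "GENE_B": {"effect_Z"},
    "GENE_C": {"effect_Y", "effect_Z"},
}

candidates = linking_concepts(drug_to_genes, gene_to_effects, "drug_X", "effect_Y")
```

Each returned gene is a candidate explanation to be reviewed by an expert, not a confirmed mechanism.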
|
99
|
Bibault JE, Giraud P, Burgun A. Big Data and machine learning in radiation oncology: State of the art and future prospects. Cancer Lett 2016; 382:110-117. [PMID: 27241666 DOI: 10.1016/j.canlet.2016.05.033] [Citation(s) in RCA: 171] [Impact Index Per Article: 21.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2016] [Revised: 05/26/2016] [Accepted: 05/26/2016] [Indexed: 12/13/2022]
Abstract
Precision medicine relies on an increasing amount of heterogeneous data. Advances in radiation oncology, through the use of CT scans, dosimetry and imaging performed before each fraction, have generated a considerable flow of data that needs to be integrated. At the same time, Electronic Health Records now provide phenotypic profiles of large cohorts of patients that could be correlated with this information. In this review, we describe methods that could be used to create integrative predictive models in radiation oncology. Potential uses of machine learning methods such as support vector machines, artificial neural networks, and deep learning are also discussed.
|
100
|
Rance B, Canuel V, Countouris H, Laurent-Puig P, Burgun A. Integrating Heterogeneous Biomedical Data for Cancer Research: the CARPEM infrastructure. Appl Clin Inform 2016; 7:260-74. [PMID: 27437039 PMCID: PMC4941838 DOI: 10.4338/aci-2015-09-ra-0125] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2015] [Accepted: 02/07/2016] [Indexed: 01/19/2023] Open
Abstract
Cancer research involves numerous disciplines. The multiplicity of data sources and their heterogeneous nature make integrating and exploring the data increasingly complex. Translational research platforms are a promising way to assist scientists in these tasks. In this article, we identify a set of scientific and technical principles needed to build a translational research platform that meets ethical and data-protection requirements and addresses data-integration problems. We describe the solution adopted by the CARPEM cancer research program to design and deploy a platform able to integrate retrospective, prospective, and day-to-day care data. We designed a three-layer architecture composed of a data collection layer, a data integration layer and a data access layer. We leverage a set of open-source resources including i2b2 and tranSMART.
|